subject:"\[clang\] \[clang\] fix half \&\& bfloat16 convert node expr codegen \(PR #89051\)"

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-09 Thread LLVM Continuous Integration via cfe-commits


llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder `clang-hip-vega20` running 
on `hip-vega20-0` while building `clang` at step 3 "annotate".

Full details are available at: 
https://lab.llvm.org/buildbot/#/builders/123/builds/5253


Here is the relevant piece of the build log for the reference

```
Step 3 (annotate) failure: 
'../llvm-zorg/zorg/buildbot/builders/annotated/hip-build.sh --jobs=' (failure)
...
[38/40] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 
-DNDEBUG  
External/HIP/CMakeFiles/InOneWeekend-hip-6.0.2.dir/workload/ray-tracing/InOneWeekend/main.cc.o
 -o External/HIP/InOneWeekend-hip-6.0.2  
--rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt 
-unwindlib=libgcc -frtlib-add-rpath && cd 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && 
/usr/local/bin/cmake -E create_symlink 
/buildbot/llvm-test-suite/External/HIP/InOneWeekend.reference_output 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend.reference_output-hip-6.0.2
[39/40] /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -DNDEBUG  -O3 
-DNDEBUG   -w -Werror=date-time --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 
--offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 
--offload-arch=gfx1100 -xhip -mfma -MD -MT 
External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o
 -MF 
External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o.d
 -o 
External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o
 -c 
/buildbot/llvm-test-suite/External/HIP/workload/ray-tracing/TheNextWeek/main.cc
[40/40] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 
-DNDEBUG  
External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o
 -o External/HIP/TheNextWeek-hip-6.0.2  
--rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt 
-unwindlib=libgcc -frtlib-add-rpath && cd 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && 
/usr/local/bin/cmake -E create_symlink 
/buildbot/llvm-test-suite/External/HIP/TheNextWeek.reference_output 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/TheNextWeek.reference_output-hip-6.0.2
+ build_step 'Testing HIP test-suite'
+ echo '@@@BUILD_STEP Testing HIP test-suite@@@'
@@@BUILD_STEP Testing HIP test-suite@@@
+ ninja -v check-hip-simple
[0/1] cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP 
&& /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv 
empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test 
memmove-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test 
blender.test
-- Testing: 7 tests, 7 workers --
Testing:  0.. 10.. 20.. 30.. 40
FAIL: test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test (4 of 7)
 TEST 'test-suite :: 
External/HIP/InOneWeekend-hip-6.0.2.test' FAILED 

/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/timeit-target 
--timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 
--limit-rss-size 838860800 --append-exitstatus --redirect-output 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out
 --redirect-input /dev/null --summary 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.time
 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend-hip-6.0.2
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP ; 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out
 InOneWeekend.reference_output-hip-6.0.2

+ cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP
+ /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out
 InOneWeekend.reference_output-hip-6.0.2
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target: 
Comparison failed, textual difference between 'M' and 'i'


/usr/bin/strip: /bin/bash.stripped: Bad file descriptor
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 

Failed Tests (1):
  test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test


Testing Time: 340.03s

Total Discovered Tests: 7
  Passed: 6 (85.71%)
  Failed: 1 (14.29%)
FAILED: External/HIP/CMakeFiles/check-hip-simple-hip-6.0.2 
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && 
/buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv 
empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test 
memmove-hip-6.0.2.tes

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-09 Thread via cfe-commits


github-actions[bot] wrote:



@JinjinLi868 Congratulations on having your first Pull Request (PR) merged into 
the LLVM Project!

Your changes will be combined with recent changes from other authors, then 
tested by our [build bots](https://lab.llvm.org/buildbot/). If there is a 
problem with a build, you may receive a report in an email or a comment on this 
PR.

Please check whether problems have been caused by your change specifically, as 
the builds can include changes from many authors. It is not uncommon for your 
change to be included in a build that fails due to someone else's changes, or 
infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail 
[here](https://llvm.org/docs/MyFirstTypoFix.html#myfirsttypofix-issues-after-landing-your-pr).

If your change does cause a problem, it may be reverted, or you can revert it 
yourself. This is a normal part of [LLVM 
development](https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy).
 You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are 
working as expected, well done!


https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-09 Thread Pengcheng Wang via cfe-commits


https://github.com/wangpc-pp closed 
https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-08 Thread Matt Arsenault via cfe-commits


https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-08 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From 6688d7ed8a0ce0c6885b82218e832320da57eef0 Mon Sep 17 00:00:00 2001
From: JinjinLi868 
Date: Wed, 4 Sep 2024 16:43:48 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp|  4 +++
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 25 +++
 2 files changed, 29 insertions(+)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 7aa2d3d89c2936..f315043411ae1e 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1454,6 +1454,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  }
   if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..55451dc6f092cd
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-08 Thread via cfe-commits


JinjinLi868 wrote:

> > > The vector tests should still be added
> > 
> > 
> > sorry. if i remove the change of the vector. i have to remove the testcase. 
> > because, for the current code convert between vector type of half and 
> > bfloat16, it has a bug. And it will be Assert "Invalid cast!""
> 
> OK, LGTM with the else before return fixed. Can you handle the vector case in 
> a follow up?

OK, i will handle the vector case in a follow up.

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-06 Thread Matt Arsenault via cfe-commits


arsenm wrote:

> > The vector tests should still be added
> 
> sorry. if i remove the change of the vector. i have to remove the testcase. 
> because, for the current code convert between vector type of half and 
> bfloat16, it has a bug. And it will be Assert "Invalid cast!""
> 

OK, LGTM with the else before return fixed. Can you handle the vector case in a 
follow up? 



https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-05 Thread via cfe-commits


JinjinLi868 wrote:

i add the gcc behavior in this case 
```
__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
return (__bf16)a;
}
test_convert_from_fp16_to_bf16(_Float16):
pushrbp
mov rbp, rsp
sub rsp, 16
movdeax, xmm0
mov WORD PTR [rbp-2], ax
pinsrw  xmm0, WORD PTR [rbp-2], 0
call__trunchfbf2
movdeax, xmm0
movdxmm0, eax
leave
ret
```
```
_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
return (_Float16)a;
}
test_convert_from_bf16_to_fp16(std::bfloat16_t):
pushrbp
mov rbp, rsp
sub rsp, 16
movdeax, xmm0
mov WORD PTR [rbp-2], ax
movzx   eax, WORD PTR [rbp-2]
sal eax, 16
movdxmm0, eax
call__truncsfhf2
movdeax, xmm0
movdxmm0, eax
leave
ret
```

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-04 Thread via cfe-commits


JinjinLi868 wrote:

> The vector tests should still be added

sorry. if i remove the change of the vector. i have to remove the testcase. 
because, for the current code convert between vector type of half and bfloat16, 
it has a bug. And it will be Assert "Invalid cast!""
`CastInst *CastInst::Create(Instruction::CastOps op, Value *S, Type *Ty,
   const Twine &Name, InsertPosition InsertBefore) {
  assert(castIsValid(op, S, Ty) && "Invalid cast!");
  // Construct and return the appropriate CastInst subclass
  switch (op) {
  case Trunc: return new TruncInst (S, Ty, Name, InsertBefore);
  case ZExt:  return new ZExtInst  (S, Ty, Name, InsertBefore);
  case SExt:  return new SExtInst  (S, Ty, Name, InsertBefore);
  case FPTrunc:   return new FPTruncInst   (S, Ty, Name, InsertBefore);
  case FPExt: return new FPExtInst (S, Ty, Name, InsertBefore);

  default:
llvm_unreachable("Invalid opcode provided");
  }
}`
`CastInst::castIsValid(Instruction::CastOps op, Type *SrcTy, Type *DstTy) {
  if (!SrcTy->isFirstClassType() || !DstTy->isFirstClassType() ||
  SrcTy->isAggregateType() || DstTy->isAggregateType())
return false;

  // Get the size of the types in bits, and whether we are dealing
  // with vector types, we'll need this later.
  bool SrcIsVec = isa(SrcTy);
  bool DstIsVec = isa(DstTy);
  unsigned SrcScalarBitSize = SrcTy->getScalarSizeInBits();
  unsigned DstScalarBitSize = DstTy->getScalarSizeInBits();

  // If these are vector types, get the lengths of the vectors (using zero for
  // scalar types means that checking that vector lengths match also checks that
  // scalars are not being converted to vectors or vectors to scalars).
  ElementCount SrcEC = SrcIsVec ? cast(SrcTy)->getElementCount()
: ElementCount::getFixed(0);
  ElementCount DstEC = DstIsVec ? cast(DstTy)->getElementCount()
: ElementCount::getFixed(0);

  // Switch on the opcode provided
  switch (op) {
  default: return false; // This is an input error
  case Instruction::FPExt:
return SrcTy->isFPOrFPVectorTy() && DstTy->isFPOrFPVectorTy() &&
   SrcEC == DstEC && SrcScalarBitSize < DstScalarBitSize;
`
now, for the vector convert between half and bfloat16. it codegen to the FPExt. 
For the FPExt,  castIsValid() need SrcScalarBitSize < DstScalarBitSize;  But 
for vector half and bfloat16, the SrcScalarBitSize is equal to DstScalarBitSize.

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-04 Thread Matt Arsenault via cfe-commits


https://github.com/arsenm commented:

The vector tests should still be added 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-04 Thread Matt Arsenault via cfe-commits



@@ -1431,9 +1431,13 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())

arsenm wrote:

Still needs to be resolved. Also should probably get a comment 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-04 Thread via cfe-commits


JinjinLi868 wrote:

> > ok, you mean, i remove the vector testcase for this patch. and just save 
> > the scalar testcase?
> 
> No, keep the tests. Only keep the scalar behavior change. The previous 
> revision was essentially correct and minimal

i have changed. just save the scalar on this patch

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-04 Thread via cfe-commits


JinjinLi868 wrote:

> > > > @JinjinLi868 are you still working on this?
> > > 
> > > 
> > > yes, i am. could you give me some advice and can i help you ?
> > 
> > 
> > Can we have a scalar only patch as @arsenm requested?
> 
> ok, you mean, i remove the vector testcase for this patch. and just save the 
> scalar testcase?

i have changed. just save the scalar on this patch

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-04 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From 3c2e873551aa85abe9011996c5adfaac544d0104 Mon Sep 17 00:00:00 2001
From: JinjinLi868 
Date: Wed, 4 Sep 2024 16:43:48 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp|  5 +++-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 25 +++
 2 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 7aa2d3d89c2936..338b6de4693ae9 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1454,7 +1454,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
 }
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..55451dc6f092cd
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-03 Thread Matt Arsenault via cfe-commits


arsenm wrote:

> ok, you mean, i remove the vector testcase for this patch. and just save the 
> scalar testcase?

No, keep the tests. Only keep the scalar behavior change. The previous revision 
was essentially correct and minimal 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-03 Thread via cfe-commits


JinjinLi868 wrote:

> > > @JinjinLi868 are you still working on this?
> > 
> > 
> > yes, i am. could you give me some advice and can i help you ?
> 
> Can we have a scalar only patch as @arsenm requested?

ok, you mean, i remove the vector testcase for this patch. and just save the 
scalar testcase?

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-03 Thread Craig Topper via cfe-commits


topperc wrote:

> > @JinjinLi868 are you still working on this?
> 
> yes, i am. could you give me some advice and can i help you ?

Can we have a scalar only patch as @arsenm requested?

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-03 Thread Craig Topper via cfe-commits



@@ -1431,9 +1431,13 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())

topperc wrote:

Why was this resolved? It doesn't seem to have been addressed.

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-03 Thread Craig Topper via cfe-commits


topperc wrote:

> > @JinjinLi868 are you still working on this?
> 
> I can ask him. Is this PR blocking some of your recent works on float16/bf16?

I stumbled onto the verifier error earlier while writing a test. It's not 
blocking me.

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-03 Thread via cfe-commits


JinjinLi868 wrote:

> @JinjinLi868 are you still working on this?

yes, i am. could you give me some advice and can i help you ?

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-03 Thread Pengcheng Wang via cfe-commits


wangpc-pp wrote:

> @JinjinLi868 are you still working on this?

I can ask him. Is this PR blocking some of your works on float16/bf16?

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-09-03 Thread Craig Topper via cfe-commits


topperc wrote:

@JinjinLi868 are you still working on this?

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-05-03 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+

arsenm wrote:

I think these tests need to be additive. The vector behavior seems to be 
different between standard C and the proper vector languages? 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-05-03 Thread Matt Arsenault via cfe-commits


arsenm wrote:

> ping Ping Do you have another review comment?

This has now confused me. You should roll back to the case where you only 
changed the scalar behavior. Any vector behavior change should be a separate 
PR, if that is even correct. I would still like to know what the gcc behavior 
is in this case 


https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-29 Thread via cfe-commits


JinjinLi868 wrote:

ping  

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-26 Thread Pengcheng Wang via cfe-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-26 Thread Pengcheng Wang via cfe-commits



@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+

wangpc-pp wrote:

Vector tests are moved to HIP target now as the IRs are wierd in X86.

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-26 Thread Pengcheng Wang via cfe-commits



@@ -0,0 +1,165 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_half2_to_bfloat2(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x half> [[IN1]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[TMP0]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+bfloat2 test_cast_from_half2_to_bfloat2(half2 in) {
+  return (bfloat2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_half4_to_bfloat4(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8
+// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8
+// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x half> [[IN1]] to <4 x bfloat>
+// CHECK-NEXT:store <4 x bfloat> [[TMP0]], ptr [[RETVAL]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8
+// CHECK-NEXT:ret double [[TMP1]]
+//
+bfloat4 test_cast_from_half4_to_bfloat4(half4 in) {
+  return (bfloat4)in;
+}
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_bfloat2_to_half2(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x bfloat>, ptr [[IN]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x bfloat> [[IN1]] to <2 x half>
+// CHECK-NEXT:store <2 x half> [[TMP0]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+half2 test_cast_from_bfloat2_to_half2(bfloat2 in) {
+  return (half2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_bfloat4_to_half4(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:[[IN:%.*]] = alloca <4 x bfloat>, align 8
+// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8
+// CHECK-NEXT:[[IN1:%.*]] = load <4 x bfloat>, ptr [[IN]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x bfloat> [[IN1]] to <4 x half>
+// CHECK-NEXT:store <4 x half> [[TMP0]], ptr [[RETVAL]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8
+// CHECK-NEXT:ret double [[TMP1]]
+//
+half4 test_cast_from_bfloat4_to_half4(bfloat4 in) {
+  return (half4)in;
+}
+
+
+// CHECK-LABEL: define dso_local i32 @test_convertvector_from_half2_to_bfloat2(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext <2 x half> [[IN1]] to <2 x float>
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc <2 x float> [[FPEXT]] to <2 x 
bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[FPTRUNC]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load i32,

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-26 Thread Matt Arsenault via cfe-commits



@@ -1906,7 +1909,15 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {

arsenm wrote:

This should be a separate change if it's correct 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-26 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+

arsenm wrote:

The regular C case should still test the vector conversions. Also the rebase is 
confusing me 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-26 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From 0afac9d8a6acedff53089f55eacb92a2880f58aa Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp| 15 +++-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 25 +++
 .../test/CodeGenHIP/bfloat16-half-convert.hip | 71 +++
 3 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c
 create mode 100644 clang/test/CodeGenHIP/bfloat16-half-convert.hip

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..8e35c801bc9599 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,7 +1431,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
 }
@@ -1906,7 +1909,15 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {
+  auto *ScrVecTy = cast(SrcTy);
+  Value *FloatVal = Builder.CreateFPExt(
+  Src,
+  llvm::VectorType::get(Builder.getFloatTy(),
+ScrVecTy->getElementCount()),
+  "fpext");
+  Res = Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+} else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..55451dc6f092cd
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
diff --git a/clang/test/CodeGenHIP/bfloat16-half-convert.hip 
b/clang/test/CodeGenHIP/bfloat16-half-convert.hip
new file mode 100644
index 00..0ffebb44c969b4
--- /dev/null
+++ b/clang/test/CodeGenHIP/bfloat16-half-convert.hip
@@ -0,0 +1,71 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -x hip -disable-O0-optnone 
-emit-llvm -fcuda-is-device \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+#define __device__ __attribute__((device))
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local noundef <2 x bfloat> 
@_Z40test_convertvector_from_half2_to_bfloat2Dv2_DF16_
+// CHECK-SAME: (<2 x half> noundef [[IN:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4, addrspace(5)
+// CHECK-NEXT:[[IN_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) 
[[IN_ADDR]] to ptr
+// CHECK-NEXT:store <2 x half> [[IN]], ptr [[IN_ADDR_ASCAST]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR_ASCAST]], 
align 4
+// CHECK-NEXT:[[FPEXT:%.*]] =

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-24 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,165 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_half2_to_bfloat2(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x half> [[IN1]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[TMP0]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+bfloat2 test_cast_from_half2_to_bfloat2(half2 in) {
+  return (bfloat2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_half4_to_bfloat4(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8
+// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8
+// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x half> [[IN1]] to <4 x bfloat>
+// CHECK-NEXT:store <4 x bfloat> [[TMP0]], ptr [[RETVAL]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8
+// CHECK-NEXT:ret double [[TMP1]]
+//
+bfloat4 test_cast_from_half4_to_bfloat4(half4 in) {
+  return (bfloat4)in;
+}
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_bfloat2_to_half2(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x bfloat>, ptr [[IN]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x bfloat> [[IN1]] to <2 x half>
+// CHECK-NEXT:store <2 x half> [[TMP0]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+half2 test_cast_from_bfloat2_to_half2(bfloat2 in) {
+  return (half2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_bfloat4_to_half4(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:[[IN:%.*]] = alloca <4 x bfloat>, align 8
+// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8
+// CHECK-NEXT:[[IN1:%.*]] = load <4 x bfloat>, ptr [[IN]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x bfloat> [[IN1]] to <4 x half>
+// CHECK-NEXT:store <4 x half> [[TMP0]], ptr [[RETVAL]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8
+// CHECK-NEXT:ret double [[TMP1]]
+//
+half4 test_cast_from_bfloat4_to_half4(bfloat4 in) {
+  return (half4)in;
+}
+
+
+// CHECK-LABEL: define dso_local i32 @test_convertvector_from_half2_to_bfloat2(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext <2 x half> [[IN1]] to <2 x float>
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc <2 x float> [[FPEXT]] to <2 x 
bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[FPTRUNC]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load i32,

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-24 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From 31ced11517042bcbd6f5f6e544cadf6943c1b1c0 Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp|  15 +-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 165 ++
 2 files changed, 178 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..8e35c801bc9599 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,7 +1431,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
 }
@@ -1906,7 +1909,15 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {
+  auto *ScrVecTy = cast(SrcTy);
+  Value *FloatVal = Builder.CreateFPExt(
+  Src,
+  llvm::VectorType::get(Builder.getFloatTy(),
+ScrVecTy->getElementCount()),
+  "fpext");
+  Res = Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+} else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..3f60bd13b33d48
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,165 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_half2_to_bfloat2(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x half> [[IN1]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[TMP0]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+bfloat2 test_cast_from_half2_to_bfloat2(half2 in) {
+  return (bfloat2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_half4_to_bfloat4(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8
+// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8
+// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8
+// CHECK-NEXT:[[TMP0:%

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-24 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From 45c6985815f7896c09c1be1eefc10cd4f9cd35af Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp|  15 +-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 165 ++
 2 files changed, 178 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..8e35c801bc9599 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,7 +1431,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
 }
@@ -1906,7 +1909,15 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {
+  auto *ScrVecTy = cast(SrcTy);
+  Value *FloatVal = Builder.CreateFPExt(
+  Src,
+  llvm::VectorType::get(Builder.getFloatTy(),
+ScrVecTy->getElementCount()),
+  "fpext");
+  Res = Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+} else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..85ebe8502bc033
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,165 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone 
-emit-llvm \
+// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_half2_to_bfloat2(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x half> [[IN1]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[TMP0]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+bfloat2 test_cast_from_half2_to_bfloat2(half2 in) {
+  return (bfloat2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_half4_to_bfloat4(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8
+// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8
+// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8
+// CHECK-NEXT:[[TMP0:%.*

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-23 Thread Matt Arsenault via cfe-commits



@@ -1431,9 +1431,13 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())

arsenm wrote:

No else after return (although it's already violated in the existing code) 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-23 Thread Matt Arsenault via cfe-commits


https://github.com/arsenm approved this pull request.

LGTM. Would be good to verify the vector case is "correct" in as far as it's 
what GCC does 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-23 Thread Matt Arsenault via cfe-commits


https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-23 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,194 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half
+// CHECK-NEXT:ret half [[CONV1]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV1]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>

arsenm wrote:

Does GCC have the same behavior for the bfloat x half case? 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-22 Thread Pengcheng Wang via cfe-commits



@@ -0,0 +1,194 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s

wangpc-pp wrote:

```suggestion
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-disable-O0-optnone -emit-llvm %s -o - | opt -S -passes=mem2reg | FileCheck %s
```
We can run `mem2reg` to reduce CHECKs.

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-22 Thread Eli Friedman via cfe-commits


efriedma-quic wrote:

Looks like automation didn't trigger, but:

> ⚠️ We detected that you are using a GitHub private e-mail address to 
> contribute to the repo.
> Please turn off [Keep my email addresses 
> private](https://github.com/settings/emails) setting in your account.
> See [LLVM 
> Discourse](https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it)
>  for more information.

(I'll also wait a bit to give @arsenm a chance to respond.)

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-22 Thread Eli Friedman via cfe-commits



@@ -0,0 +1,194 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half
+// CHECK-NEXT:ret half [[CONV1]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV1]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>

efriedma-quic wrote:

The CastKind in the AST is BitCast.  This is inherited from old GNU vector_type 
semantics.  Whether or not that's correct is orthogonal to this patch.

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-22 Thread Eli Friedman via cfe-commits


https://github.com/efriedma-quic approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-22 Thread Eli Friedman via cfe-commits


https://github.com/efriedma-quic edited 
https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-22 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,194 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half
+// CHECK-NEXT:ret half [[CONV1]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV1]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[TMP1]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP2]]
+//
+bfloat2 test_cast_from_fp162_to_bf162(half2 in) {
+  return (bfloat2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_fp164_to_bf164(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8
+// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8
+// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8
+// CHECK-NEXT:store <4 x half> [[IN1]], ptr [[IN_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load <4 x half>, ptr [[IN_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <4 x half> [[TMP0]] to <4 x bfloat>
+// CHECK-NEXT:store <4 x bfloat> [[TMP1]], ptr [[RETVAL]], align 8
+// CHECK-NEXT:[[TMP2:%.*]] = load double, ptr [[RETVAL]], align 8
+// CHECK-NEXT:ret double [[TMP2]]
+//
+bfloat4 test_cast_from_fp164_to_bf164(half4 in) {
+  return (bfloat4)in;
+}
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_bf162_to_fp162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x bfloat>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x bfloat> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x bfloat>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x bfloat> [[TMP0]] to <2 x half>
+// CHECK-NEXT:store <2 x half> [[TMP1]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP2]]
+//
+half2 test_cast_from_bf162_to_fp162(bfloat2 in) {

arsenm wrote:

Rename test to match the new typedef name 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-22 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,194 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half
+// CHECK-NEXT:ret half [[CONV1]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV1]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[TMP1]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP2]]
+//
+bfloat2 test_cast_from_fp162_to_bf162(half2 in) {
+  return (bfloat2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_fp164_to_bf164(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8
+// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <4 x half>, align 8
+// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8
+// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8
+// CHECK-NEXT:store <4 x half> [[IN1]], ptr [[IN_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load <4 x half>, ptr [[IN_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <4 x half> [[TMP0]] to <4 x bfloat>
+// CHECK-NEXT:store <4 x bfloat> [[TMP1]], ptr [[RETVAL]], align 8
+// CHECK-NEXT:[[TMP2:%.*]] = load double, ptr [[RETVAL]], align 8
+// CHECK-NEXT:ret double [[TMP2]]
+//
+bfloat4 test_cast_from_fp164_to_bf164(half4 in) {
+  return (bfloat4)in;
+}
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_bf162_to_fp162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x bfloat>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x bfloat> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x bfloat>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x bfloat> [[TMP0]] to <2 x half>

arsenm wrote:

This bitcast also doesn't look right. I'm shocked that the vector cast behavior 
seems to treat FP-to-int vectors as bitcast, radically different from the 
scalar case (which OpenCL doesn't even allow). 

The comment says it's allowing bitcast between fp/int of the same size, but 
that's not really what the cast is here. 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-22 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,194 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half
+// CHECK-NEXT:ret half [[CONV1]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV1]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>

arsenm wrote:

The vector case is still emitting the incorrect bitcast 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread via cfe-commits


JinjinLi868 wrote:

> For the vector case, I think you want __builtin_convertvector or something 
> like that

thanks, i have add  __builtin_convertvector testcase

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From f61686e42906886a0686158b3050767e60b576fa Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp|  18 +-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 194 ++
 2 files changed, 209 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..1fbeb37de5de60 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,13 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+return Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination 
type,
@@ -1906,7 +1910,15 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {
+  auto *ScrVecTy = cast(SrcTy);
+  Value *FloatVal = Builder.CreateFPExt(
+  Src,
+  llvm::VectorType::get(Builder.getFloatTy(),
+ScrVecTy->getElementCount()),
+  "conv");
+  Res = Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+} else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..a1b948c873e064
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,194 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half
+// CHECK-NEXT:ret half [[CONV1]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV1]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread Eli Friedman via cfe-commits


efriedma-quic wrote:

For the vector case, I think you want __builtin_convertvector or something like 
that

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread via cfe-commits


JinjinLi868 wrote:

> > But In some target, it supply a HW instruction to complete the process 
> > (fp16->float32->bf16) . so it just supply a intrinsic (fp16 -> bf16)
> 
> Which is not a bitcast. The correct IR representation of this conversion is 
> fpext+fptrunc

i understand, i have changed. but for the vector convert case, it is also a 
bitcast IR

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From 9e6c2a16172c66b7a9eec7957d95b4239f178368 Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp|  15 ++-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 113 ++
 2 files changed, 125 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..d4c60a2a7ffcaf 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,15 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+// Value *Res = Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+// return Res;
+return Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination 
type,
@@ -1906,7 +1912,10 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {
+  Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+  Res = Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+} else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..ad12bd3f654175
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,113 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half
+// CHECK-NEXT:ret half [[CONV1]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV1]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread Matt Arsenault via cfe-commits


arsenm wrote:

> But In some target, it supply a HW instruction to complete the process 
> (fp16->float32->bf16) . so it just supply a intrinsic (fp16 -> bf16)

Which is not a bitcast. The correct IR representation of this conversion is 
fpext+fptrunc 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From a8d31ec6602f55f845b9e508f71a42f83e3a474e Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp|  11 +-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 111 ++
 2 files changed, 119 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..da5a410f040d1b 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy()))
+return Builder.CreateBitCast(Src, DstTy, "conv");
+  else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination 
type,
@@ -1906,7 +1909,9 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy()))
+  Res = Builder.CreateBitCast(Src, DstTy, "conv");
+else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..048789d3adecc8
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,111 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:ret half [[CONV]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast half [[TMP0]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 half2 __attribute__((ext_vector_type(2)));
+typedef _Float16 half4 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 bfloat2 __attribute__((ext_vector_type(2)));
+typedef __bf16 bfloat4 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[TMP1]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP2]]
+//
+bfloat2 test_cast_from_fp162_to_bf162(half2 in) {
+  return (bfloat2)in;
+}
+
+
+// CHECK-LABEL: define dso_local double @test_cast_from_fp164_to_bf164(
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[A

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,109 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:ret half [[CONV]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast half [[TMP0]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 _Float162 __attribute__((ext_vector_type(2)));
+typedef _Float16 _Float164 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 __bf162 __attribute__((ext_vector_type(2)));
+typedef __bf16 __bf164 __attribute__((ext_vector_type(4)));

arsenm wrote:

Can you use a different typedef name? Maybe half2/bfloat2?

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread Matt Arsenault via cfe-commits


https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread Matt Arsenault via cfe-commits


https://github.com/arsenm requested changes to this pull request.

Bitcast is not the correct behavior 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-18 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,109 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:ret half [[CONV]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast half [[TMP0]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 _Float162 __attribute__((ext_vector_type(2)));
+typedef _Float16 _Float164 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 __bf162 __attribute__((ext_vector_type(2)));
+typedef __bf16 __bf164 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[CONV]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+__bf162 test_cast_from_fp162_to_bf162(_Float162 in) {
+  return __builtin_convertvector(in, __bf162);

arsenm wrote:

I meant test the raw cast, not this builtin 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From 8d0ed7653c21315c5c920114b3a2d7686e54b9ca Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc
and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp|  11 +-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 109 ++
 2 files changed, 117 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..da5a410f040d1b 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy()))
+return Builder.CreateBitCast(Src, DstTy, "conv");
+  else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination 
type,
@@ -1906,7 +1909,9 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy()))
+  Res = Builder.CreateBitCast(Src, DstTy, "conv");
+else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..138793c242db16
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,109 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:ret half [[CONV]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast half [[TMP0]] to bfloat
+// CHECK-NEXT:ret bfloat [[CONV]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+return (__bf16)a;
+}
+
+typedef _Float16 _Float162 __attribute__((ext_vector_type(2)));
+typedef _Float16 _Float164 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 __bf162 __attribute__((ext_vector_type(2)));
+typedef __bf16 __bf164 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[CONV]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+__bf162 test_cast_from_fp162_to_bf162(_Float162 in) {
+  return __builtin_convertvector(in, __bf162);
+}
+
+// CHECK-LABEL: define dso_local double @test_cast_from_fp164_to_bf164(
+// CHECK-SAME: dou

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread via cfe-commits

JinjinLi868 wrote:

> > This appears to just assert today, but interpreting this as bitcast doesn't 
> > make sense. I would expect this to emit a pair of casts, fpext to float, 
> > and fptrunc down to half
> 
> If we don't just reject it as an invalid cast

i understand your means, it like a Hardware behavior to do fp16 convert to 
bf16（firstly, fp16 fpext to float32, second , float32 fptrunc to bfloat16).  
But In some target, it supply a HW instruction to complete the process  
(fp16->float32->bf16) . so it just supply a intrinsic (fp16 -> bf16)

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread via cfe-commits


JinjinLi868 wrote:

> This appears to just assert today, but interpreting this as bitcast doesn't 
> make sense. I would expect this to emit a pair of casts, fpext to float, and 
> fptrunc down to half



https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread Matt Arsenault via cfe-commits


arsenm wrote:

> This appears to just assert today, but interpreting this as bitcast doesn't 
> make sense. I would expect this to emit a pair of casts, fpext to float, and 
> fptrunc down to half

If we don't just reject it as an invalid cast 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread Matt Arsenault via cfe-commits


https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread Matt Arsenault via cfe-commits



@@ -0,0 +1,14 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:ret half [[CONV]]
+//
+_Float16 test_convert(__bf16 a) {
+return (_Float16)a;
+}

arsenm wrote:

Can you also test some vector cases? 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread Matt Arsenault via cfe-commits


https://github.com/arsenm commented:

This appears to just assert today, but interpreting this as bitcast doesn't 
make sense. I would expect this to emit a pair of casts, fpext to float, and 
fptrunc down to half 

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread via cfe-commits


https://github.com/JinjinLi868 updated 
https://github.com/llvm/llvm-project/pull/89051

>From 69a584119d8978d0ea3177c59d8772f00df3a68e Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend 
nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp | 11 ---
 clang/test/CodeGen/X86/bfloat16-convert-half.c | 14 ++
 2 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..da5a410f040d1b 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy()))
+return Builder.CreateBitCast(Src, DstTy, "conv");
+  else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination 
type,
@@ -1906,7 +1909,9 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy()))
+  Res = Builder.CreateBitCast(Src, DstTy, "conv");
+else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c 
b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..4a13e2c33c8149
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,14 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 
-S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:ret half [[CONV]]
+//
+_Float16 test_convert(__bf16 a) {
+return (_Float16)a;
+}

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (JinjinLi868)


Changes

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend 
nodes, but they are actually bitcast nodes.


---
Full diff: https://github.com/llvm/llvm-project/pull/89051.diff


1 Files Affected:

- (modified) clang/lib/CodeGen/CGExprScalar.cpp (+8-3) 


``diff
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..da5a410f040d1b 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy()))
+return Builder.CreateBitCast(Src, DstTy, "conv");
+  else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination 
type,
@@ -1906,7 +1909,9 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy()))
+  Res = Builder.CreateBitCast(Src, DstTy, "conv");
+else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");

``




https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread via cfe-commits


github-actions[bot] wrote:



Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this 
page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using `@` followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from 
other developers.

If you have further questions, they may be answered by the [LLVM GitHub User 
Guide](https://llvm.org/docs/GitHub.html).

You can also ask questions in a comment on this PR, on the [LLVM 
Discord](https://discord.com/invite/xS7Z362) or on the 
[forums](https://discourse.llvm.org/).

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)

2024-04-17 Thread via cfe-commits


https://github.com/JinjinLi868 created 
https://github.com/llvm/llvm-project/pull/89051

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend 
nodes, but they are actually bitcast nodes.


>From 02c11a9db49dd34839feb8329cfb8dbe4bc45763 Mon Sep 17 00:00:00 2001
From: Jinjin Li 
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend 
nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..da5a410f040d1b 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy()))
+return Builder.CreateBitCast(Src, DstTy, "conv");
+  else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
 return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination 
type,
@@ -1906,7 +1909,9 @@ Value 
*ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
 assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
"Unknown real conversion");
-if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy()))
+  Res = Builder.CreateBitCast(Src, DstTy, "conv");
+else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
   Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
 else
   Res = Builder.CreateFPExt(Src, DstTy, "conv");

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

68 matches

Mail list logo