[PATCH] D79710: [clang][BFloat] Add create/set/get/dup intrinsics

2020-06-05 Thread Ties Stuij via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8b137a430636: [clang][BFloat] Add create/set/get/dup 
intrinsics (authored by stuij).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
  clang/test/CodeGen/arm-bf16-getset-intrinsics.c

Index: clang/test/CodeGen/arm-bf16-getset-intrinsics.c
===
--- /dev/null
+++ clang/test/CodeGen/arm-bf16-getset-intrinsics.c
@@ -0,0 +1,151 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple armv8.6a-arm-none-eabi -target-feature +neon -target-feature +bf16 -mfloat-abi hard \
+// RUN:  -disable-O0-optnone -emit-llvm %s -o - | opt -S -mem2reg -instcombine | FileCheck %s
+
+#include <arm_neon.h>
+
+// CHECK-LABEL: @test_vcreate_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast i64 [[A:%.*]] to <4 x bfloat>
+// CHECK-NEXT:ret <4 x bfloat> [[TMP0]]
+//
+bfloat16x4_t test_vcreate_bf16(uint64_t a) {
+  return vcreate_bf16(a);
+}
+
+// CHECK-LABEL: @test_vdup_n_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <4 x bfloat> undef, bfloat [[V:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT3_I:%.*]] = shufflevector <4 x bfloat> [[VECINIT_I]], <4 x bfloat> undef, <4 x i32> zeroinitializer
+// CHECK-NEXT:ret <4 x bfloat> [[VECINIT3_I]]
+//
+bfloat16x4_t test_vdup_n_bf16(bfloat16_t v) {
+  return vdup_n_bf16(v);
+}
+
+// CHECK-LABEL: @test_vdupq_n_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <8 x bfloat> undef, bfloat [[V:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT7_I:%.*]] = shufflevector <8 x bfloat> [[VECINIT_I]], <8 x bfloat> undef, <8 x i32> zeroinitializer
+// CHECK-NEXT:ret <8 x bfloat> [[VECINIT7_I]]
+//
+bfloat16x8_t test_vdupq_n_bf16(bfloat16_t v) {
+  return vdupq_n_bf16(v);
+}
+
+// CHECK-LABEL: @test_vdup_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <4 x bfloat> [[V:%.*]], <4 x bfloat> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
+// CHECK-NEXT:ret <4 x bfloat> [[LANE]]
+//
+bfloat16x4_t test_vdup_lane_bf16(bfloat16x4_t v) {
+  return vdup_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vdupq_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <4 x bfloat> [[V:%.*]], <4 x bfloat> undef, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
+// CHECK-NEXT:ret <8 x bfloat> [[LANE]]
+//
+bfloat16x8_t test_vdupq_lane_bf16(bfloat16x4_t v) {
+  return vdupq_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vdup_laneq_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <8 x bfloat> [[V:%.*]], <8 x bfloat> undef, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
+// CHECK-NEXT:ret <4 x bfloat> [[LANE]]
+//
+bfloat16x4_t test_vdup_laneq_bf16(bfloat16x8_t v) {
+  return vdup_laneq_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vdupq_laneq_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <8 x bfloat> [[V:%.*]], <8 x bfloat> undef, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
+// CHECK-NEXT:ret <8 x bfloat> [[LANE]]
+//
+bfloat16x8_t test_vdupq_laneq_bf16(bfloat16x8_t v) {
+  return vdupq_laneq_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vcombine_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <4 x bfloat> [[LOW:%.*]], <4 x bfloat> [[HIGH:%.*]], <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+// CHECK-NEXT:ret <8 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x8_t test_vcombine_bf16(bfloat16x4_t low, bfloat16x4_t high) {
+  return vcombine_bf16(low, high);
+}
+
+// CHECK-LABEL: @test_vget_high_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <8 x bfloat> [[A:%.*]], <8 x bfloat> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+// CHECK-NEXT:ret <4 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x4_t test_vget_high_bf16(bfloat16x8_t a) {
+  return vget_high_bf16(a);
+}
+
+// CHECK-LABEL: @test_vget_low_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <8 x bfloat> [[A:%.*]], <8 x bfloat> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+// CHECK-NEXT:ret <4 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x4_t test_vget_low_bf16(bfloat16x8_t a) {
+  return vget_low_bf16(a);
+}
+
+// CHECK-LABEL: @test_vget_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DOTCAST1:%.*]] = extractelement <4 x bfloat> [[V:%.*]], i32 1
+// CHECK-NEXT:ret bfloat [[DOTCAST1]]
+//
+bfloat16_t test_vget_lane_bf16(bfloat16x4_t v) {
+  return vget_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vgetq_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DOTCAST1:%.*]] = extractelement <8 x bfloat> [[V:%.*]], i32 7
+// CHECK-NEXT:ret bfloat [[DOTCAST1]]
+//
+bfloat16_t test_vgetq_lane_bf16(bfloat16x8_t v) {
+  return vgetq_lane_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vset_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x bfloat> 

[PATCH] D79710: [clang][BFloat] Add create/set/get/dup intrinsics

2020-06-04 Thread Mikhail Maltsev via Phabricator via cfe-commits
miyuki updated this revision to Diff 268528.
miyuki added a comment.

Fixed the "RUN:" line in the A32 test, implemented vduph_lane for A32.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
  clang/test/CodeGen/arm-bf16-getset-intrinsics.c

Index: clang/test/CodeGen/arm-bf16-getset-intrinsics.c
===
--- /dev/null
+++ clang/test/CodeGen/arm-bf16-getset-intrinsics.c
@@ -0,0 +1,151 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple armv8.6a-arm-none-eabi -target-feature +neon -target-feature +bf16 -mfloat-abi hard \
+// RUN:  -disable-O0-optnone -emit-llvm %s -o - | opt -S -mem2reg -instcombine | FileCheck %s
+
+#include <arm_neon.h>
+
+// CHECK-LABEL: @test_vcreate_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast i64 [[A:%.*]] to <4 x bfloat>
+// CHECK-NEXT:ret <4 x bfloat> [[TMP0]]
+//
+bfloat16x4_t test_vcreate_bf16(uint64_t a) {
+  return vcreate_bf16(a);
+}
+
+// CHECK-LABEL: @test_vdup_n_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <4 x bfloat> undef, bfloat [[V:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT3_I:%.*]] = shufflevector <4 x bfloat> [[VECINIT_I]], <4 x bfloat> undef, <4 x i32> zeroinitializer
+// CHECK-NEXT:ret <4 x bfloat> [[VECINIT3_I]]
+//
+bfloat16x4_t test_vdup_n_bf16(bfloat16_t v) {
+  return vdup_n_bf16(v);
+}
+
+// CHECK-LABEL: @test_vdupq_n_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <8 x bfloat> undef, bfloat [[V:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT7_I:%.*]] = shufflevector <8 x bfloat> [[VECINIT_I]], <8 x bfloat> undef, <8 x i32> zeroinitializer
+// CHECK-NEXT:ret <8 x bfloat> [[VECINIT7_I]]
+//
+bfloat16x8_t test_vdupq_n_bf16(bfloat16_t v) {
+  return vdupq_n_bf16(v);
+}
+
+// CHECK-LABEL: @test_vdup_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <4 x bfloat> [[V:%.*]], <4 x bfloat> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
+// CHECK-NEXT:ret <4 x bfloat> [[LANE]]
+//
+bfloat16x4_t test_vdup_lane_bf16(bfloat16x4_t v) {
+  return vdup_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vdupq_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <4 x bfloat> [[V:%.*]], <4 x bfloat> undef, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
+// CHECK-NEXT:ret <8 x bfloat> [[LANE]]
+//
+bfloat16x8_t test_vdupq_lane_bf16(bfloat16x4_t v) {
+  return vdupq_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vdup_laneq_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <8 x bfloat> [[V:%.*]], <8 x bfloat> undef, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
+// CHECK-NEXT:ret <4 x bfloat> [[LANE]]
+//
+bfloat16x4_t test_vdup_laneq_bf16(bfloat16x8_t v) {
+  return vdup_laneq_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vdupq_laneq_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <8 x bfloat> [[V:%.*]], <8 x bfloat> undef, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
+// CHECK-NEXT:ret <8 x bfloat> [[LANE]]
+//
+bfloat16x8_t test_vdupq_laneq_bf16(bfloat16x8_t v) {
+  return vdupq_laneq_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vcombine_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <4 x bfloat> [[LOW:%.*]], <4 x bfloat> [[HIGH:%.*]], <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+// CHECK-NEXT:ret <8 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x8_t test_vcombine_bf16(bfloat16x4_t low, bfloat16x4_t high) {
+  return vcombine_bf16(low, high);
+}
+
+// CHECK-LABEL: @test_vget_high_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <8 x bfloat> [[A:%.*]], <8 x bfloat> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+// CHECK-NEXT:ret <4 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x4_t test_vget_high_bf16(bfloat16x8_t a) {
+  return vget_high_bf16(a);
+}
+
+// CHECK-LABEL: @test_vget_low_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <8 x bfloat> [[A:%.*]], <8 x bfloat> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+// CHECK-NEXT:ret <4 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x4_t test_vget_low_bf16(bfloat16x8_t a) {
+  return vget_low_bf16(a);
+}
+
+// CHECK-LABEL: @test_vget_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DOTCAST1:%.*]] = extractelement <4 x bfloat> [[V:%.*]], i32 1
+// CHECK-NEXT:ret bfloat [[DOTCAST1]]
+//
+bfloat16_t test_vget_lane_bf16(bfloat16x4_t v) {
+  return vget_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vgetq_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DOTCAST1:%.*]] = extractelement <8 x bfloat> [[V:%.*]], i32 7
+// CHECK-NEXT:ret bfloat [[DOTCAST1]]
+//
+bfloat16_t test_vgetq_lane_bf16(bfloat16x8_t v) {
+  return vgetq_lane_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vset_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x bfloat> [[V:%.*]], bfloat [[A:%.*]], i32 1
+// CHECK-NEXT:ret <4 x bfloat> [[TMP0]]

[PATCH] D79710: [clang][BFloat] Add create/set/get/dup intrinsics

2020-06-03 Thread Mikhail Maltsev via Phabricator via cfe-commits
miyuki updated this revision to Diff 268162.
miyuki retitled this revision from "[clang][BFloat] add create/set/get/dup 
intrinsics" to "[clang][BFloat] Add create/set/get/dup intrinsics".
miyuki edited the summary of this revision.
miyuki added a comment.

Addressed reviewers' comments.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
  clang/test/CodeGen/arm-bf16-getset-intrinsics.c

Index: clang/test/CodeGen/arm-bf16-getset-intrinsics.c
===
--- /dev/null
+++ clang/test/CodeGen/arm-bf16-getset-intrinsics.c
@@ -0,0 +1,151 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-feature +neon -target-feature +bf16 \
+// RUN:  -disable-O0-optnone -emit-llvm %s -o - | opt -S -mem2reg -instcombine | FileCheck %s
+
+#include <arm_neon.h>
+
+// CHECK-LABEL: @test_vcreate_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast i64 [[A:%.*]] to <4 x bfloat>
+// CHECK-NEXT:ret <4 x bfloat> [[TMP0]]
+//
+bfloat16x4_t test_vcreate_bf16(uint64_t a) {
+  return vcreate_bf16(a);
+}
+
+// CHECK-LABEL: @test_vdup_n_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <4 x bfloat> undef, bfloat [[V:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT3_I:%.*]] = shufflevector <4 x bfloat> [[VECINIT_I]], <4 x bfloat> undef, <4 x i32> zeroinitializer
+// CHECK-NEXT:ret <4 x bfloat> [[VECINIT3_I]]
+//
+bfloat16x4_t test_vdup_n_bf16(bfloat16_t v) {
+  return vdup_n_bf16(v);
+}
+
+// CHECK-LABEL: @test_vdupq_n_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <8 x bfloat> undef, bfloat [[V:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT7_I:%.*]] = shufflevector <8 x bfloat> [[VECINIT_I]], <8 x bfloat> undef, <8 x i32> zeroinitializer
+// CHECK-NEXT:ret <8 x bfloat> [[VECINIT7_I]]
+//
+bfloat16x8_t test_vdupq_n_bf16(bfloat16_t v) {
+  return vdupq_n_bf16(v);
+}
+
+// CHECK-LABEL: @test_vdup_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <4 x bfloat> [[V:%.*]], <4 x bfloat> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
+// CHECK-NEXT:ret <4 x bfloat> [[LANE]]
+//
+bfloat16x4_t test_vdup_lane_bf16(bfloat16x4_t v) {
+  return vdup_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vdupq_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <4 x bfloat> [[V:%.*]], <4 x bfloat> undef, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
+// CHECK-NEXT:ret <8 x bfloat> [[LANE]]
+//
+bfloat16x8_t test_vdupq_lane_bf16(bfloat16x4_t v) {
+  return vdupq_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vdup_laneq_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <8 x bfloat> [[V:%.*]], <8 x bfloat> undef, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
+// CHECK-NEXT:ret <4 x bfloat> [[LANE]]
+//
+bfloat16x4_t test_vdup_laneq_bf16(bfloat16x8_t v) {
+  return vdup_laneq_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vdupq_laneq_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[LANE:%.*]] = shufflevector <8 x bfloat> [[V:%.*]], <8 x bfloat> undef, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
+// CHECK-NEXT:ret <8 x bfloat> [[LANE]]
+//
+bfloat16x8_t test_vdupq_laneq_bf16(bfloat16x8_t v) {
+  return vdupq_laneq_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vcombine_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <4 x bfloat> [[LOW:%.*]], <4 x bfloat> [[HIGH:%.*]], <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+// CHECK-NEXT:ret <8 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x8_t test_vcombine_bf16(bfloat16x4_t low, bfloat16x4_t high) {
+  return vcombine_bf16(low, high);
+}
+
+// CHECK-LABEL: @test_vget_high_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <8 x bfloat> [[A:%.*]], <8 x bfloat> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+// CHECK-NEXT:ret <4 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x4_t test_vget_high_bf16(bfloat16x8_t a) {
+  return vget_high_bf16(a);
+}
+
+// CHECK-LABEL: @test_vget_low_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[SHUFFLE_I:%.*]] = shufflevector <8 x bfloat> [[A:%.*]], <8 x bfloat> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+// CHECK-NEXT:ret <4 x bfloat> [[SHUFFLE_I]]
+//
+bfloat16x4_t test_vget_low_bf16(bfloat16x8_t a) {
+  return vget_low_bf16(a);
+}
+
+// CHECK-LABEL: @test_vget_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DOTCAST1:%.*]] = extractelement <4 x bfloat> [[V:%.*]], i32 1
+// CHECK-NEXT:ret bfloat [[DOTCAST1]]
+//
+bfloat16_t test_vget_lane_bf16(bfloat16x4_t v) {
+  return vget_lane_bf16(v, 1);
+}
+
+// CHECK-LABEL: @test_vgetq_lane_bf16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DOTCAST1:%.*]] = extractelement <8 x bfloat> [[V:%.*]], i32 7
+// CHECK-NEXT:ret bfloat [[DOTCAST1]]
+//
+bfloat16_t test_vgetq_lane_bf16(bfloat16x8_t v) {
+  return vgetq_lane_bf16(v, 7);
+}
+
+// CHECK-LABEL: @test_vset_lane_bf16(
+// CHECK-NEXT:  entry:
+// 

[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-06-03 Thread Mikhail Maltsev via Phabricator via cfe-commits
miyuki added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1860
+
+  def VGET_HIGH_BF : NoTestOpInst<"vget_high", ".Q", "b", OP_HI>;
+  def VGET_LOW_BF  : NoTestOpInst<"vget_low", ".Q", "b", OP_LO>;

dmgreen wrote:
> Do you know what InstName = "vmov" does, and is it needed here?
> 
> I'm pretty sure it's a vestige of an earlier implementation of the neon 
> emitter.
It looks like `InstName` was removed from NeonEmitter in
```
commit dee4ab08ba956efd76aa10da46510dcddecceacf
Author: James Molloy 
Date:   Tue Jun 17 13:11:27 2014 +

Rewrite ARM NEON intrinsic emission completely.
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-06-02 Thread Ties Stuij via Phabricator via cfe-commits
stuij marked an inline comment as done.
stuij added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1854
+  def VDUP_LANE_BF : WOpInst<"vdup_lane", ".qI", "bQb", OP_DUP_LN>;
+  def VDUP_LANEQ_BF: WOpInst<"vdup_laneq", ".QI", "bQb", OP_DUP_LN> {
+let isLaneQ = 1;

labrinea wrote:
> My local build points here with:
> `arm_neon.td:1926:3: error: No compatible intrinsic found - looking up 
> intrinsic 'splat_laneq(bfloat16x8_t, int32_t)'`
> 
> 
Thanks, yes we need to upstream another patch to match the upstreamed work 
already done for other types.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-06-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1854
+  def VDUP_LANE_BF : WOpInst<"vdup_lane", ".qI", "bQb", OP_DUP_LN>;
+  def VDUP_LANEQ_BF: WOpInst<"vdup_laneq", ".QI", "bQb", OP_DUP_LN> {
+let isLaneQ = 1;

My local build points here with:
`arm_neon.td:1926:3: error: No compatible intrinsic found - looking up 
intrinsic 'splat_laneq(bfloat16x8_t, int32_t)'`




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-29 Thread Dave Green via Phabricator via cfe-commits
dmgreen added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1860
+
+  def VGET_HIGH_BF : NoTestOpInst<"vget_high", ".Q", "b", OP_HI>;
+  def VGET_LOW_BF  : NoTestOpInst<"vget_low", ".Q", "b", OP_LO>;

Do you know what InstName = "vmov" does, and is it needed here?

I'm pretty sure it's a vestige of an earlier implementation of the neon emitter.



Comment at: clang/include/clang/Basic/arm_neon.td:1867
+  def SCALAR_VDUP_LANE_BF : IInst<"vdup_lane", "1.I", "Sb">;
+  def SCALAR_VDUP_LANEQ_BF : IInst<"vdup_laneq", "1QI", "Sb">;
+}

Does this need let isLaneQ = 1, like the other vdup_laneq's?
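
For context, a portable C sketch of how the `lane` and `laneq` forms differ in the tests of this patch: the `laneq` form reads from an 8-lane source, so its index goes up to 7 rather than 3. This is an illustration only, not the arm_neon.h implementation. The types and function names (`bf16x4_model`, `bf16x8_model`, `model_dup_lane`, `model_dup_laneq`) are hypothetical, lanes are modelled as raw 16-bit patterns, and the exact effect of `isLaneQ` inside the NEON emitter is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for bfloat16x4_t / bfloat16x8_t: lanes hold raw bits. */
typedef struct { uint16_t lane[4]; } bf16x4_model;
typedef struct { uint16_t lane[8]; } bf16x8_model;

/* Models vdup_lane_bf16: broadcast lane idx (0..3) of a 4-lane source. */
bf16x4_model model_dup_lane(bf16x4_model v, int idx) {
  assert(idx >= 0 && idx < 4);
  bf16x4_model r;
  for (int i = 0; i < 4; ++i)
    r.lane[i] = v.lane[idx];
  return r;
}

/* Models vdup_laneq_bf16: broadcast lane idx (0..7) of an 8-lane source
   into a 4-lane result. */
bf16x4_model model_dup_laneq(bf16x8_model v, int idx) {
  assert(idx >= 0 && idx < 8);
  bf16x4_model r;
  for (int i = 0; i < 4; ++i)
    r.lane[i] = v.lane[idx];
  return r;
}
```

The only difference between the two models is the source width and hence the valid index range, which is the property a lane-range check would need to know about.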



Comment at: clang/include/clang/Basic/arm_neon_incl.td:293
+
+  string CartesianProductWith = "";
 }

Is this needed in this patch?



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:6309
+  case NEON::BI__builtin_neon_vget_lane_bf16:
+  case NEON::BI__builtin_neon_vduph_lane_bf16:
   case NEON::BI__builtin_neon_vgetq_lane_i8:

How come these are needed for vduph_lane_bf16 if they were not needed for 
vduph_lane_f16?
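
For reference, the case labels quoted above group `vduph_lane_bf16` with `vget_lane_bf16`, which fits both being single-lane reads (the tests check an `extractelement` for the get-lane forms). A portable C model of that equivalence, assuming lanes are represented as raw 16-bit patterns; the names (`bf16x4_bits`, `model_vget_lane`, `model_vduph_lane`) are hypothetical and this is not the CGBuiltin.cpp code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for bfloat16x4_t: each lane holds raw bfloat bits. */
typedef struct { uint16_t lane[4]; } bf16x4_bits;

/* Models vget_lane_bf16: read one lane (an extractelement in the IR). */
uint16_t model_vget_lane(bf16x4_bits v, int idx) {
  assert(idx >= 0 && idx < 4); /* the real intrinsic takes a constant index */
  return v.lane[idx];
}

/* Models vduph_lane_bf16: the same single-lane read under another name. */
uint16_t model_vduph_lane(bf16x4_bits v, int idx) {
  return model_vget_lane(v, idx);
}
```

This does not answer why the f16 variant needed no such case labels; it only shows why sharing the get-lane lowering is a natural fit.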



Comment at: clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c:2
+// RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-feature +neon 
-target-feature +bf16 \
+// RUN:  -O2 -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK64
+// RUN: %clang_cc1 -triple armv8.6a-arm-none-eabi -target-feature +neon 
-target-feature +bf16 -mfloat-abi hard \

It's best to have auto generated tests if we can, and ideally not rely on 
running the entire -O2 pipeline. I think a lot of the other tests use clang ... 
-disable-O0-optnone | opt -S -mem2reg | FileCheck ...

Also, this aarch64 test is running arm codegen too? Not sure if that matters, 
but I don't immediately see it being done elsewhere.



Comment at: clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c:19-20
+// CHECK-LABEL: test_vdup_n_bf16
+// CHECK64: %vecinit.i = insertelement <4 x bfloat> undef, bfloat %v, i32 0
+// CHECK32: %vecinit.i = insertelement <4 x bfloat> undef, bfloat %v, i32 0
+// CHECK: %vecinit{{.*}} = shufflevector <4 x bfloat> %vecinit.i, <4 x bfloat> 
undef, <4 x i32> zeroinitializer

A lot of these 32 and 64 bit lines look the same.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-22 Thread Ties Stuij via Phabricator via cfe-commits
stuij updated this revision to Diff 265764.
stuij added a comment.

Moving 'CartesianProductWith' to a more apt patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/include/clang/Basic/arm_neon_incl.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c

Index: clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
@@ -0,0 +1,120 @@
+// RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-feature +neon -target-feature +bf16 \
+// RUN:  -O2 -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK64
+// RUN: %clang_cc1 -triple armv8.6a-arm-none-eabi -target-feature +neon -target-feature +bf16 -mfloat-abi hard \
+// RUN:  -O2 -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK32
+
+#include <arm_neon.h>
+
+bfloat16x4_t test_vcreate_bf16(uint64_t a) {
+  return vcreate_bf16(a);
+}
+// CHECK-LABEL: test_vcreate_bf16
+// CHECK64: %0 = bitcast i64 %a to <4 x bfloat>
+// CHECK32: %0 = bitcast i64 %a to <4 x bfloat>
+
+bfloat16x4_t test_vdup_n_bf16(bfloat16_t v) {
+  return vdup_n_bf16(v);
+}
+// CHECK-LABEL: test_vdup_n_bf16
+// CHECK64: %vecinit.i = insertelement <4 x bfloat> undef, bfloat %v, i32 0
+// CHECK32: %vecinit.i = insertelement <4 x bfloat> undef, bfloat %v, i32 0
+// CHECK: %vecinit{{.*}} = shufflevector <4 x bfloat> %vecinit.i, <4 x bfloat> undef, <4 x i32> zeroinitializer
+
+bfloat16x8_t test_vdupq_n_bf16(bfloat16_t v) {
+  return vdupq_n_bf16(v);
+}
+// CHECK-LABEL: test_vdupq_n_bf16
+// CHECK64: %vecinit.i = insertelement <8 x bfloat> undef, bfloat %v, i32 0
+// CHECK32: %vecinit.i = insertelement <8 x bfloat> undef, bfloat %v, i32 0
+// CHECK:   %vecinit{{.*}} = shufflevector <8 x bfloat> %vecinit.i, <8 x bfloat> undef, <8 x i32> zeroinitializer
+
+bfloat16x4_t test_vdup_lane_bf16(bfloat16x4_t v) {
+  return vdup_lane_bf16(v, 1);
+}
+// CHECK-LABEL: test_vdup_lane_bf16
+// CHECK64: %lane = shufflevector <4 x bfloat> %v, <4 x bfloat> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
+// CHECK32: %lane = shufflevector <4 x bfloat> %v, <4 x bfloat> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
+
+bfloat16x8_t test_vdupq_lane_bf16(bfloat16x4_t v) {
+  return vdupq_lane_bf16(v, 1);
+}
+// CHECK-LABEL: test_vdupq_lane_bf16
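+// CHECK64: %lane = shufflevector <4 x bfloat> %v, <4 x bfloat> undef, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
+// CHECK32: %lane = shufflevector <4 x bfloat> %v, <4 x bfloat> undef, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>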
+
+bfloat16x4_t test_vdup_laneq_bf16(bfloat16x8_t v) {
+  return vdup_laneq_bf16(v, 7);
+}
+// CHECK-LABEL: test_vdup_laneq_bf16
+// CHECK64: %lane = shufflevector <8 x bfloat> %v, <8 x bfloat> undef, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
+// CHECK32: %lane = shufflevector <8 x bfloat> %v, <8 x bfloat> undef, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
+
+bfloat16x8_t test_vdupq_laneq_bf16(bfloat16x8_t v) {
+  return vdupq_laneq_bf16(v, 7);
+}
+// CHECK-LABEL: test_vdupq_laneq_bf16
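+// CHECK64: %lane = shufflevector <8 x bfloat> %v, <8 x bfloat> undef, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
+// CHECK32: %lane = shufflevector <8 x bfloat> %v, <8 x bfloat> undef, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>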
+
+bfloat16x8_t test_vcombine_bf16(bfloat16x4_t low, bfloat16x4_t high) {
+  return vcombine_bf16(low, high);
+}
+// CHECK-LABEL: test_vcombine_bf16
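+// CHECK64: %shuffle.i = shufflevector <4 x bfloat> %low, <4 x bfloat> %high, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+// CHECK32: %shuffle.i = shufflevector <4 x bfloat> %low, <4 x bfloat> %high, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>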
+
+bfloat16x4_t test_vget_high_bf16(bfloat16x8_t a) {
+  return vget_high_bf16(a);
+}
+// CHECK-LABEL: test_vget_high_bf16
+// CHECK64: %shuffle.i = shufflevector <8 x bfloat> %a, <8 x bfloat> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+// CHECK32: %shuffle.i = shufflevector <8 x bfloat> %a, <8 x bfloat> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+
+bfloat16x4_t test_vget_low_bf16(bfloat16x8_t a) {
+  return vget_low_bf16(a);
+}
+// CHECK-LABEL: test_vget_low_bf16
+// CHECK64: %shuffle.i = shufflevector <8 x bfloat> %a, <8 x bfloat> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+// CHECK32: %shuffle.i = shufflevector <8 x bfloat> %a, <8 x bfloat> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+
+bfloat16_t test_vget_lane_bf16(bfloat16x4_t v) {
+  return vget_lane_bf16(v, 1);
+}
+// CHECK-LABEL: test_vget_lane_bf16
+// CHECK64: %vget_lane = extractelement <4 x bfloat> %v, i32 1
+// CHECK32: %vget_lane = extractelement <4 x bfloat> %v, i32 1
+
+bfloat16_t test_vgetq_lane_bf16(bfloat16x8_t v) {
+  return vgetq_lane_bf16(v, 7);
+}
+// CHECK-LABEL: test_vgetq_lane_bf16
+// CHECK64: %vgetq_lane = extractelement <8 x bfloat> %v, i32 7
+// CHECK32: %vget_lane = extractelement <8 x bfloat> %v, i32 7
+
+bfloat16x4_t test_vset_lane_bf16(bfloat16_t a, bfloat16x4_t v) {
+  return vset_lane_bf16(a, v, 1);
+}
+// CHECK-LABEL: test_vset_lane_bf16
+// CHECK64: %vset_lane = insertelement <4 x bfloat> %v, bfloat %a, i32 1
+// CHECK32: %vset_lane = insertelement <4 x bfloat> %v, bfloat %a, i32 1
+
+bfloat16x8_t test_vsetq_lane_bf16(bfloat16_t a, bfloat16x8_t v) {
+  return 

[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-21 Thread Ties Stuij via Phabricator via cfe-commits
stuij added a comment.

In D79710#2041418, @LukeGeeson wrote:

> Can you update the commit message in this differential as well please? Same 
> for the other commits :)


done


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-20 Thread Ties Stuij via Phabricator via cfe-commits
stuij marked an inline comment as done.
stuij added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1845
+
+// V8.2-A BFloat intrinsics
+let ArchGuard = "defined(__ARM_FEATURE_BF16_VECTOR_ARITHMETIC)" in {

labrinea wrote:
> v8.6-A ?
Yes, V8.2-A: the feature has been added to the 8.2 architecture, but in a later 
release.

For example, the section on BFloat in the Arm ARM reads:
`ARMv8.2-BF16, Armv8.2 AArch64 BFloat16 Extension` 
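
In user code the same macro can gate use of these intrinsics. A minimal sketch, assuming only the macro name from the `ArchGuard` quoted above; the helpers `bf16_vector_arith_available` and `require_bf16_vector_arith` are hypothetical and not part of this patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the compiler advertises BFloat16 vector arithmetic
   for the current target, mirroring the ArchGuard condition. */
bool bf16_vector_arith_available(void) {
#if defined(__ARM_FEATURE_BF16_VECTOR_ARITHMETIC)
  return true;
#else
  return false;
#endif
}

/* Hypothetical guard to call before taking a bf16 code path; it aborts in
   builds with assertions enabled if the feature macro was not defined. */
void require_bf16_vector_arith(void) {
  assert(bf16_vector_arith_available());
}
```

Because the check is resolved by the preprocessor, the result is fixed at compile time for a given target and feature set.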


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-20 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1845
+
+// V8.2-A BFloat intrinsics
+let ArchGuard = "defined(__ARM_FEATURE_BF16_VECTOR_ARITHMETIC)" in {

v8.6-A ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-18 Thread Luke Geeson via Phabricator via cfe-commits
LukeGeeson added a comment.

Can you update the commit message in this differential as well please? Same for 
the other commits :)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-18 Thread Ties Stuij via Phabricator via cfe-commits
stuij updated this revision to Diff 264588.
stuij added a comment.

adhere to patch attribution conventions: change author to Ties, add all the 
contributors


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c

Index: clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
@@ -0,0 +1,120 @@
+// RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-feature +neon -target-feature +bf16 \
+// RUN:  -O2 -fallow-half-arguments-and-returns -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK64
+// RUN: %clang_cc1 -triple armv8.6a-arm-none-eabi -target-feature +neon -target-feature +bf16 -mfloat-abi hard \
+// RUN:  -O2 -fallow-half-arguments-and-returns -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK32
+
+#include <arm_neon.h>
+
+bfloat16x4_t test_vcreate_bf16(uint64_t a) {
+  return vcreate_bf16(a);
+}
+// CHECK-LABEL: test_vcreate_bf16
+// CHECK64: %0 = bitcast i64 %a to <4 x bfloat>
+// CHECK32: %0 = bitcast i64 %a to <4 x bfloat>
+
+bfloat16x4_t test_vdup_n_bf16(bfloat16_t v) {
+  return vdup_n_bf16(v);
+}
+// CHECK-LABEL: test_vdup_n_bf16
+// CHECK64: %vecinit.i = insertelement <4 x bfloat> undef, bfloat %v, i32 0
+// CHECK32: %vecinit.i = insertelement <4 x bfloat> undef, bfloat %v, i32 0
+// CHECK: %vecinit{{.*}} = shufflevector <4 x bfloat> %vecinit.i, <4 x bfloat> undef, <4 x i32> zeroinitializer
+
+bfloat16x8_t test_vdupq_n_bf16(bfloat16_t v) {
+  return vdupq_n_bf16(v);
+}
+// CHECK-LABEL: test_vdupq_n_bf16
+// CHECK64: %vecinit.i = insertelement <8 x bfloat> undef, bfloat %v, i32 0
+// CHECK32: %vecinit.i = insertelement <8 x bfloat> undef, bfloat %v, i32 0
+// CHECK:   %vecinit{{.*}} = shufflevector <8 x bfloat> %vecinit.i, <8 x bfloat> undef, <8 x i32> zeroinitializer
+
+bfloat16x4_t test_vdup_lane_bf16(bfloat16x4_t v) {
+  return vdup_lane_bf16(v, 1);
+}
+// CHECK-LABEL: test_vdup_lane_bf16
+// CHECK64: %lane = shufflevector <4 x bfloat> %v, <4 x bfloat> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
+// CHECK32: %lane = shufflevector <4 x bfloat> %v, <4 x bfloat> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
+
+bfloat16x8_t test_vdupq_lane_bf16(bfloat16x4_t v) {
+  return vdupq_lane_bf16(v, 1);
+}
+// CHECK-LABEL: test_vdupq_lane_bf16
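+// CHECK64: %lane = shufflevector <4 x bfloat> %v, <4 x bfloat> undef, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
+// CHECK32: %lane = shufflevector <4 x bfloat> %v, <4 x bfloat> undef, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>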
+
+bfloat16x4_t test_vdup_laneq_bf16(bfloat16x8_t v) {
+  return vdup_laneq_bf16(v, 7);
+}
+// CHECK-LABEL: test_vdup_laneq_bf16
+// CHECK64: %lane = shufflevector <8 x bfloat> %v, <8 x bfloat> undef, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
+// CHECK32: %lane = shufflevector <8 x bfloat> %v, <8 x bfloat> undef, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
+
+bfloat16x8_t test_vdupq_laneq_bf16(bfloat16x8_t v) {
+  return vdupq_laneq_bf16(v, 7);
+}
+// CHECK-LABEL: test_vdupq_laneq_bf16
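+// CHECK64: %lane = shufflevector <8 x bfloat> %v, <8 x bfloat> undef, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
+// CHECK32: %lane = shufflevector <8 x bfloat> %v, <8 x bfloat> undef, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>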
+
+bfloat16x8_t test_vcombine_bf16(bfloat16x4_t low, bfloat16x4_t high) {
+  return vcombine_bf16(low, high);
+}
+// CHECK-LABEL: test_vcombine_bf16
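+// CHECK64: %shuffle.i = shufflevector <4 x bfloat> %low, <4 x bfloat> %high, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+// CHECK32: %shuffle.i = shufflevector <4 x bfloat> %low, <4 x bfloat> %high, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>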
+
+bfloat16x4_t test_vget_high_bf16(bfloat16x8_t a) {
+  return vget_high_bf16(a);
+}
+// CHECK-LABEL: test_vget_high_bf16
+// CHECK64: %shuffle.i = shufflevector <8 x bfloat> %a, <8 x bfloat> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+// CHECK32: %shuffle.i = shufflevector <8 x bfloat> %a, <8 x bfloat> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+
+bfloat16x4_t test_vget_low_bf16(bfloat16x8_t a) {
+  return vget_low_bf16(a);
+}
+// CHECK-LABEL: test_vget_low_bf16
+// CHECK64: %shuffle.i = shufflevector <8 x bfloat> %a, <8 x bfloat> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+// CHECK32: %shuffle.i = shufflevector <8 x bfloat> %a, <8 x bfloat> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+
+bfloat16_t test_vget_lane_bf16(bfloat16x4_t v) {
+  return vget_lane_bf16(v, 1);
+}
+// CHECK-LABEL: test_vget_lane_bf16
+// CHECK64: %vget_lane = extractelement <4 x bfloat> %v, i32 1
+// CHECK32: %vget_lane = extractelement <4 x bfloat> %v, i32 1
+
+bfloat16_t test_vgetq_lane_bf16(bfloat16x8_t v) {
+  return vgetq_lane_bf16(v, 7);
+}
+// CHECK-LABEL: test_vgetq_lane_bf16
+// CHECK64: %vgetq_lane = extractelement <8 x bfloat> %v, i32 7
+// CHECK32: %vget_lane = extractelement <8 x bfloat> %v, i32 7
+
+bfloat16x4_t test_vset_lane_bf16(bfloat16_t a, bfloat16x4_t v) {
+  return vset_lane_bf16(a, v, 1);
+}
+// CHECK-LABEL: test_vset_lane_bf16
+// CHECK64: %vset_lane = insertelement <4 x bfloat> %v, bfloat %a, i32 1
+// CHECK32: %vset_lane = insertelement <4 x bfloat> %v, bfloat %a, i32 1
+
+bfloat16x8_t 

[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-13 Thread Luke Geeson via Phabricator via cfe-commits
LukeGeeson added a comment.

I was an author for part of this patch. Please add all authors as a list of 
authors to this commit message. Thanks!

As an aside, it would be worth doing this for all the patches in this series.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-13 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c:119-120
+// CHECK-LABEL: test_vduph_laneq_bf16
+// CHECK64: %vgetq_lane = extractelement <8 x bfloat> %v, i32 7
+// CHECK32: %vget_lane = extractelement <8 x bfloat> %v, i32 7

This seems to be the only place where you need to differentiate between check32 
and check64, and I am not 100% sure the extra `q` in the name of the variable 
is relevant in terms of codegen testing.

Maybe you can just test both aarch32 and aarch64 with the same `CHECK` prefix?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710


