from:"Thomas Lively via Phabricator via cfe\-commits"

[PATCH] D158409: [WebAssembly] Add multiple memories feature

2023-08-21 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG86ed8cb8fabe: [WebAssembly] Add multiple memories feature 
(authored by ashleynh, committed by tlively).

Changed prior to commit:
  https://reviews.llvm.org/D158409?vs=552097=552131#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158409/new/

https://reviews.llvm.org/D158409

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Basic/Targets/WebAssembly.cpp
  clang/lib/Basic/Targets/WebAssembly.h
  clang/test/Driver/wasm-features.c
  clang/test/Preprocessor/wasm-target-features.c
  llvm/lib/Target/WebAssembly/WebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
  llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h

Index: llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h
===
--- llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h
+++ llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h
@@ -49,6 +49,7 @@
   bool HasTailCall = false;
   bool HasReferenceTypes = false;
   bool HasExtendedConst = false;
+  bool HasMultiMemory = false;
 
   /// What processor and OS we're targeting.
   Triple TargetTriple;
@@ -101,6 +102,7 @@
   bool hasMutableGlobals() const { return HasMutableGlobals; }
   bool hasTailCall() const { return HasTailCall; }
   bool hasReferenceTypes() const { return HasReferenceTypes; }
+  bool hasMultiMemory() const { return HasMultiMemory; }
 
   /// Parses features string setting specified subtarget options. Definition of
   /// function is auto generated by tblgen.
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
@@ -70,6 +70,10 @@
 Predicate<"Subtarget->hasExtendedConst()">,
 AssemblerPredicate<(all_of FeatureExtendedConst), "extended-const">;
 
+def HasMultiMemory :
+Predicate<"Subtarget->hasMultiMemory()">,
+AssemblerPredicate<(all_of FeatureMultiMemory), "multimemory">;
+
 //===--===//
 // WebAssembly-specific DAG Node Types.
 //===--===//
Index: llvm/lib/Target/WebAssembly/WebAssembly.td
===
--- llvm/lib/Target/WebAssembly/WebAssembly.td
+++ llvm/lib/Target/WebAssembly/WebAssembly.td
@@ -71,6 +71,10 @@
   SubtargetFeature<"extended-const", "HasExtendedConst", "true",
"Enable extended const expressions">;
 
+def FeatureMultiMemory :
+  SubtargetFeature<"multimemory", "HasMultiMemory", "true",
+   "Enable multiple memories">;
+
 //===--===//
 // Architectures.
 //===--===//
Index: clang/test/Preprocessor/wasm-target-features.c
===
--- clang/test/Preprocessor/wasm-target-features.c
+++ clang/test/Preprocessor/wasm-target-features.c
@@ -114,6 +114,16 @@
 // RUN:   | FileCheck %s -check-prefix=EXTENDED-CONST
 //
 // EXTENDED-CONST:#define __wasm_extended_const__ 1{{$}}
+//
+// RUN: %clang -E -dM %s -o - 2>&1 \
+// RUN: -target wasm32-unknown-unknown -mmultimemory \
+// RUN:   | FileCheck %s -check-prefix=MULTIMEMORY
+// RUN: %clang -E -dM %s -o - 2>&1 \
+// RUN: -target wasm64-unknown-unknown -mmultimemory \
+// RUN:   | FileCheck %s -check-prefix=MULTIMEMORY
+//
+// MULTIMEMORY:#define __wasm_multimemory__ 1{{$}}
+//
 
 // RUN: %clang -E -dM %s -o - 2>&1 \
 // RUN: -target wasm32-unknown-unknown -mcpu=mvp \
@@ -133,6 +143,7 @@
 // MVP-NOT:#define __wasm_tail_call__
 // MVP-NOT:#define __wasm_reference_types__
 // MVP-NOT:#define __wasm_extended_const__
+// MVP-NOT:#define __wasm_multimemory__
 
 // RUN: %clang -E -dM %s -o - 2>&1 \
 // RUN: -target wasm32-unknown-unknown -mcpu=bleeding-edge \
@@ -148,6 +159,7 @@
 // BLEEDING-EDGE-DAG:#define __wasm_atomics__ 1{{$}}
 // BLEEDING-EDGE-DAG:#define __wasm_mutable_globals__ 1{{$}}
 // BLEEDING-EDGE-DAG:#define __wasm_tail_call__ 1{{$}}
+// BLEEDING-EDGE-DAG:#define __wasm_multimemory__ 1{{$}}
 // BLEEDING-EDGE-NOT:#define __wasm_unimplemented_simd128__ 1{{$}}
 // BLEEDING-EDGE-NOT:#define __wasm_exception_handling__ 1{{$}}
 // BLEEDING-EDGE-NOT:#define __wasm_multivalue__ 1{{$}}
Index: clang/test/Driver/wasm-features.c
===
--- clang/test/Driver/wasm-features.c
+++ clang/test/Driver/wasm-features.c
@@ -41,3 +41,12 @@
 // DEFAULT-NOT: "-target-feature" "-nontrapping-fptoint"
 // MVP-NOT: "-target-feature" "+nontrapping-fptoint"
 //

[PATCH] D158409: [WebAssembly] Add multiple memories feature

2023-08-21 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: clang/include/clang/Driver/Options.td:4583-4584
 def mno_extended_const : Flag<["-"], "mno-extended-const">, 
Group;
+def mmulti_memories : Flag<["-"], "mmulti-memories">, 
Group;
+def mno_multi_memories : Flag<["-"], "mno-multi-memories">, 
Group;
 def mexec_model_EQ : Joined<["-"], "mexec-model=">, 
Group,

aheejin wrote:
> aheejin wrote:
> > sbc100 wrote:
> > > tlively wrote:
> > > > Can we call this "multimemory" for consistency with "multivalue" above?
> > > How about just `multi_memory`?
> > > 
> > > In the past we have talked about "multi-table" and "multi-memory" without 
> > > using the plural here and the proposal itself is names using the singular 
> > > (https://github.com/WebAssembly/multi-memory).
> > I like `multi-memory` more, but I preferred `multi-value` too when it was 
> > introduced... 
> But if we are going to remove `-` here, we should treat it consistently in 
> all other places, for example, it should be not `HasMultiMemory` but 
> `HasMultimemory` in all code. We treat multivalue that way. Do we want that?
I prefer HasMultiMemory fwiw  


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158409/new/

https://reviews.llvm.org/D158409

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158409: [WebAssembly] Add multiple memories feature

2023-08-21 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

The other file to update here is clang/test/Driver/wasm-features.c. It looks 
like we haven't been as consistent about updating that one.




Comment at: clang/include/clang/Driver/Options.td:4583-4584
 def mno_extended_const : Flag<["-"], "mno-extended-const">, 
Group;
+def mmulti_memories : Flag<["-"], "mmulti-memories">, 
Group;
+def mno_multi_memories : Flag<["-"], "mno-multi-memories">, 
Group;
 def mexec_model_EQ : Joined<["-"], "mexec-model=">, 
Group,

Can we call this "multimemory" for consistency with "multivalue" above?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158409/new/

https://reviews.llvm.org/D158409

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151820: [clang][WebAssembly] Fix __BIGGEST_ALIGNMENT__ under emscripten

2023-06-06 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

In D151820#4385512 , @dschuff wrote:

> I seem to recall that @tlively and I spent a bunch of time with XNNpack 
> chasing down some kind of subtle error that I suspect had to do with 
> alignment, but maybe he remembers that better than I do.

Sorry, I have no recollection of this at all 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151820/new/

https://reviews.llvm.org/D151820

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150833: [WebAssembly] Add wasm_simd128.h intrinsics for relaxed SIMD

2023-05-18 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGc672c3fe05ad: [WebAssembly] Add wasm_simd128.h intrinsics 
for relaxed SIMD (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150833/new/

https://reviews.llvm.org/D150833

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/Headers/wasm_simd128.h
  cross-project-tests/intrinsic-header-tests/wasm_simd128.c

Index: cross-project-tests/intrinsic-header-tests/wasm_simd128.c
===
--- cross-project-tests/intrinsic-header-tests/wasm_simd128.c
+++ cross-project-tests/intrinsic-header-tests/wasm_simd128.c
@@ -1,7 +1,8 @@
 // REQUIRES: webassembly-registered-target
 // expected-no-diagnostics
 
-// RUN: %clang %s -O2 -S -o - -target wasm32-unknown-unknown -msimd128 -Wcast-qual -Werror | FileCheck %s
+// RUN: %clang %s -O2 -S -o - -target wasm32-unknown-unknown \
+// RUN: -msimd128 -mrelaxed-simd -Wcast-qual -Werror | FileCheck %s
 
 #include 
 
@@ -1264,3 +1265,123 @@
 v128_t test_i16x8_q15mulr_sat(v128_t a, v128_t b) {
   return wasm_i16x8_q15mulr_sat(a, b);
 }
+
+// CHECK-LABEL: test_f32x4_relaxed_madd:
+// CHECK: f32x4.relaxed_madd{{$}}
+v128_t test_f32x4_relaxed_madd(v128_t a, v128_t b, v128_t c) {
+  return wasm_f32x4_relaxed_madd(a, b, c);
+}
+
+// CHECK-LABEL: test_f32x4_relaxed_nmadd:
+// CHECK: f32x4.relaxed_nmadd{{$}}
+v128_t test_f32x4_relaxed_nmadd(v128_t a, v128_t b, v128_t c) {
+  return wasm_f32x4_relaxed_nmadd(a, b, c);
+}
+
+// CHECK-LABEL: test_f64x2_relaxed_madd:
+// CHECK: f64x2.relaxed_madd{{$}}
+v128_t test_f64x2_relaxed_madd(v128_t a, v128_t b, v128_t c) {
+  return wasm_f64x2_relaxed_madd(a, b, c);
+}
+
+// CHECK-LABEL: test_f64x2_relaxed_nmadd:
+// CHECK: f64x2.relaxed_nmadd{{$}}
+v128_t test_f64x2_relaxed_nmadd(v128_t a, v128_t b, v128_t c) {
+  return wasm_f64x2_relaxed_nmadd(a, b, c);
+}
+
+// CHECK-LABEL: test_i8x16_relaxed_laneselect:
+// CHECK: i8x16.relaxed_laneselect{{$}}
+v128_t test_i8x16_relaxed_laneselect(v128_t a, v128_t b, v128_t m) {
+  return wasm_i8x16_relaxed_laneselect(a, b, m);
+}
+
+// CHECK-LABEL: test_i16x8_relaxed_laneselect:
+// CHECK: i16x8.relaxed_laneselect{{$}}
+v128_t test_i16x8_relaxed_laneselect(v128_t a, v128_t b, v128_t m) {
+  return wasm_i16x8_relaxed_laneselect(a, b, m);
+}
+
+// CHECK-LABEL: test_i32x4_relaxed_laneselect:
+// CHECK: i32x4.relaxed_laneselect{{$}}
+v128_t test_i32x4_relaxed_laneselect(v128_t a, v128_t b, v128_t m) {
+  return wasm_i32x4_relaxed_laneselect(a, b, m);
+}
+
+// CHECK-LABEL: test_i64x2_relaxed_laneselect:
+// CHECK: i64x2.relaxed_laneselect{{$}}
+v128_t test_i64x2_relaxed_laneselect(v128_t a, v128_t b, v128_t m) {
+  return wasm_i64x2_relaxed_laneselect(a, b, m);
+}
+
+// CHECK-LABEL: test_i8x16_relaxed_swizzle:
+// CHECK: i8x16.relaxed_swizzle{{$}}
+v128_t test_i8x16_relaxed_swizzle(v128_t a, v128_t s) {
+  return wasm_i8x16_relaxed_swizzle(a, s);
+}
+
+// CHECK-LABEL: test_f32x4_relaxed_min:
+// CHECK: f32x4.relaxed_min{{$}}
+v128_t test_f32x4_relaxed_min(v128_t a, v128_t b) {
+  return wasm_f32x4_relaxed_min(a, b);
+}
+
+// CHECK-LABEL: test_f32x4_relaxed_max:
+// CHECK: f32x4.relaxed_max{{$}}
+v128_t test_f32x4_relaxed_max(v128_t a, v128_t b) {
+  return wasm_f32x4_relaxed_max(a, b);
+}
+
+// CHECK-LABEL: test_f64x2_relaxed_min:
+// CHECK: f64x2.relaxed_min{{$}}
+v128_t test_f64x2_relaxed_min(v128_t a, v128_t b) {
+  return wasm_f64x2_relaxed_min(a, b);
+}
+
+// CHECK-LABEL: test_f64x2_relaxed_max:
+// CHECK: f64x2.relaxed_max
+v128_t test_f64x2_relaxed_max(v128_t a, v128_t b) {
+  return wasm_f64x2_relaxed_max(a, b);
+}
+
+// CHECK-LABEL: test_i32x4_relaxed_trunc_f32x4:
+// CHECK: i32x4.relaxed_trunc_f32x4_s{{$}}
+v128_t test_i32x4_relaxed_trunc_f32x4(v128_t a) {
+  return wasm_i32x4_relaxed_trunc_f32x4(a);
+}
+
+// CHECK-LABEL: test_u32x4_relaxed_trunc_f32x4:
+// CHECK: i32x4.relaxed_trunc_f32x4_u{{$}}
+v128_t test_u32x4_relaxed_trunc_f32x4(v128_t a) {
+  return wasm_u32x4_relaxed_trunc_f32x4(a);
+}
+
+// CHECK-LABEL: test_i32x4_relaxed_trunc_f64x2_zero:
+// CHECK: i32x4.relaxed_trunc_f64x2_s_zero{{$}}
+v128_t test_i32x4_relaxed_trunc_f64x2_zero(v128_t a) {
+  return wasm_i32x4_relaxed_trunc_f64x2_zero(a);
+}
+
+// CHECK-LABEL: test_u32x4_relaxed_trunc_f64x2_zero:
+// CHECK: i32x4.relaxed_trunc_f64x2_u_zero{{$}}
+v128_t test_u32x4_relaxed_trunc_f64x2_zero(v128_t a) {
+  return wasm_u32x4_relaxed_trunc_f64x2_zero(a);
+}
+
+// CHECK-LABEL: test_i16x8_relaxed_q15mulr:
+// CHECK: i16x8.relaxed_q15mulr_s{{$}}
+v128_t test_i16x8_relaxed_q15mulr(v128_t a, v128_t b) {
+  return wasm_i16x8_relaxed_q15mulr(a, b);
+}
+
+// CHECK-LABEL: test_i16x8_relaxed_dot_i8x16_i7x16:
+// CHECK: i16x8.relaxed_dot_i8x16_i7x16_s{{$}}
+v128_t test_i16x8_relaxed_dot_i8x16_i7x16(v128_t a, v128_t b) {
+  return wasm_i16x8_relaxed_dot_i8x16_i7x16(a, b);
+}
+
+// CHECK-LABEL: test_i32x4_relaxed_dot_i8x16_i7x16_add:

[PATCH] D150833: [WebAssembly] Add wasm_simd128.h intrinsics for relaxed SIMD

2023-05-17 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: pmatos, asb, wingo, ecnelises, sunfish, 
jgravelle-google, sbc100.
Herald added a project: All.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Add user-friendly intrinsic functions for all relaxed SIMD instructions
alongside the existing SIMD128 intrinsic functions in wasm_simd128.h. Test that
the new instrinsics lower to the expected instructions in the existing
cross-project-tests test file.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D150833

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/Headers/wasm_simd128.h
  cross-project-tests/intrinsic-header-tests/wasm_simd128.c

Index: cross-project-tests/intrinsic-header-tests/wasm_simd128.c
===
--- cross-project-tests/intrinsic-header-tests/wasm_simd128.c
+++ cross-project-tests/intrinsic-header-tests/wasm_simd128.c
@@ -1,7 +1,8 @@
 // REQUIRES: webassembly-registered-target
 // expected-no-diagnostics
 
-// RUN: %clang %s -O2 -S -o - -target wasm32-unknown-unknown -msimd128 -Wcast-qual -Werror | FileCheck %s
+// RUN: %clang %s -O2 -S -o - -target wasm32-unknown-unknown \
+// RUN: -msimd128 -mrelaxed-simd -Wcast-qual -Werror | FileCheck %s
 
 #include 
 
@@ -1264,3 +1265,123 @@
 v128_t test_i16x8_q15mulr_sat(v128_t a, v128_t b) {
   return wasm_i16x8_q15mulr_sat(a, b);
 }
+
+// CHECK-LABEL: test_f32x4_relaxed_madd:
+// CHECK: f32x4.relaxed_madd{{$}}
+v128_t test_f32x4_relaxed_madd(v128_t a, v128_t b, v128_t c) {
+  return wasm_f32x4_relaxed_madd(a, b, c);
+}
+
+// CHECK-LABEL: test_f32x4_relaxed_nmadd:
+// CHECK: f32x4.relaxed_nmadd{{$}}
+v128_t test_f32x4_relaxed_nmadd(v128_t a, v128_t b, v128_t c) {
+  return wasm_f32x4_relaxed_nmadd(a, b, c);
+}
+
+// CHECK-LABEL: test_f64x2_relaxed_madd:
+// CHECK: f64x2.relaxed_madd{{$}}
+v128_t test_f64x2_relaxed_madd(v128_t a, v128_t b, v128_t c) {
+  return wasm_f64x2_relaxed_madd(a, b, c);
+}
+
+// CHECK-LABEL: test_f64x2_relaxed_nmadd:
+// CHECK: f64x2.relaxed_nmadd{{$}}
+v128_t test_f64x2_relaxed_nmadd(v128_t a, v128_t b, v128_t c) {
+  return wasm_f64x2_relaxed_nmadd(a, b, c);
+}
+
+// CHECK-LABEL: test_i8x16_relaxed_laneselect:
+// CHECK: i8x16.relaxed_laneselect{{$}}
+v128_t test_i8x16_relaxed_laneselect(v128_t a, v128_t b, v128_t m) {
+  return wasm_i8x16_relaxed_laneselect(a, b, m);
+}
+
+// CHECK-LABEL: test_i16x8_relaxed_laneselect:
+// CHECK: i16x8.relaxed_laneselect{{$}}
+v128_t test_i16x8_relaxed_laneselect(v128_t a, v128_t b, v128_t m) {
+  return wasm_i16x8_relaxed_laneselect(a, b, m);
+}
+
+// CHECK-LABEL: test_i32x4_relaxed_laneselect:
+// CHECK: i32x4.relaxed_laneselect{{$}}
+v128_t test_i32x4_relaxed_laneselect(v128_t a, v128_t b, v128_t m) {
+  return wasm_i32x4_relaxed_laneselect(a, b, m);
+}
+
+// CHECK-LABEL: test_i64x2_relaxed_laneselect:
+// CHECK: i64x2.relaxed_laneselect{{$}}
+v128_t test_i64x2_relaxed_laneselect(v128_t a, v128_t b, v128_t m) {
+  return wasm_i64x2_relaxed_laneselect(a, b, m);
+}
+
+// CHECK-LABEL: test_i8x16_relaxed_swizzle:
+// CHECK: i8x16.relaxed_swizzle{{$}}
+v128_t test_i8x16_relaxed_swizzle(v128_t a, v128_t s) {
+  return wasm_i8x16_relaxed_swizzle(a, s);
+}
+
+// CHECK-LABEL: test_f32x4_relaxed_min:
+// CHECK: f32x4.relaxed_min{{$}}
+v128_t test_f32x4_relaxed_min(v128_t a, v128_t b) {
+  return wasm_f32x4_relaxed_min(a, b);
+}
+
+// CHECK-LABEL: test_f32x4_relaxed_max:
+// CHECK: f32x4.relaxed_max{{$}}
+v128_t test_f32x4_relaxed_max(v128_t a, v128_t b) {
+  return wasm_f32x4_relaxed_max(a, b);
+}
+
+// CHECK-LABEL: test_f64x2_relaxed_min:
+// CHECK: f64x2.relaxed_min{{$}}
+v128_t test_f64x2_relaxed_min(v128_t a, v128_t b) {
+  return wasm_f64x2_relaxed_min(a, b);
+}
+
+// CHECK-LABEL: test_f64x2_relaxed_max:
+// CHECK: f64x2.relaxed_max
+v128_t test_f64x2_relaxed_max(v128_t a, v128_t b) {
+  return wasm_f64x2_relaxed_max(a, b);
+}
+
+// CHECK-LABEL: test_i32x4_relaxed_trunc_f32x4:
+// CHECK: i32x4.relaxed_trunc_f32x4_s{{$}}
+v128_t test_i32x4_relaxed_trunc_f32x4(v128_t a) {
+  return wasm_i32x4_relaxed_trunc_f32x4(a);
+}
+
+// CHECK-LABEL: test_u32x4_relaxed_trunc_f32x4:
+// CHECK: i32x4.relaxed_trunc_f32x4_u{{$}}
+v128_t test_u32x4_relaxed_trunc_f32x4(v128_t a) {
+  return wasm_u32x4_relaxed_trunc_f32x4(a);
+}
+
+// CHECK-LABEL: test_i32x4_relaxed_trunc_f64x2_zero:
+// CHECK: i32x4.relaxed_trunc_f64x2_s_zero{{$}}
+v128_t test_i32x4_relaxed_trunc_f64x2_zero(v128_t a) {
+  return wasm_i32x4_relaxed_trunc_f64x2_zero(a);
+}
+
+// CHECK-LABEL: test_u32x4_relaxed_trunc_f64x2_zero:
+// CHECK: i32x4.relaxed_trunc_f64x2_u_zero{{$}}
+v128_t test_u32x4_relaxed_trunc_f64x2_zero(v128_t a) {
+  return wasm_u32x4_relaxed_trunc_f64x2_zero(a);
+}
+
+// CHECK-LABEL: test_i16x8_relaxed_q15mulr:
+// CHECK: i16x8.relaxed_q15mulr_s{{$}}
+v128_t test_i16x8_relaxed_q15mulr(v128_t a, v128_t b) {
+  return

[PATCH] D139010: [clang][WebAssembly] Implement support for table types and builtins

2023-05-17 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

It looks like the LLVM-side changes are generally moving Wasm type 
classification functions to a more global location. Since no other backend 
should care about these things, it would be better if we could get away without 
these changes.




Comment at: llvm/include/llvm/IR/Type.h:23
 #include "llvm/Support/TypeSize.h"
+#include "llvm/CodeGen/WasmAddressSpaces.h"
 #include 

I don't know if things in IR are supposed to depend on things in CodeGen. Is 
there precedent here?



Comment at: llvm/include/llvm/IR/Type.h:220-232
+  bool isWebAssemblyReferenceType() const { return 
isWebAssemblyExternrefType() || isWebAssemblyFuncrefType(); }
+
+  /// Return true if this is a WebAssembly Externref Type.
+  bool isWebAssemblyExternrefType() const {
+return getPointerAddressSpace() ==
+ WebAssembly::WasmAddressSpace::WASM_ADDRESS_SPACE_EXTERNREF;
+  }

Do these need to be in Type.h? If not, it would be good to keep them in a more 
Wasm-specific location.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139010/new/

https://reviews.llvm.org/D139010

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D139010: [clang][WebAssembly] Implement support for table types and builtins

2023-05-15 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

Thanks for the ping, will take a look.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139010/new/

https://reviews.llvm.org/D139010

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D139010: [clang][WebAssembly] Implement support for table types and builtins

2023-02-28 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

In D139010#4158540 , @aaron.ballman 
wrote:

> Roping in @jfb because I know he's been heavily involve in WebAssembly in the 
> past and he may have ideas/opinions.
>
> High-level question: are externref, funcref, and table types all part of the 
> WebAssembly standard, or are these extensions? (I wasn't aware that 
> WebAssembly was a W3C standard or I'd have been asking this question much 
> earlier -- sorry for that! I'm asking in regards to #4 in 
> https://clang.llvm.org/get_involved.html#criteria)

Yes, these features are all present in the WebAssembly standard: 
https://www.w3.org/TR/2022/WD-wasm-core-2-20220419/syntax/types.html#table-types


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139010/new/

https://reviews.llvm.org/D139010

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D144169: [WebAssembly] Fix simd bit shift intrinsics codegen

2023-02-16 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

LGTM, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144169/new/

https://reviews.llvm.org/D144169

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D133574: [C2x] reject type definitions in offsetof

2023-01-13 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

FWIW I had to patch the musl used by Emscripten to work around this: 
https://github.com/emscripten-core/emscripten/pull/18510

The fix was simple and I don't have a strong opinion about the right way 
forward, but this change was slightly disruptive and I imagine other projects 
will have to work around it as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133574/new/

https://reviews.llvm.org/D133574

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D140773: [WebAssembly] Use `shufflevector` for shuffle

2023-01-05 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added inline comments.



Comment at: llvm/test/CodeGen/WebAssembly/simd-shuffle.ll:1-2
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs 
-disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals 
-wasm-keep-registers -mattr=+simd128,+relaxed-simd | FileCheck %s
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs 
-disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals 
-wasm-keep-registers -mattr=+simd128,+relaxed-simd -fast-isel | FileCheck %s
+

penzn wrote:
> Do we need both isel options? I felt bad about removing a test, but we don't 
> check anything specific to the first run line.
I think it's good to check with and without fast isel just as a sanity check 
that fast isel falls back to dag isel correctly, but that's also covered in 
other tests, so I don't feel strongly about keeping it in this file 
specifically. Either way is fine with me.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140773/new/

https://reviews.llvm.org/D140773

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D140773: [WebAssembly] Use `shufflevector` for shuffle

2023-01-03 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

LGTM % comment. Thanks for taking this!




Comment at: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll:159-160
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <16 x i8> @llvm.wasm.shuffle(
-  <16 x i8>, <16 x i8>, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32,
-  i32, i32, i32, i32, i32)
 define <16 x i8> @shuffle_v16i8(<16 x i8> %x, <16 x i8> %y) {
-  %res = call <16 x i8> @llvm.wasm.shuffle(<16 x i8> %x, <16 x i8> %y,
-  i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7,
-  i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 35)
+  %res = shufflevector <16 x i8> %x, <16 x i8> %y, <16 x i32> 
   ret <16 x i8> %res

Since this no longer tests codegen for an intrinsic function, could you move it 
to a separate test file? It could be named  something simple and short like 
`simd-shuffle.ll`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140773/new/

https://reviews.llvm.org/D140773

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D138249: [WebAssembly] Update relaxed-simd instruction names

2022-11-21 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGae96b5bd2dd0: [WebAssembly] Update relaxed-simd instruction 
names (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138249/new/

https://reviews.llvm.org/D138249

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -794,17 +794,17 @@
 # CHECK: i32x4.relaxed_trunc_f64x2_u_zero # encoding: [0xfd,0x84,0x02]
 i32x4.relaxed_trunc_f64x2_u_zero
 
-# CHECK: f32x4.relaxed_fma # encoding: [0xfd,0x85,0x02]
-f32x4.relaxed_fma
+# CHECK: f32x4.relaxed_madd # encoding: [0xfd,0x85,0x02]
+f32x4.relaxed_madd
 
-# CHECK: f32x4.relaxed_fms # encoding: [0xfd,0x86,0x02]
-f32x4.relaxed_fms
+# CHECK: f32x4.relaxed_nmadd # encoding: [0xfd,0x86,0x02]
+f32x4.relaxed_nmadd
 
-# CHECK: f64x2.relaxed_fma # encoding: [0xfd,0x87,0x02]
-f64x2.relaxed_fma
+# CHECK: f64x2.relaxed_madd # encoding: [0xfd,0x87,0x02]
+f64x2.relaxed_madd
 
-# CHECK: f64x2.relaxed_fms # encoding: [0xfd,0x88,0x02]
-f64x2.relaxed_fms
+# CHECK: f64x2.relaxed_nmadd # encoding: [0xfd,0x88,0x02]
+f64x2.relaxed_nmadd
 
 # CHECK: i8x16.relaxed_laneselect # encoding: [0xfd,0x89,0x02]
 i8x16.relaxed_laneselect
@@ -833,10 +833,10 @@
 # CHECK: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
 i16x8.relaxed_q15mulr_s
 
-# CHECK: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
-i16x8.dot_i8x16_i7x16_s
+# CHECK: i16x8.relaxed_dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
+i16x8.relaxed_dot_i8x16_i7x16_s
 
-# CHECK: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
-i32x4.dot_i8x16_i7x16_add_s
+# CHECK: i32x4.relaxed_dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+i32x4.relaxed_dot_i8x16_i7x16_add_s
 
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -184,9 +184,9 @@
 ; CHECK-NEXT: .functype laneselect_v16i8 (v128, v128, v128) -> (v128){{$}}
 ; CHECK-NEXT: i8x16.relaxed_laneselect $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <16 x i8> @llvm.wasm.laneselect.v16i8(<16 x i8>, <16 x i8>, <16 x i8>)
+declare <16 x i8> @llvm.wasm.relaxed.laneselect.v16i8(<16 x i8>, <16 x i8>, <16 x i8>)
 define <16 x i8> @laneselect_v16i8(<16 x i8> %a, <16 x i8> %b, <16 x i8> %c) {
-  %v = call <16 x i8> @llvm.wasm.laneselect.v16i8(
+  %v = call <16 x i8> @llvm.wasm.relaxed.laneselect.v16i8(
 <16 x i8> %a, <16 x i8> %b, <16 x i8> %c
   )
   ret <16 x i8> %v
@@ -360,9 +360,9 @@
 ; CHECK-NEXT: .functype laneselect_v8i16 (v128, v128, v128) -> (v128){{$}}
 ; CHECK-NEXT: i16x8.relaxed_laneselect $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.laneselect.v8i16(<8 x i16>, <8 x i16>, <8 x i16>)
+declare <8 x i16> @llvm.wasm.relaxed.laneselect.v8i16(<8 x i16>, <8 x i16>, <8 x i16>)
 define <8 x i16> @laneselect_v8i16(<8 x i16> %a, <8 x i16> %b, <8 x i16> %c) {
-  %v = call <8 x i16> @llvm.wasm.laneselect.v8i16(
+  %v = call <8 x i16> @llvm.wasm.relaxed.laneselect.v8i16(
 <8 x i16> %a, <8 x i16> %b, <8 x i16> %c
   )
   ret <8 x i16> %v
@@ -382,11 +382,11 @@
 
 ; CHECK-LABEL: dot_i8x16_i7x16_s_i16x8:
 ; CHECK-NEXT: .functype dot_i8x16_i7x16_s_i16x8 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: i16x8.relaxed_dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
+declare <8 x i16> @llvm.wasm.relaxed.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
 define <8 x i16> @dot_i8x16_i7x16_s_i16x8(<16 x i8> %a, <16 x i8> %b) {
-  %v = call <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(
+  %v = call <8 x i16> @llvm.wasm.relaxed.dot.i8x16.i7x16.signed(
 <16 x i8> %a, <16 x i8> %b
   )
   ret <8 x i16> %v
@@ -542,9 +542,9 @@
 ; CHECK-NEXT: .functype laneselect_v4i32 (v128, v128, v128) -> (v128){{$}}
 ; CHECK-NEXT: i32x4.relaxed_laneselect $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.laneselect.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)
+declare <4 x i32> @llvm.wasm.relaxed.laneselect.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)
 define

[PATCH] D138249: [WebAssembly] Update relaxed-simd instruction names

2022-11-18 Thread Thomas Lively via Phabricator via cfe-commits

tlively updated this revision to Diff 476490.
tlively added a comment.

- Fix type


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138249/new/

https://reviews.llvm.org/D138249

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -794,17 +794,17 @@
 # CHECK: i32x4.relaxed_trunc_f64x2_u_zero # encoding: [0xfd,0x84,0x02]
 i32x4.relaxed_trunc_f64x2_u_zero
 
-# CHECK: f32x4.relaxed_fma # encoding: [0xfd,0x85,0x02]
-f32x4.relaxed_fma
+# CHECK: f32x4.relaxed_madd # encoding: [0xfd,0x85,0x02]
+f32x4.relaxed_madd
 
-# CHECK: f32x4.relaxed_fms # encoding: [0xfd,0x86,0x02]
-f32x4.relaxed_fms
+# CHECK: f32x4.relaxed_nmadd # encoding: [0xfd,0x86,0x02]
+f32x4.relaxed_nmadd
 
-# CHECK: f64x2.relaxed_fma # encoding: [0xfd,0x87,0x02]
-f64x2.relaxed_fma
+# CHECK: f64x2.relaxed_madd # encoding: [0xfd,0x87,0x02]
+f64x2.relaxed_madd
 
-# CHECK: f64x2.relaxed_fms # encoding: [0xfd,0x88,0x02]
-f64x2.relaxed_fms
+# CHECK: f64x2.relaxed_nmadd # encoding: [0xfd,0x88,0x02]
+f64x2.relaxed_nmadd
 
 # CHECK: i8x16.relaxed_laneselect # encoding: [0xfd,0x89,0x02]
 i8x16.relaxed_laneselect
@@ -833,10 +833,10 @@
 # CHECK: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
 i16x8.relaxed_q15mulr_s
 
-# CHECK: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
-i16x8.dot_i8x16_i7x16_s
+# CHECK: i16x8.relaxed_dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
+i16x8.relaxed_dot_i8x16_i7x16_s
 
-# CHECK: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
-i32x4.dot_i8x16_i7x16_add_s
+# CHECK: i32x4.relaxed_dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+i32x4.relaxed_dot_i8x16_i7x16_add_s
 
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -184,9 +184,9 @@
 ; CHECK-NEXT: .functype laneselect_v16i8 (v128, v128, v128) -> (v128){{$}}
 ; CHECK-NEXT: i8x16.relaxed_laneselect $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <16 x i8> @llvm.wasm.laneselect.v16i8(<16 x i8>, <16 x i8>, <16 x i8>)
+declare <16 x i8> @llvm.wasm.relaxed.laneselect.v16i8(<16 x i8>, <16 x i8>, <16 x i8>)
 define <16 x i8> @laneselect_v16i8(<16 x i8> %a, <16 x i8> %b, <16 x i8> %c) {
-  %v = call <16 x i8> @llvm.wasm.laneselect.v16i8(
+  %v = call <16 x i8> @llvm.wasm.relaxed.laneselect.v16i8(
 <16 x i8> %a, <16 x i8> %b, <16 x i8> %c
   )
   ret <16 x i8> %v
@@ -360,9 +360,9 @@
 ; CHECK-NEXT: .functype laneselect_v8i16 (v128, v128, v128) -> (v128){{$}}
 ; CHECK-NEXT: i16x8.relaxed_laneselect $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.laneselect.v8i16(<8 x i16>, <8 x i16>, <8 x i16>)
+declare <8 x i16> @llvm.wasm.relaxed.laneselect.v8i16(<8 x i16>, <8 x i16>, <8 x i16>)
 define <8 x i16> @laneselect_v8i16(<8 x i16> %a, <8 x i16> %b, <8 x i16> %c) {
-  %v = call <8 x i16> @llvm.wasm.laneselect.v8i16(
+  %v = call <8 x i16> @llvm.wasm.relaxed.laneselect.v8i16(
 <8 x i16> %a, <8 x i16> %b, <8 x i16> %c
   )
   ret <8 x i16> %v
@@ -382,11 +382,11 @@
 
 ; CHECK-LABEL: dot_i8x16_i7x16_s_i16x8:
 ; CHECK-NEXT: .functype dot_i8x16_i7x16_s_i16x8 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: i16x8.relaxed_dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
+declare <8 x i16> @llvm.wasm.relaxed.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
 define <8 x i16> @dot_i8x16_i7x16_s_i16x8(<16 x i8> %a, <16 x i8> %b) {
-  %v = call <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(
+  %v = call <8 x i16> @llvm.wasm.relaxed.dot.i8x16.i7x16.signed(
 <16 x i8> %a, <16 x i8> %b
   )
   ret <8 x i16> %v
@@ -542,9 +542,9 @@
 ; CHECK-NEXT: .functype laneselect_v4i32 (v128, v128, v128) -> (v128){{$}}
 ; CHECK-NEXT: i32x4.relaxed_laneselect $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.laneselect.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)
+declare <4 x i32> @llvm.wasm.relaxed.laneselect.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)
 define <4 x i32> @laneselect_v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c) {
-  %v = call <4 x i32>

[PATCH] D138249: [WebAssembly] Update relaxed-simd instruction names

2022-11-18 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:180
 TARGET_BUILTIN(__builtin_wasm_relaxed_min_f64x2, "V2dV2dV2d", "nc", 
"relaxed-simd")
-TARGET_BUILTIN(__builtin_wasm_relaxed_max_f64x2, "V2dV2dV2d", "nc", 
"relaxed-simd")
+TARGET_BUILTIN(__builtin_wasm_relaxed_max_f64x2, "V2dV2dV2d", "nC", 
"relaxed-simd")
 

maratyszcza wrote:
> Typo?
Yep, thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138249/new/

https://reviews.llvm.org/D138249

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D138249: [WebAssembly] Update relaxed-simd instruction names

2022-11-17 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, maratyszcza.
Herald added subscribers: pmatos, asb, wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100, dschuff.
Herald added a project: All.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Including builtin and intrinsic names. These should be the final names for the
proposal.
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D138249

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -794,17 +794,17 @@
 # CHECK: i32x4.relaxed_trunc_f64x2_u_zero # encoding: [0xfd,0x84,0x02]
 i32x4.relaxed_trunc_f64x2_u_zero
 
-# CHECK: f32x4.relaxed_fma # encoding: [0xfd,0x85,0x02]
-f32x4.relaxed_fma
+# CHECK: f32x4.relaxed_madd # encoding: [0xfd,0x85,0x02]
+f32x4.relaxed_madd
 
-# CHECK: f32x4.relaxed_fms # encoding: [0xfd,0x86,0x02]
-f32x4.relaxed_fms
+# CHECK: f32x4.relaxed_nmadd # encoding: [0xfd,0x86,0x02]
+f32x4.relaxed_nmadd
 
-# CHECK: f64x2.relaxed_fma # encoding: [0xfd,0x87,0x02]
-f64x2.relaxed_fma
+# CHECK: f64x2.relaxed_madd # encoding: [0xfd,0x87,0x02]
+f64x2.relaxed_madd
 
-# CHECK: f64x2.relaxed_fms # encoding: [0xfd,0x88,0x02]
-f64x2.relaxed_fms
+# CHECK: f64x2.relaxed_nmadd # encoding: [0xfd,0x88,0x02]
+f64x2.relaxed_nmadd
 
 # CHECK: i8x16.relaxed_laneselect # encoding: [0xfd,0x89,0x02]
 i8x16.relaxed_laneselect
@@ -833,10 +833,10 @@
 # CHECK: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
 i16x8.relaxed_q15mulr_s
 
-# CHECK: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
-i16x8.dot_i8x16_i7x16_s
+# CHECK: i16x8.relaxed_dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
+i16x8.relaxed_dot_i8x16_i7x16_s
 
-# CHECK: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
-i32x4.dot_i8x16_i7x16_add_s
+# CHECK: i32x4.relaxed_dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+i32x4.relaxed_dot_i8x16_i7x16_add_s
 
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -184,9 +184,9 @@
 ; CHECK-NEXT: .functype laneselect_v16i8 (v128, v128, v128) -> (v128){{$}}
 ; CHECK-NEXT: i8x16.relaxed_laneselect $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <16 x i8> @llvm.wasm.laneselect.v16i8(<16 x i8>, <16 x i8>, <16 x i8>)
+declare <16 x i8> @llvm.wasm.relaxed.laneselect.v16i8(<16 x i8>, <16 x i8>, <16 x i8>)
 define <16 x i8> @laneselect_v16i8(<16 x i8> %a, <16 x i8> %b, <16 x i8> %c) {
-  %v = call <16 x i8> @llvm.wasm.laneselect.v16i8(
+  %v = call <16 x i8> @llvm.wasm.relaxed.laneselect.v16i8(
 <16 x i8> %a, <16 x i8> %b, <16 x i8> %c
   )
   ret <16 x i8> %v
@@ -360,9 +360,9 @@
 ; CHECK-NEXT: .functype laneselect_v8i16 (v128, v128, v128) -> (v128){{$}}
 ; CHECK-NEXT: i16x8.relaxed_laneselect $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.laneselect.v8i16(<8 x i16>, <8 x i16>, <8 x i16>)
+declare <8 x i16> @llvm.wasm.relaxed.laneselect.v8i16(<8 x i16>, <8 x i16>, <8 x i16>)
 define <8 x i16> @laneselect_v8i16(<8 x i16> %a, <8 x i16> %b, <8 x i16> %c) {
-  %v = call <8 x i16> @llvm.wasm.laneselect.v8i16(
+  %v = call <8 x i16> @llvm.wasm.relaxed.laneselect.v8i16(
 <8 x i16> %a, <8 x i16> %b, <8 x i16> %c
   )
   ret <8 x i16> %v
@@ -382,11 +382,11 @@
 
 ; CHECK-LABEL: dot_i8x16_i7x16_s_i16x8:
 ; CHECK-NEXT: .functype dot_i8x16_i7x16_s_i16x8 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: i16x8.relaxed_dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
+declare <8 x i16> @llvm.wasm.relaxed.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
 define <8 x i16> @dot_i8x16_i7x16_s_i16x8(<16 x i8> %a, <16 x i8> %b) {
-  %v = call <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(
+  %v = call <8 x i16> @llvm.wasm.relaxed.dot.i8x16.i7x16.signed(
 <16 x i8> %a, <16 x i8> %b
   )
   ret <8 x i16> %v
@@ -542,9 +542,9 @@
 ; CHECK-NEXT: .functype laneselect_v4i32 (v128, v128, v128) -> (v128){{$}}
 ;

[PATCH] D133428: [WebAssembly] Prototype `f32x4.relaxed_dot_bf16x8_add_f32`

2022-09-08 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGac3b8df8f2f4: [WebAssembly] Prototype 
`f32x4.relaxed_dot_bf16x8_add_f32` (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133428/new/

https://reviews.llvm.org/D133428

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -786,6 +786,20 @@
   ret <4 x float> %v
 }
 
+; CHECK-LABEL: relaxed_dot_bf16x8_add_f32:
+; CHECK-NEXT: .functype relaxed_dot_bf16x8_add_f32 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.relaxed_dot_bf16x8_add_f32 $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.relaxed.dot.bf16x8.add.f32(<8 x i16>, <8 x i16>,
+  <4 x float>)
+define <4 x float> @relaxed_dot_bf16x8_add_f32(<8 x i16> %a, <8 x i16> %b,
+   <4 x float> %c) {
+  %v = call <4 x float> @llvm.wasm.relaxed.dot.bf16x8.add.f32(
+<8 x i16> %a, <8 x i16> %b, <4 x float> %c
+  )
+  ret <4 x float> %v
+}
+
 ; ==
 ; 2 x f64
 ; ==
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1466,3 +1466,15 @@
(v16i8 V128:$lhs), (v16i8 V128:$rhs), (v4i32 V128:$acc)))],
 "i32x4.dot_i8x16_i7x16_add_s\t$dst, $lhs, $rhs, $acc",
 "i32x4.dot_i8x16_i7x16_add_s", 0x113>;
+
+//===--===//
+// Relaxed BFloat16 dot product
+//===--===//
+
+defm RELAXED_DOT_BFLOAT :
+  RELAXED_I<(outs V128:$dst), (ins V128:$lhs, V128:$rhs, V128:$acc),
+(outs), (ins),
+[(set (v4f32 V128:$dst), (int_wasm_relaxed_dot_bf16x8_add_f32
+   (v8i16 V128:$lhs), (v8i16 V128:$rhs), (v4f32 V128:$acc)))],
+"f32x4.relaxed_dot_bf16x8_add_f32\t$dst, $lhs, $rhs, $acc",
+"f32x4.relaxed_dot_bf16x8_add_f32", 0x114>;
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -285,6 +285,12 @@
 [llvm_v16i8_ty, llvm_v16i8_ty, llvm_v4i32_ty],
 [IntrNoMem, IntrSpeculatable]>;
 
+def int_wasm_relaxed_dot_bf16x8_add_f32:
+  Intrinsic<[llvm_v4f32_ty],
+[llvm_v8i16_ty, llvm_v8i16_ty, llvm_v4f32_ty],
+[IntrNoMem, IntrSpeculatable]>;
+
+
 //===--===//
 // Thread-local storage intrinsics
 //===--===//
Index: clang/test/CodeGen/builtins-wasm.c
===
--- clang/test/CodeGen/builtins-wasm.c
+++ clang/test/CodeGen/builtins-wasm.c
@@ -794,3 +794,10 @@
   // WEBASSEMBLY-SAME: <16 x i8> %a, <16 x i8> %b, <4 x i32> %c)
   // WEBASSEMBLY-NEXT: ret
 }
+
+f32x4 relaxed_dot_bf16x8_add_f32_f32x4(u16x8 a, u16x8 b, f32x4 c) {
+  return __builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4(a, b, c);
+  // WEBASSEMBLY: call <4 x float> @llvm.wasm.relaxed.dot.bf16x8.add.f32
+  // WEBASSEMBLY-SAME: <8 x i16> %a, <8 x i16> %b, <4 x float> %c)
+  // WEBASSEMBLY-NEXT: ret
+}
Index: clang/lib/CodeGen/CGBuiltin.cpp
===
--- clang/lib/CodeGen/CGBuiltin.cpp
+++ clang/lib/CodeGen/CGBuiltin.cpp
@@ -18870,6 +18870,14 @@
 CGM.getIntrinsic(Intrinsic::wasm_dot_i8x16_i7x16_add_signed);
 return Builder.CreateCall(Callee, {LHS, RHS, Acc});
   }
+  case WebAssembly::BI__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4: {
+Value *LHS = EmitScalarExpr(E->getArg(0));
+Value *RHS = EmitScalarExpr(E->getArg(1));
+Value *Acc = EmitScalarExpr(E->getArg(2));
+Function *Callee =
+CGM.getIntrinsic(Intrinsic::wasm_relaxed_dot_bf16x8_add_f32);
+return Builder.CreateCall(Callee, {LHS, RHS, Acc});
+  }
   default:
 return nullptr;
   }
Index: clang/include/clang/Basic/BuiltinsWebAssembly.def

[PATCH] D133428: [WebAssembly] Prototype `f32x4.relaxed_dot_bf16x8_add_f32`

2022-09-07 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added a reviewer: aheejin.
Herald added subscribers: pmatos, asb, wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100, dschuff.
Herald added a project: All.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

As proposed in https://github.com/WebAssembly/relaxed-simd/issues/77. Only an
LLVM intrinsic and a clang builtin are implemented. Since there is no bfloat16
type, use u16 to represent the bfloats in the builtin function arguments.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D133428

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -786,6 +786,20 @@
   ret <4 x float> %v
 }
 
+; CHECK-LABEL: relaxed_dot_bf16x8_add_f32:
+; CHECK-NEXT: .functype relaxed_dot_bf16x8_add_f32 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.relaxed_dot_bf16x8_add_f32 $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.relaxed.dot.bf16x8.add.f32(<8 x i16>, <8 x i16>,
+  <4 x float>)
+define <4 x float> @relaxed_dot_bf16x8_add_f32(<8 x i16> %a, <8 x i16> %b,
+   <4 x float> %c) {
+  %v = call <4 x float> @llvm.wasm.relaxed.dot.bf16x8.add.f32(
+<8 x i16> %a, <8 x i16> %b, <4 x float> %c
+  )
+  ret <4 x float> %v
+}
+
 ; ==
 ; 2 x f64
 ; ==
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1466,3 +1466,15 @@
(v16i8 V128:$lhs), (v16i8 V128:$rhs), (v4i32 V128:$acc)))],
 "i32x4.dot_i8x16_i7x16_add_s\t$dst, $lhs, $rhs, $acc",
 "i32x4.dot_i8x16_i7x16_add_s", 0x113>;
+
+//===--===//
+// Relaxed BFloat16 dot product
+//===--===//
+
+defm RELAXED_DOT_BFLOAT :
+  RELAXED_I<(outs V128:$dst), (ins V128:$lhs, V128:$rhs, V128:$acc),
+(outs), (ins),
+[(set (v4f32 V128:$dst), (int_wasm_relaxed_dot_bf16x8_add_f32
+   (v8i16 V128:$lhs), (v8i16 V128:$rhs), (v4f32 V128:$acc)))],
+"f32x4.relaxed_dot_bf16x8_add_f32\t$dst, $lhs, $rhs, $acc",
+"f32x4.relaxed_dot_bf16x8_add_f32", 0x114>;
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -285,6 +285,12 @@
 [llvm_v16i8_ty, llvm_v16i8_ty, llvm_v4i32_ty],
 [IntrNoMem, IntrSpeculatable]>;
 
+def int_wasm_relaxed_dot_bf16x8_add_f32:
+  Intrinsic<[llvm_v4f32_ty],
+[llvm_v8i16_ty, llvm_v8i16_ty, llvm_v4f32_ty],
+[IntrNoMem, IntrSpeculatable]>;
+
+
 //===--===//
 // Thread-local storage intrinsics
 //===--===//
Index: clang/test/CodeGen/builtins-wasm.c
===
--- clang/test/CodeGen/builtins-wasm.c
+++ clang/test/CodeGen/builtins-wasm.c
@@ -794,3 +794,10 @@
   // WEBASSEMBLY-SAME: <16 x i8> %a, <16 x i8> %b, <4 x i32> %c)
   // WEBASSEMBLY-NEXT: ret
 }
+
+f32x4 relaxed_dot_bf16x8_add_f32_f32x4(u16x8 a, u16x8 b, f32x4 c) {
+  return __builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4(a, b, c);
+  // WEBASSEMBLY: call <4 x float> @llvm.wasm.relaxed.dot.bf16x8.add.f32
+  // WEBASSEMBLY-SAME: <8 x i16> %a, <8 x i16> %b, <4 x float> %c)
+  // WEBASSEMBLY-NEXT: ret
+}
Index: clang/lib/CodeGen/CGBuiltin.cpp
===
--- clang/lib/CodeGen/CGBuiltin.cpp
+++ clang/lib/CodeGen/CGBuiltin.cpp
@@ -18868,6 +18868,14 @@
 CGM.getIntrinsic(Intrinsic::wasm_dot_i8x16_i7x16_add_signed);
 return Builder.CreateCall(Callee, {LHS, RHS, Acc});
   }
+  case WebAssembly::BI__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4: {
+Value *LHS = EmitScalarExpr(E->getArg(0));
+Value *RHS = EmitScalarExpr(E->getArg(1));
+

[PATCH] D128440: [WebAssembly] Initial support for reference type funcref in clang

2022-08-16 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

It would be good to add tests for more error conditions like in the externref 
patch and also additional errors like trying to apply __funcref to types that 
aren't function pointers.




Comment at: clang/include/clang/Basic/Attr.td:4053
+  let Spellings = [Keyword<"__funcref">];
+  let Documentation = [Undocumented];
+}

It would be good to document this!



Comment at: clang/lib/Sema/SemaChecking.cpp:4525
+  case WebAssembly::BI__builtin_wasm_ref_null_func:
+return (SemaBuiltinWasmRefNullFunc(TheCall));
+  }

Are the extra parentheses meaningful here?



Comment at: clang/lib/Sema/SemaChecking.cpp:6552-6555
+  // The call we get looks like
+  // CallExpr
+  // `- ImplicitCastExpr
+  //   `- DeclRefExpr

How do these parts correspond to the original source code?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128440/new/

https://reviews.llvm.org/D128440

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122215: [WebAssembly] Initial support for reference type externref in clang

2022-08-16 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

Very nice! This LGTM with all these small comments addressed. Sorry for the 
long delay in reviewing.




Comment at: clang/include/clang/AST/ASTContext.h:1149
 #include "clang/Basic/RISCVVTypes.def"
+#define WASM_TYPE(Name, Id, SingletonId) CanQualType SingletonId;
+#include "clang/Basic/WebAssemblyReferenceTypes.def"

I have no idea what's going on here in the code, but it seems that the existing 
convention is to put `CanQualType SingletonId;` on a separate line.



Comment at: clang/include/clang/AST/Type.h:1973
+  bool isWebAssemblyReferenceType() const;
+  bool isWebAssemblyExternrefType() const;
   /// Determines if this is a sizeless type supported by the

Looks like there should be a newline here.



Comment at: clang/include/clang/Basic/AddressSpaces.h:59-60
 
+  wasm_var,
+  wasm_externref,
   // This denotes the count of language-specific address spaces and also

What is `wasm_var`? It would be good to have a short comment here (and newline 
afterward).



Comment at: clang/include/clang/Basic/Builtins.def:50
 //  p -> pid_t
+//  e -> wasm externref
 //  . -> "...".  This may only occur at the end of the function list.





Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:193
+// Reference Types builtins
+// Some builtins are polymorphic - see 't' as part of the third argument,
+// in which case the argument spec (second argument) is unused.

Looks like this comment about polymorphism is out of date. (Also it would be 
good to add a newline after this.)



Comment at: clang/include/clang/Basic/WebAssemblyReferenceTypes.def:9
+//
+//  This file defines externref_t, funcref_t, and the like.  The macros are:
+//

Maybe we should only mention `externref_t` for now. 



Comment at: clang/lib/AST/ASTContext.cpp:2258
+Width = 0; 
\
+Align = 8; /* ? */ 
\
+break;

I assume things will break if you say 0 here, but would 1 work?



Comment at: clang/lib/AST/ASTContext.cpp:3984
+QualType ASTContext::getExternrefType() const {
+  if (Target->hasFeature("reference-types")) {
+#define WASM_REF_TYPE(Name, MangledName, Id, SingletonId, AS)  
\

Do we need `Target.getTriple().isWasm()` here as well?



Comment at: clang/lib/AST/ItaniumMangle.cpp:3147
+type_name = MangledName;   
\
+Out << (type_name == InternalName ? "u" : "") << type_name.size()  
\
+<< type_name;  
\

Our `MangledName` is not the same as our `InternalName`, so it looks like this 
condition will never be true. Should be follow the simpler pattern from the 
previous two targets instead?



Comment at: clang/lib/Basic/Targets/DirectX.h:44-45
+0, // ptr64
+1, // wasm_var
+10,// wasm_externref
 };

What are these for? I'm surprised we need to do anything here in the DirectX 
target. Same for the similar lines in other targets.



Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:806-807
+  SingletonId =
\
+  DBuilder.createForwardDecl(llvm::dwarf::DW_TAG_structure_type,   
\
+ MangledName, TheCU, TheCU->getFile(), 0); 
\
+return SingletonId;
\

How did you choose this?



Comment at: clang/lib/CodeGen/TargetInfo.h:353
+  /// Return the WebAssembly externref reference type.
+  virtual llvm::Type *getWasmExternrefReferenceType() const { return nullptr; }
   /// Emit the device-side copy of the builtin surface type.

missing whitespace.



Comment at: clang/test/CodeGen/WebAssembly/wasm-externref.c:11-13
+externref_t get_null() {
+  return __builtin_wasm_ref_null_extern();
+}

Do we need this here since the builtin is also tested in builtins-wasm.c? Are 
there more ways to use `externref_t` that we should test here?



Comment at: clang/test/Sema/wasm-refs-and-tables.c:3
+
+// Note: As WebAssembly references are sizeless types, we don't exhaustively
+// test for cases covered by sizeless-1.c and similar tests.

Should this file be just `wasm-refs.c` since it doesn't do anything with tables 
yet? Same for the next one.



Comment at: clang/test/SemaTemplate/address_space-dependent.cpp:46
 void tooBig() {
-  __attribute__((address_space(I))) int *bounds; // expected-error {{address 
space is larger than the

[PATCH] D131990: [DRAFT][WebAssembly] Do not support `[[clang::musttail]]` by default

2022-08-16 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

For reference, the existing error from the backend is something like this:

  test.cpp:9:5: error: WebAssembly 'tail-call' feature not enabled
  int foo(int x) {
  ^

Note that it points to the beginning of the callee rather than the specific 
line containing the tail call.

Ideally we would get the current fatal error and diagnostic message with the 
more specific context from this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131990/new/

https://reviews.llvm.org/D131990

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D128440: [WebAssembly] Initial support for reference type funcref in clang

2022-08-16 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

Oops, sorry about that. Will take a look.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128440/new/

https://reviews.llvm.org/D128440

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D131990: [DRAFT][WebAssembly] Do not support `[[clang::musttail]]` by default

2022-08-16 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
Herald added subscribers: pmatos, wingo, ecnelises, sunfish, jgravelle-google, 
sbc100, dschuff.
Herald added a reviewer: aaron.ballman.
Herald added a project: All.
tlively requested review of this revision.
Herald added subscribers: cfe-commits, aheejin.
Herald added a project: clang.

WebAssembly is not able to emit tail calls unless the `tail-call` target feature
is enabled and when it is not enabled, trying to compile musttail calls produces
a fatal error in the backend. To reflect this reality, disable support for the
`[[clang::musttail]]` attribute when targeting WebAssembly without the
`tail-call` feature.

Marked draft for further discussion because I'm not sure getting this:

  test.cpp:10:7: warning: unknown attribute 'musttail' ignored 
[-Wunknown-attributes]
  [[clang::musttail]] return bar(x * 10);

is actually better developer experience than getting a fatal error with a
description of the WebAssembly-specific problem. Users can also check for the
presence of the `__wasm_tail_call__` macro as an alternative to checking
`__has_cpp_attribute(clang::musttail)` with this patch, but I'm not sure that's
documented anywhere.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D131990

Files:
  clang/include/clang/Basic/Attr.td


Index: clang/include/clang/Basic/Attr.td
===
--- clang/include/clang/Basic/Attr.td
+++ clang/include/clang/Basic/Attr.td
@@ -414,6 +414,11 @@
 def TargetSupportsInitPriority : TargetSpec {
   let CustomCode = [{ !Target.getTriple().isOSzOS() }];
 }
+
+def TargetSupportsMustTail : TargetSpec {
+  let CustomCode = [{ !Target.getTriple().isWasm() || 
Target.hasFeature("tail-call") }];
+}
+
 // Attribute subject match rules that are used for #pragma clang attribute.
 //
 // A instance of AttrSubjectMatcherRule represents an individual match rule.
@@ -1433,7 +1438,7 @@
   let SimpleHandler = 1;
 }
 
-def MustTail : StmtAttr {
+def MustTail : StmtAttr, TargetSpecificAttr {
   let Spellings = [Clang<"musttail">];
   let Documentation = [MustTailDocs];
   let Subjects = SubjectList<[ReturnStmt], ErrorDiag, "return statements">;


Index: clang/include/clang/Basic/Attr.td
===
--- clang/include/clang/Basic/Attr.td
+++ clang/include/clang/Basic/Attr.td
@@ -414,6 +414,11 @@
 def TargetSupportsInitPriority : TargetSpec {
   let CustomCode = [{ !Target.getTriple().isOSzOS() }];
 }
+
+def TargetSupportsMustTail : TargetSpec {
+  let CustomCode = [{ !Target.getTriple().isWasm() || Target.hasFeature("tail-call") }];
+}
+
 // Attribute subject match rules that are used for #pragma clang attribute.
 //
 // A instance of AttrSubjectMatcherRule represents an individual match rule.
@@ -1433,7 +1438,7 @@
   let SimpleHandler = 1;
 }
 
-def MustTail : StmtAttr {
+def MustTail : StmtAttr, TargetSpecificAttr {
   let Spellings = [Clang<"musttail">];
   let Documentation = [MustTailDocs];
   let Subjects = SubjectList<[ReturnStmt], ErrorDiag, "return statements">;
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D128282: [WebAssembly] Update test to run it in opaque pointers mode

2022-06-22 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

In D128282#3601100 , @asb wrote:

> In D128282#3600355 , @tlively wrote:
>
>> I think the lines still differ in that one tests wasm32 and the other tests 
>> wasm64 (the triples are different).
>
> Yeah, but the convention used elsewhere in this test file is to use 
> `WEBASSEMBLY:` (which is a common check prefix for both wasm32 and wasm64 run 
> lines) in the cases where the output is the same.

Ohh, for some reason I thought you were talking about the RUN lines. Yes, 
combining the check lines would make sense.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128282/new/

https://reviews.llvm.org/D128282

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D128282: [WebAssembly] Update test to run it in opaque pointers mode

2022-06-21 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.

I think the lines still differ in that one tests wasm32 and the other tests 
wasm64 (the triples are different).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128282/new/

https://reviews.llvm.org/D128282

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D127170: [WebAssembly] Implement remaining relaxed SIMD instructions

2022-06-08 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGaff679a48c43: [WebAssembly] Implement remaining relaxed SIMD 
instructions (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127170/new/

https://reviews.llvm.org/D127170

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -830,8 +830,13 @@
 # CHECK: f64x2.relaxed_max # encoding: [0xfd,0x90,0x02]
 f64x2.relaxed_max
 
-# TODO: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
-# TODO: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
-# TODO: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+# CHECK: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
+i16x8.relaxed_q15mulr_s
+
+# CHECK: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
+i16x8.dot_i8x16_i7x16_s
+
+# CHECK: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+i32x4.dot_i8x16_i7x16_add_s
 
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -368,6 +368,30 @@
   ret <8 x i16> %v
 }
 
+; CHECK-LABEL: relaxed_q15mulr_s_i16x8:
+; CHECK-NEXT: .functype relaxed_q15mulr_s_i16x8 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i16x8.relaxed_q15mulr_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <8 x i16> @llvm.wasm.relaxed.q15mulr.signed(<8 x i16>, <8 x i16>)
+define <8 x i16> @relaxed_q15mulr_s_i16x8(<8 x i16> %a, <8 x i16> %b) {
+  %v = call <8 x i16> @llvm.wasm.relaxed.q15mulr.signed(
+<8 x i16> %a, <8 x i16> %b
+  )
+  ret <8 x i16> %v
+}
+
+; CHECK-LABEL: dot_i8x16_i7x16_s_i16x8:
+; CHECK-NEXT: .functype dot_i8x16_i7x16_s_i16x8 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i16x8.dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
+define <8 x i16> @dot_i8x16_i7x16_s_i16x8(<16 x i8> %a, <16 x i8> %b) {
+  %v = call <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(
+<16 x i8> %a, <16 x i8> %b
+  )
+  ret <8 x i16> %v
+}
+
 ; ==
 ; 4 x i32
 ; ==
@@ -568,6 +592,20 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: dot_i8x16_i7x16_add_s_i32x4:
+; CHECK-NEXT: .functype dot_i8x16_i7x16_add_s_i32x4 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.dot_i8x16_i7x16_add_s $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.wasm.dot.i8x16.i7x16.add.signed(<16 x i8>, <16 x i8>,
+<4 x i32>)
+define <4 x i32> @dot_i8x16_i7x16_add_s_i32x4(<16 x i8> %a, <16 x i8> %b,
+  <4 x i32> %c) {
+  %v = call <4 x i32> @llvm.wasm.dot.i8x16.i7x16.add.signed(
+<16 x i8> %a, <16 x i8> %b, <4 x i32> %c
+  )
+  ret <4 x i32> %v
+}
+
 ; ==
 ; 2 x i64
 ; ==
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1403,7 +1403,6 @@
 defm "" : SIMDLANESELECT;
 defm "" : SIMDLANESELECT;
 
-
 //===--===//
 // Relaxed floating-point min and max.
 //===--===//
@@ -1426,3 +1425,30 @@
RelaxedBinary;
 defm SIMD_RELAXED_FMAX :
RelaxedBinary;
+
+//===--===//
+// Relaxed rounding q15 multiplication
+//===--===//
+
+defm RELAXED_Q15MULR_S :
+  RelaxedBinary;
+
+//===--===//
+// Relaxed integer dot product
+//===--===//
+
+defm RELAXED_DOT :
+  RELAXED_I<(outs V128:$dst), (ins V128:$lhs, V128:$rhs), (outs), (ins),
+

[PATCH] D127170: [WebAssembly] Implement remaining relaxed SIMD instructions

2022-06-07 Thread Thomas Lively via Phabricator via cfe-commits

tlively updated this revision to Diff 434968.
tlively added a comment.

- dot_add takes three operands


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127170/new/

https://reviews.llvm.org/D127170

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -830,8 +830,13 @@
 # CHECK: f64x2.relaxed_max # encoding: [0xfd,0x90,0x02]
 f64x2.relaxed_max
 
-# TODO: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
-# TODO: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
-# TODO: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+# CHECK: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
+i16x8.relaxed_q15mulr_s
+
+# CHECK: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
+i16x8.dot_i8x16_i7x16_s
+
+# CHECK: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+i32x4.dot_i8x16_i7x16_add_s
 
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -368,6 +368,30 @@
   ret <8 x i16> %v
 }
 
+; CHECK-LABEL: relaxed_q15mulr_s_i16x8:
+; CHECK-NEXT: .functype relaxed_q15mulr_s_i16x8 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i16x8.relaxed_q15mulr_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <8 x i16> @llvm.wasm.relaxed.q15mulr.signed(<8 x i16>, <8 x i16>)
+define <8 x i16> @relaxed_q15mulr_s_i16x8(<8 x i16> %a, <8 x i16> %b) {
+  %v = call <8 x i16> @llvm.wasm.relaxed.q15mulr.signed(
+<8 x i16> %a, <8 x i16> %b
+  )
+  ret <8 x i16> %v
+}
+
+; CHECK-LABEL: dot_i8x16_i7x16_s_i16x8:
+; CHECK-NEXT: .functype dot_i8x16_i7x16_s_i16x8 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i16x8.dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
+define <8 x i16> @dot_i8x16_i7x16_s_i16x8(<16 x i8> %a, <16 x i8> %b) {
+  %v = call <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(
+<16 x i8> %a, <16 x i8> %b
+  )
+  ret <8 x i16> %v
+}
+
 ; ==
 ; 4 x i32
 ; ==
@@ -568,6 +592,20 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: dot_i8x16_i7x16_add_s_i32x4:
+; CHECK-NEXT: .functype dot_i8x16_i7x16_add_s_i32x4 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.dot_i8x16_i7x16_add_s $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.wasm.dot.i8x16.i7x16.add.signed(<16 x i8>, <16 x i8>,
+<4 x i32>)
+define <4 x i32> @dot_i8x16_i7x16_add_s_i32x4(<16 x i8> %a, <16 x i8> %b,
+  <4 x i32> %c) {
+  %v = call <4 x i32> @llvm.wasm.dot.i8x16.i7x16.add.signed(
+<16 x i8> %a, <16 x i8> %b, <4 x i32> %c
+  )
+  ret <4 x i32> %v
+}
+
 ; ==
 ; 2 x i64
 ; ==
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1403,7 +1403,6 @@
 defm "" : SIMDLANESELECT;
 defm "" : SIMDLANESELECT;
 
-
 //===--===//
 // Relaxed floating-point min and max.
 //===--===//
@@ -1426,3 +1425,30 @@
RelaxedBinary;
 defm SIMD_RELAXED_FMAX :
RelaxedBinary;
+
+//===--===//
+// Relaxed rounding q15 multiplication
+//===--===//
+
+defm RELAXED_Q15MULR_S :
+  RelaxedBinary;
+
+//===--===//
+// Relaxed integer dot product
+//===--===//
+
+defm RELAXED_DOT :
+  RELAXED_I<(outs V128:$dst), (ins V128:$lhs, V128:$rhs), (outs), (ins),
+[(set (v8i16 V128:$dst), (int_wasm_dot_i8x16_i7x16_signed
+   (v16i8

[PATCH] D127170: [WebAssembly] Implement remaining relaxed SIMD instructions

2022-06-07 Thread Thomas Lively via Phabricator via cfe-commits

tlively planned changes to this revision.
tlively added inline comments.



Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:190
+TARGET_BUILTIN(__builtin_wasm_dot_i8x16_i7x16_s_i16x8, "V8sV16ScV16Sc", "nc", 
"relaxed-simd")
+TARGET_BUILTIN(__builtin_wasm_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16Sc", 
"nc", "relaxed-simd")
+

aheejin wrote:
> Doesn't `i32x4.dot_i8x16_i7x16_add_s` take three operands? 
> https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#relaxed-integer-dot-product
Oh wow, good catch  


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127170/new/

https://reviews.llvm.org/D127170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D127170: [WebAssembly] Implement remaining relaxed SIMD instructions

2022-06-06 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added a reviewer: aheejin.
Herald added subscribers: pmatos, asb, wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100, dschuff.
Herald added a project: All.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Add codegen, intrinsics, and builtins for the i16x8.relaxed_q15mulr_s,
i16x8.dot_i8x16_i7x16_s, and i32x4.dot_i8x16_i7x16_add_s instructions. These are
the last instructions from the relaxed SIMD proposal[1] that had not been
implemented.

[1]:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D127170

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -830,8 +830,13 @@
 # CHECK: f64x2.relaxed_max # encoding: [0xfd,0x90,0x02]
 f64x2.relaxed_max
 
-# TODO: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
-# TODO: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
-# TODO: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+# CHECK: i16x8.relaxed_q15mulr_s # encoding: [0xfd,0x91,0x02]
+i16x8.relaxed_q15mulr_s
+
+# CHECK: i16x8.dot_i8x16_i7x16_s # encoding: [0xfd,0x92,0x02]
+i16x8.dot_i8x16_i7x16_s
+
+# CHECK: i32x4.dot_i8x16_i7x16_add_s # encoding: [0xfd,0x93,0x02]
+i32x4.dot_i8x16_i7x16_add_s
 
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -368,6 +368,30 @@
   ret <8 x i16> %v
 }
 
+; CHECK-LABEL: relaxed_q15mulr_s_i16x8:
+; CHECK-NEXT: .functype relaxed_q15mulr_s_i16x8 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i16x8.relaxed_q15mulr_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <8 x i16> @llvm.wasm.relaxed.q15mulr.signed(<8 x i16>, <8 x i16>)
+define <8 x i16> @relaxed_q15mulr_s_i16x8(<8 x i16> %a, <8 x i16> %b) {
+  %v = call <8 x i16> @llvm.wasm.relaxed.q15mulr.signed(
+<8 x i16> %a, <8 x i16> %b
+  )
+  ret <8 x i16> %v
+}
+
+; CHECK-LABEL: dot_i8x16_i7x16_s_i16x8:
+; CHECK-NEXT: .functype dot_i8x16_i7x16_s_i16x8 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i16x8.dot_i8x16_i7x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(<16 x i8>, <16 x i8>)
+define <8 x i16> @dot_i8x16_i7x16_s_i16x8(<16 x i8> %a, <16 x i8> %b) {
+  %v = call <8 x i16> @llvm.wasm.dot.i8x16.i7x16.signed(
+<16 x i8> %a, <16 x i8> %b
+  )
+  ret <8 x i16> %v
+}
+
 ; ==
 ; 4 x i32
 ; ==
@@ -568,6 +592,18 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: dot_i8x16_i7x16_add_s_i32x4:
+; CHECK-NEXT: .functype dot_i8x16_i7x16_add_s_i32x4 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.dot_i8x16_i7x16_add_s $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.wasm.dot.i8x16.i7x16.add.signed(<16 x i8>, <16 x i8>)
+define <4 x i32> @dot_i8x16_i7x16_add_s_i32x4(<16 x i8> %a, <16 x i8> %b) {
+  %v = call <4 x i32> @llvm.wasm.dot.i8x16.i7x16.add.signed(
+<16 x i8> %a, <16 x i8> %b
+  )
+  ret <4 x i32> %v
+}
+
 ; ==
 ; 2 x i64
 ; ==
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1403,7 +1403,6 @@
 defm "" : SIMDLANESELECT;
 defm "" : SIMDLANESELECT;
 
-
 //===--===//
 // Relaxed floating-point min and max.
 //===--===//
@@ -1426,3 +1425,29 @@
RelaxedBinary;
 defm SIMD_RELAXED_FMAX :
RelaxedBinary;
+
+//===--===//
+// Relaxed rounding q15 multiplication
+//===--===//
+
+defm RELAXED_Q15MULR_S :
+  RelaxedBinary;
+

[PATCH] D123498: [clang] Adding Platform/Architecture Specific Resource Header Installation Targets

2022-04-11 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

No concerns about the WebAssembly changes. Will the default installation for 
folks who don't touch `LLVM_DISTRIBUTION_COMPONENTS` change?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123498/new/

https://reviews.llvm.org/D123498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121661: [WebAssembly] Fix names of SIMD instructions containing '_zero'

2022-03-16 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG7e8913d775ca: [WebAssembly] Fix names of SIMD instructions 
containing _zero (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121661/new/

https://reviews.llvm.org/D121661

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-conversions.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -313,8 +313,8 @@
 # CHECK: v128.load64_zero 32 # encoding: [0xfd,0x5d,0x03,0x20]
 v128.load64_zero 32
 
-# CHECK: f32x4.demote_zero_f64x2 # encoding: [0xfd,0x5e]
-f32x4.demote_zero_f64x2
+# CHECK: f32x4.demote_f64x2_zero # encoding: [0xfd,0x5e]
+f32x4.demote_f64x2_zero
 
 # CHECK: f64x2.promote_low_f32x4 # encoding: [0xfd,0x5f]
 f64x2.promote_low_f32x4
@@ -767,11 +767,11 @@
 # CHECK: f32x4.convert_i32x4_u # encoding: [0xfd,0xfb,0x01]
 f32x4.convert_i32x4_u
 
-# CHECK: i32x4.trunc_sat_zero_f64x2_s # encoding: [0xfd,0xfc,0x01]
-i32x4.trunc_sat_zero_f64x2_s
+# CHECK: i32x4.trunc_sat_f64x2_s_zero # encoding: [0xfd,0xfc,0x01]
+i32x4.trunc_sat_f64x2_s_zero
 
-# CHECK: i32x4.trunc_sat_zero_f64x2_u # encoding: [0xfd,0xfd,0x01]
-i32x4.trunc_sat_zero_f64x2_u
+# CHECK: i32x4.trunc_sat_f64x2_u_zero # encoding: [0xfd,0xfd,0x01]
+i32x4.trunc_sat_f64x2_u_zero
 
 # CHECK: f64x2.convert_low_i32x4_s # encoding: [0xfd,0xfe,0x01]
 f64x2.convert_low_i32x4_s
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -466,48 +466,48 @@
   ret <4 x i32> %a
 }
 
-; CHECK-LABEL: trunc_sat_zero_s_v4i32:
-; CHECK-NEXT: .functype trunc_sat_zero_s_v4i32 (v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.trunc_sat_zero_f64x2_s $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-LABEL: trunc_sat_s_zero_v4i32:
+; CHECK-NEXT: .functype trunc_sat_s_zero_v4i32 (v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.trunc_sat_f64x2_s_zero $push[[R:[0-9]+]]=, $0{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
 declare <2 x i32> @llvm.fptosi.sat.v2i32.v2f64(<2 x double>)
-define <4 x i32> @trunc_sat_zero_s_v4i32(<2 x double> %x) {
+define <4 x i32> @trunc_sat_s_zero_v4i32(<2 x double> %x) {
   %v = call <2 x i32> @llvm.fptosi.sat.v2i32.v2f64(<2 x double> %x)
   %a = shufflevector <2 x i32> %v, <2 x i32> ,
<4 x i32> 
   ret <4 x i32> %a
 }
 
-; CHECK-LABEL: trunc_sat_zero_s_v4i32_2:
-; CHECK-NEXT: .functype trunc_sat_zero_s_v4i32_2 (v128) -> (v128){{$}}
-; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_s $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-LABEL: trunc_sat_s_zero_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_s_zero_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_f64x2_s_zero $push[[R:[0-9]+]]=, $0{{$}}
 ; SLOW-NEXT: return $pop[[R]]{{$}}
 declare <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double>)
-define <4 x i32> @trunc_sat_zero_s_v4i32_2(<2 x double> %x) {
+define <4 x i32> @trunc_sat_s_zero_v4i32_2(<2 x double> %x) {
   %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
<4 x i32> 
   %a = call <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double> %v)
   ret <4 x i32> %a
 }
 
-; CHECK-LABEL: trunc_sat_zero_u_v4i32:
-; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32 (v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-LABEL: trunc_sat_u_zero_v4i32:
+; CHECK-NEXT: .functype trunc_sat_u_zero_v4i32 (v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.trunc_sat_f64x2_u_zero $push[[R:[0-9]+]]=, $0{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
 declare <2 x i32> @llvm.fptoui.sat.v2i32.v2f64(<2 x double>)
-define <4 x i32> @trunc_sat_zero_u_v4i32(<2 x double> %x) {
+define <4 x i32> @trunc_sat_u_zero_v4i32(<2 x double> %x) {
   %v = call <2 x i32> @llvm.fptoui.sat.v2i32.v2f64(<2 x double> %x)
   %a = shufflevector <2 x i32> %v, <2 x i32> ,
<4 x i32> 
   ret <4 x i32> %a
 }
 
-; CHECK-LABEL: trunc_sat_zero_u_v4i32_2:
-; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32_2 (v128) -> (v128){{$}}
-; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-LABEL: trunc_sat_u_zero_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_u_zero_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_f64x2_u_zero $push[[R:[0-9]+]]=, $0{{$}}
 ; SLOW-NEXT: return

[PATCH] D121661: [WebAssembly] Fix names of SIMD instructions containing '_zero'

2022-03-14 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
Herald added a project: All.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Fix the instruction names to match the WebAssembly spec:

- `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero`
- `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero`

Also rename related things like intrinsics, builtins, and test functions to
match.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D121661

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-conversions.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -313,8 +313,8 @@
 # CHECK: v128.load64_zero 32 # encoding: [0xfd,0x5d,0x03,0x20]
 v128.load64_zero 32
 
-# CHECK: f32x4.demote_zero_f64x2 # encoding: [0xfd,0x5e]
-f32x4.demote_zero_f64x2
+# CHECK: f32x4.demote_f64x2_zero # encoding: [0xfd,0x5e]
+f32x4.demote_f64x2_zero
 
 # CHECK: f64x2.promote_low_f32x4 # encoding: [0xfd,0x5f]
 f64x2.promote_low_f32x4
@@ -767,11 +767,11 @@
 # CHECK: f32x4.convert_i32x4_u # encoding: [0xfd,0xfb,0x01]
 f32x4.convert_i32x4_u
 
-# CHECK: i32x4.trunc_sat_zero_f64x2_s # encoding: [0xfd,0xfc,0x01]
-i32x4.trunc_sat_zero_f64x2_s
+# CHECK: i32x4.trunc_sat_f64x2_s_zero # encoding: [0xfd,0xfc,0x01]
+i32x4.trunc_sat_f64x2_s_zero
 
-# CHECK: i32x4.trunc_sat_zero_f64x2_u # encoding: [0xfd,0xfd,0x01]
-i32x4.trunc_sat_zero_f64x2_u
+# CHECK: i32x4.trunc_sat_f64x2_u_zero # encoding: [0xfd,0xfd,0x01]
+i32x4.trunc_sat_f64x2_u_zero
 
 # CHECK: f64x2.convert_low_i32x4_s # encoding: [0xfd,0xfe,0x01]
 f64x2.convert_low_i32x4_s
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -466,48 +466,48 @@
   ret <4 x i32> %a
 }
 
-; CHECK-LABEL: trunc_sat_zero_s_v4i32:
-; CHECK-NEXT: .functype trunc_sat_zero_s_v4i32 (v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.trunc_sat_zero_f64x2_s $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-LABEL: trunc_sat_s_zero_v4i32:
+; CHECK-NEXT: .functype trunc_sat_s_zero_v4i32 (v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.trunc_sat_f64x2_s_zero $push[[R:[0-9]+]]=, $0{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
 declare <2 x i32> @llvm.fptosi.sat.v2i32.v2f64(<2 x double>)
-define <4 x i32> @trunc_sat_zero_s_v4i32(<2 x double> %x) {
+define <4 x i32> @trunc_sat_s_zero_v4i32(<2 x double> %x) {
   %v = call <2 x i32> @llvm.fptosi.sat.v2i32.v2f64(<2 x double> %x)
   %a = shufflevector <2 x i32> %v, <2 x i32> ,
<4 x i32> 
   ret <4 x i32> %a
 }
 
-; CHECK-LABEL: trunc_sat_zero_s_v4i32_2:
-; CHECK-NEXT: .functype trunc_sat_zero_s_v4i32_2 (v128) -> (v128){{$}}
-; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_s $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-LABEL: trunc_sat_s_zero_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_s_zero_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_f64x2_s_zero $push[[R:[0-9]+]]=, $0{{$}}
 ; SLOW-NEXT: return $pop[[R]]{{$}}
 declare <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double>)
-define <4 x i32> @trunc_sat_zero_s_v4i32_2(<2 x double> %x) {
+define <4 x i32> @trunc_sat_s_zero_v4i32_2(<2 x double> %x) {
   %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
<4 x i32> 
   %a = call <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double> %v)
   ret <4 x i32> %a
 }
 
-; CHECK-LABEL: trunc_sat_zero_u_v4i32:
-; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32 (v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-LABEL: trunc_sat_u_zero_v4i32:
+; CHECK-NEXT: .functype trunc_sat_u_zero_v4i32 (v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.trunc_sat_f64x2_u_zero $push[[R:[0-9]+]]=, $0{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
 declare <2 x i32> @llvm.fptoui.sat.v2i32.v2f64(<2 x double>)
-define <4 x i32> @trunc_sat_zero_u_v4i32(<2 x double> %x) {
+define <4 x i32> @trunc_sat_u_zero_v4i32(<2 x double> %x) {
   %v = call <2 x i32> @llvm.fptoui.sat.v2i32.v2f64(<2 x double> %x)
   %a = shufflevector <2 x i32> %v, <2 x i32> ,
<4 x i32> 
   ret <4 x i32> %a
 }
 
-; CHECK-LABEL: trunc_sat_zero_u_v4i32_2:
-; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32_2 (v128) ->

[PATCH] D121014: [WebAssembly] Check bulk-memory when adjusting lang opts

2022-03-04 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

@dyung, this should be fixed by 
https://reviews.llvm.org/rGc01ec3083026f7e24e6c06f48a05d413e2c697d4


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121014/new/

https://reviews.llvm.org/D121014

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D121014: [WebAssembly] Check bulk-memory when adjusting lang opts

2022-03-04 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG3be9e0ba972c: [WebAssembly] Check bulk-memory when adjusting 
lang opts (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121014/new/

https://reviews.llvm.org/D121014

Files:
  clang/lib/Basic/Targets/WebAssembly.cpp
  clang/test/CodeGenCXX/static-init-wasm.cpp


Index: clang/test/CodeGenCXX/static-init-wasm.cpp
===
--- clang/test/CodeGenCXX/static-init-wasm.cpp
+++ clang/test/CodeGenCXX/static-init-wasm.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature 
+atomics -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature 
+atomics -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=WEBASSEMBLY32
-// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature 
+atomics -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature 
+atomics -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=WEBASSEMBLY64
 
 // Test that we don't create common blocks.
@@ -53,9 +53,9 @@
 // WEBASSEMBLY64: define internal void @_GLOBAL__sub_I_static_init_wasm.cpp() 
#3 {
 // WEBASSEMBLY64: call void @__cxx_global_var_init()
 
-// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature 
+bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=NOATOMICS
-// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature 
+bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=NOATOMICS
 
 // NOATOMICS-LABEL: @_Z1gv()
@@ -66,3 +66,17 @@
 // NOATOMICS-NOT:   __cxa_guard_acquire
 // NOATOMICS:   [[END]]:
 // NOATOMICS-NEXT:  ret void
+
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature 
+atomics -o - %s \
+// RUN:   | FileCheck %s -check-prefix=NOBULKMEM
+// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature 
+atomics -o - %s \
+// RUN:   | FileCheck %s -check-prefix=NOBULKMEM
+
+// NOBULKMEM-LABEL: @_Z1gv()
+// NOBULKMEM:   %[[R0:.+]] = load i8, i8* @_ZGVZ1gvE1a, align 1
+// NOBULKMEM-NEXT:  %guard.uninitialized = icmp eq i8 %[[R0]], 0
+// NOBULKMEM-NEXT:  br i1 %guard.uninitialized, label %[[CHECK:.+]], label 
%[[END:.+]],
+// NOBULKMEM:   [[CHECK]]:
+// NOBULKMEM-NOT:   __cxa_guard_acquire
+// NOBULKMEM:   [[END]]:
+// NOBULKMEM-NEXT:  ret void
Index: clang/lib/Basic/Targets/WebAssembly.cpp
===
--- clang/lib/Basic/Targets/WebAssembly.cpp
+++ clang/lib/Basic/Targets/WebAssembly.cpp
@@ -256,9 +256,10 @@
 void WebAssemblyTargetInfo::adjust(DiagnosticsEngine ,
LangOptions ) {
   TargetInfo::adjust(Diags, Opts);
-  // If the Atomics feature isn't available, turn off POSIXThreads and
-  // ThreadModel, so that we don't predefine _REENTRANT or __STDCPP_THREADS__.
-  if (!HasAtomics) {
+  // Turn off POSIXThreads and ThreadModel so that we don't predefine 
_REENTRANT
+  // or __STDCPP_THREADS__ if we will eventually end up stripping atomics
+  // because they are unsupported.
+  if (!HasAtomics || !HasBulkMemory) {
 Opts.POSIXThreads = false;
 Opts.setThreadModel(LangOptions::ThreadModelKind::Single);
 Opts.ThreadsafeStatics = false;


Index: clang/test/CodeGenCXX/static-init-wasm.cpp
===
--- clang/test/CodeGenCXX/static-init-wasm.cpp
+++ clang/test/CodeGenCXX/static-init-wasm.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature +atomics -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature +atomics -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=WEBASSEMBLY32
-// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature +atomics -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature +atomics -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=WEBASSEMBLY64
 
 // Test that we don't create common blocks.
@@ -53,9 +53,9 @@
 // WEBASSEMBLY64: define internal void @_GLOBAL__sub_I_static_init_wasm.cpp() #3 {
 // WEBASSEMBLY64: call void @__cxx_global_var_init()
 
-// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=NOATOMICS
-// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -o - %s \
+// RUN: %clang_cc1 -emit-llvm

[PATCH] D121014: [WebAssembly] Check bulk-memory when adjusting lang opts

2022-03-04 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added a reviewer: sbc100.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, dschuff.
Herald added a project: All.
tlively requested review of this revision.
Herald added subscribers: cfe-commits, aheejin.
Herald added a project: clang.

We previously had logic to disable pthreads, set the ThreadModel to Single, and
disable thread-safe statics when the atomics target features is disabled, since
that means that the resulting program will not be used in a threaded context.
Similarly check for the presence of the bulk-memory feature, since that is also
necessary to produce multithreaded programs.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D121014

Files:
  clang/lib/Basic/Targets/WebAssembly.cpp
  clang/test/CodeGenCXX/static-init-wasm.cpp


Index: clang/test/CodeGenCXX/static-init-wasm.cpp
===
--- clang/test/CodeGenCXX/static-init-wasm.cpp
+++ clang/test/CodeGenCXX/static-init-wasm.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature 
+atomics -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature 
+atomics -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=WEBASSEMBLY32
-// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature 
+atomics -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature 
+atomics -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=WEBASSEMBLY64
 
 // Test that we don't create common blocks.
@@ -53,9 +53,9 @@
 // WEBASSEMBLY64: define internal void @_GLOBAL__sub_I_static_init_wasm.cpp() 
#3 {
 // WEBASSEMBLY64: call void @__cxx_global_var_init()
 
-// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature 
+bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=NOATOMICS
-// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature 
+bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=NOATOMICS
 
 // NOATOMICS-LABEL: @_Z1gv()
@@ -66,3 +66,17 @@
 // NOATOMICS-NOT:   __cxa_guard_acquire
 // NOATOMICS:   [[END]]:
 // NOATOMICS-NEXT:  ret void
+
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature 
+atomics -o - %s \
+// RUN:   | FileCheck %s -check-prefix=NOBULKMEM
+// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature 
+atomics -o - %s \
+// RUN:   | FileCheck %s -check-prefix=NOBULKMEM
+
+// NOBULKMEM-LABEL: @_Z1gv()
+// NOBULKMEM:   %[[R0:.+]] = load i8, i8* @_ZGVZ1gvE1a, align 1
+// NOBULKMEM-NEXT:  %guard.uninitialized = icmp eq i8 %[[R0]], 0
+// NOBULKMEM-NEXT:  br i1 %guard.uninitialized, label %[[CHECK:.+]], label 
%[[END:.+]],
+// NOBULKMEM:   [[CHECK]]:
+// NOBULKMEM-NOT:   __cxa_guard_acquire
+// NOBULKMEM:   [[END]]:
+// NOBULKMEM-NEXT:  ret void
Index: clang/lib/Basic/Targets/WebAssembly.cpp
===
--- clang/lib/Basic/Targets/WebAssembly.cpp
+++ clang/lib/Basic/Targets/WebAssembly.cpp
@@ -256,9 +256,10 @@
 void WebAssemblyTargetInfo::adjust(DiagnosticsEngine ,
LangOptions ) {
   TargetInfo::adjust(Diags, Opts);
-  // If the Atomics feature isn't available, turn off POSIXThreads and
-  // ThreadModel, so that we don't predefine _REENTRANT or __STDCPP_THREADS__.
-  if (!HasAtomics) {
+  // Turn off POSIXThreads and ThreadModel so that we don't predefine 
_REENTRANT
+  // or __STDCPP_THREADS__ if we will eventually end up stripping atomics
+  // because they are unsupported.
+  if (!HasAtomics || !HasBulkMemory) {
 Opts.POSIXThreads = false;
 Opts.setThreadModel(LangOptions::ThreadModelKind::Single);
 Opts.ThreadsafeStatics = false;


Index: clang/test/CodeGenCXX/static-init-wasm.cpp
===
--- clang/test/CodeGenCXX/static-init-wasm.cpp
+++ clang/test/CodeGenCXX/static-init-wasm.cpp
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature +atomics -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm32-unknown-unknown -target-feature +atomics -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=WEBASSEMBLY32
-// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature +atomics -o - %s \
+// RUN: %clang_cc1 -emit-llvm -triple=wasm64-unknown-unknown -target-feature +atomics -target-feature +bulk-memory -o - %s \
 // RUN:   | FileCheck %s -check-prefix=WEBASSEMBLY64
 
 // Test that we don't create common blocks.
@@ -53,9 +53,9 @@
 // WEBASSEMBLY64: define internal void @_GLOBAL__sub_I_static_init_wasm.cpp() #3 {
 // WEBASSEMBLY64: call void

[PATCH] D112186: [WebAssembly] Add prototype relaxed float to int trunc instructions

2021-10-28 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGfb67f3d96980: [WebAssembly] Add prototype relaxed float to 
int trunc instructions (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112186/new/

https://reviews.llvm.org/D112186

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -818,4 +818,16 @@
 # CHECK: f64x2.relaxed_max # encoding: [0xfd,0xee,0x01]
 f64x2.relaxed_max
 
+# CHECK: i32x4.relaxed_trunc_f32x4_s # encoding: [0xfd,0xa5,0x01]
+i32x4.relaxed_trunc_f32x4_s
+
+# CHECK: i32x4.relaxed_trunc_f32x4_u # encoding: [0xfd,0xa6,0x01]
+i32x4.relaxed_trunc_f32x4_u
+
+# CHECK: i32x4.relaxed_trunc_f64x2_s_zero # encoding: [0xfd,0xc5,0x01]
+i32x4.relaxed_trunc_f64x2_s_zero
+
+# CHECK: i32x4.relaxed_trunc_f64x2_u_zero # encoding: [0xfd,0xc6,0x01]
+i32x4.relaxed_trunc_f64x2_u_zero
+
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -526,6 +526,48 @@
   ret <4 x i32> %v
 }
 
+; CHECK-LABEL: relaxed_trunc_s:
+; NO-CHECK-NOT: f32x4
+; CHECK-NEXT: .functype relaxed_trunc_s (v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.relaxed_trunc_f32x4_s $push[[R:[0-9]+]]=, $0
+; CHECK-NEXT: return $pop[[R]]
+declare <4 x i32> @llvm.wasm.relaxed.trunc.signed(<4 x float>)
+define <4 x i32> @relaxed_trunc_s(<4 x float> %x) {
+  %a = call <4 x i32> @llvm.wasm.relaxed.trunc.signed(<4 x float> %x)
+  ret <4 x i32> %a
+}
+
+; CHECK-LABEL: relaxed_trunc_u:
+; NO-CHECK-NOT: f32x4
+; CHECK-NEXT: .functype relaxed_trunc_u (v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.relaxed_trunc_f32x4_u $push[[R:[0-9]+]]=, $0
+; CHECK-NEXT: return $pop[[R]]
+declare <4 x i32> @llvm.wasm.relaxed.trunc.unsigned(<4 x float>)
+define <4 x i32> @relaxed_trunc_u(<4 x float> %x) {
+  %a = call <4 x i32> @llvm.wasm.relaxed.trunc.unsigned(<4 x float> %x)
+  ret <4 x i32> %a
+}
+
+; CHECK-LABEL: relaxed_trunc_zero_s:
+; CHECK-NEXT: .functype relaxed_trunc_zero_s (v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.relaxed_trunc_f64x2_s_zero $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.wasm.relaxed.trunc.zero.signed(<2 x double>)
+define <4 x i32> @relaxed_trunc_zero_s(<2 x double> %x) {
+  %a = call <4 x i32> @llvm.wasm.relaxed.trunc.zero.signed(<2 x double> %x)
+  ret <4 x i32> %a
+}
+
+; CHECK-LABEL: relaxed_trunc_zero_u:
+; CHECK-NEXT: .functype relaxed_trunc_zero_u (v128) -> (v128){{$}}
+; CHECK-NEXT: i32x4.relaxed_trunc_f64x2_u_zero $push[[R:[0-9]+]]=, $0{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.wasm.relaxed.trunc.zero.unsigned(<2 x double>)
+define <4 x i32> @relaxed_trunc_zero_u(<2 x double> %x) {
+  %a = call <4 x i32> @llvm.wasm.relaxed.trunc.zero.unsigned(<2 x double> %x)
+  ret <4 x i32> %a
+}
+
 ; ==
 ; 2 x i64
 ; ==
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1392,3 +1392,20 @@
 
 defm "" : SIMD_RELAXED_FMINMAX;
 defm "" : SIMD_RELAXED_FMINMAX;
+
+//===--===//
+// Relaxed floating-point to int conversions
+//===--===//
+
+multiclass SIMD_RELAXED_CONVERT simdop> {
+  defm op#_#vec :
+RELAXED_I<(outs V128:$dst), (ins V128:$vec), (outs), (ins),
+  [(set (vec.vt V128:$dst), (vec.vt (op (arg.vt V128:$vec],
+  vec.prefix#"."#name#"\t$dst, $vec", vec.prefix#"."#name, simdop>;
+}
+
+defm "" : SIMD_RELAXED_CONVERT;
+defm "" : SIMD_RELAXED_CONVERT;
+
+defm "" : SIMD_RELAXED_CONVERT;
+defm "" : SIMD_RELAXED_CONVERT;
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -214,6 +214,27 @@
 [LLVMMatchType<0>, LLVMMatchType<0>],
 [IntrNoMem, IntrSpeculatable]>;
 
+def

[PATCH] D112186: [WebAssembly] Add prototype relaxed float to int trunc instructions

2021-10-28 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

LGTM, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112186/new/

https://reviews.llvm.org/D112186

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D112186: [WebAssembly] Add prototype relaxed float to int trunc instructions

2021-10-26 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:18373-18374
+llvm::Type *SrcT = Vec->getType();
+llvm::Type *TruncT =
+FixedVectorType::get(llvm::IntegerType::get(getLLVMContext(), 32), 4);
+Function *Callee = CGM.getIntrinsic(IntNo, {TruncT, SrcT});

This can be `ConvertType(E->getType())` to calculate the LLVM type from the 
clang type for the call. However, it would be even better to not make these 
intrinsics polymorphic.



Comment at: llvm/include/llvm/IR/IntrinsicsWebAssembly.td:218-219
+def int_wasm_relaxed_trunc_signed:
+  Intrinsic<[llvm_anyvector_ty],
+[llvm_anyvector_ty],
+[IntrNoMem, IntrSpeculatable]>;

Since these intrinsics only need to work with a single set of types each, it 
would be simpler to specify those types rather than make the intrinsics 
polymorphic. See the definition of `int_wasm_q15mulr_sat_signed` for an 
example. That would allow you to drop the type arguments from 
`CGM.getIntrinsic()` in CGBuiltin.cpp.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112186/new/

https://reviews.llvm.org/D112186

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D112186: [WebAssembly] Add prototype relaxed float to int trunc instructions

2021-10-26 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:18262-18267
+case WebAssembly::BI__builtin_wasm_relaxed_trunc_zero_s_i32x4_f64x2:
+  IntNo = Intrinsic::wasm_relaxed_trunc_zero_signed;
+  break;
+case WebAssembly::BI__builtin_wasm_relaxed_trunc_zero_u_i32x4_f64x2:
+  IntNo = Intrinsic::wasm_relaxed_trunc_zero_unsigned;
+  break;

ngzhian wrote:
> @tlively I'm having trouble with this, getting this stack trace
> 
> ```
> WidenVectorResult #0: t4: v2i32 = llvm.wasm.relaxed.trunc.zero.signed 
> TargetConstant:i32<9112>, t2
> 
> Do not know how to widen the result of this operator!
> UNREACHABLE executed at 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp:3035!
> PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash 
> backtrace.
> Stack dump:
> 0.  Program arguments: 
> /usr/local/google/home/zhin/ssd2/llvm-project/build-wasm/bin/llc 
> -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt 
> -wasm-disable-explicit-locals -wasm-keep-registers 
> -mattr=+simd128,+relaxed-simd -debug
> 1.  Running pass 'Function Pass Manager' on module ''.
> 2.  Running pass 'WebAssembly Instruction Selection' on function 
> '@relaxed_trunc_zero_s_v4i32'
>  #0 0x7f3012db05bb llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/Support/Unix/Signals.inc:565:22
>  #1 0x7f3012db0672 PrintStackTraceSignalHandler(void*) 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
>  #2 0x7f3012dae668 llvm::sys::RunSignalHandlers() 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/Support/Signals.cpp:97:20
>  #3 0x7f3012db000e SignalHandler(int) 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
>  #4 0x7f301276fef0 (/lib/x86_64-linux-gnu/libc.so.6+0x3cef0)
>  #5 0x7f301276fe71 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:50:1
>  #6 0x7f3012759536 abort ./stdlib/abort.c:81:7
>  #7 0x7f3012c5a974 bindingsErrorHandler(void*, char const*, bool) 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/Support/ErrorHandling.cpp:218:55
>  #8 0x7f3016e7856c 
> llvm::DAGTypeLegalizer::WidenVectorResult(llvm::SDNode*, unsigned int) 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp:3037:71
>  #9 0x7f3016e4a338 llvm::DAGTypeLegalizer::run() 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:280:17
> #10 0x7f3016e4e34f llvm::SelectionDAG::LegalizeTypes() 
> /usr/local/google/home/zhin/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:1055:37
> ```
> 
> Do I need to add some stuff to LegalizeTypes?
Sorry for the delay in review! The issue is that the code below generates the 
intrinsic calls with return type <2 x i32>, which is not a legal type, as well 
as a shuffle to get back to <4 x i32>. For the existing builtins, we use 
target-independent LLVM intrinsics, so LLVM already knows how to expand them, 
but it doesn't know how to expand these new target-specific intrinsics. Rather 
than piggybacking on the  existing logic here, it would be good to generate 
intrinsics that return the expected <4 x i32> type (and no shuffle) so there is 
nothing to expand.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112186/new/

https://reviews.llvm.org/D112186

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D112146: [WebAssembly] Add prototype relaxed float min max instructions

2021-10-20 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe1fb13401e1b: [WebAssembly] Add prototype relaxed float min 
max instructions (authored by ngzhian, committed by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112146/new/

https://reviews.llvm.org/D112146

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -806,4 +806,16 @@
 # CHECK: i8x16.relaxed_swizzle # encoding: [0xfd,0xa2,0x01]
 i8x16.relaxed_swizzle
 
+# CHECK: f32x4.relaxed_min # encoding: [0xfd,0xb4,0x01]
+f32x4.relaxed_min
+
+# CHECK: f32x4.relaxed_max # encoding: [0xfd,0xe2,0x01]
+f32x4.relaxed_max
+
+# CHECK: f64x2.relaxed_min # encoding: [0xfd,0xd4,0x01]
+f64x2.relaxed_min
+
+# CHECK: f64x2.relaxed_max # encoding: [0xfd,0xee,0x01]
+f64x2.relaxed_max
+
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -682,6 +682,30 @@
   ret <4 x float> %v
 }
 
+; CHECK-LABEL: relaxed_min_v4f32:
+; CHECK-NEXT: .functype relaxed_min_v4f32 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.relaxed_min $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.relaxed.min.v4f32(<4 x float>, <4 x float>)
+define <4 x float> @relaxed_min_v4f32(<4 x float> %a, <4 x float> %b) {
+  %v = call <4 x float> @llvm.wasm.relaxed.min.v4f32(
+<4 x float> %a, <4 x float> %b
+  )
+  ret <4 x float> %v
+}
+
+; CHECK-LABEL: relaxed_max_v4f32:
+; CHECK-NEXT: .functype relaxed_max_v4f32 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.relaxed_max $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.relaxed.max.v4f32(<4 x float>, <4 x float>)
+define <4 x float> @relaxed_max_v4f32(<4 x float> %a, <4 x float> %b) {
+  %v = call <4 x float> @llvm.wasm.relaxed.max.v4f32(
+<4 x float> %a, <4 x float> %b
+  )
+  ret <4 x float> %v
+}
+
 ; ==
 ; 2 x f64
 ; ==
@@ -780,3 +804,27 @@
   )
   ret <2 x double> %v
 }
+
+; CHECK-LABEL: relaxed_min_v2f64:
+; CHECK-NEXT: .functype relaxed_min_v2f64 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.relaxed_min $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.relaxed.min.v2f64(<2 x double>, <2 x double>)
+define <2 x double> @relaxed_min_v2f64(<2 x double> %a, <2 x double> %b) {
+  %v = call <2 x double> @llvm.wasm.relaxed.min.v2f64(
+<2 x double> %a, <2 x double> %b
+  )
+  ret <2 x double> %v
+}
+
+; CHECK-LABEL: relaxed_max_v2f64:
+; CHECK-NEXT: .functype relaxed_max_v2f64 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.relaxed_max $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.relaxed.max.v2f64(<2 x double>, <2 x double>)
+define <2 x double> @relaxed_max_v2f64(<2 x double> %a, <2 x double> %b) {
+  %v = call <2 x double> @llvm.wasm.relaxed.max.v2f64(
+<2 x double> %a, <2 x double> %b
+  )
+  ret <2 x double> %v
+}
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1372,3 +1372,23 @@
  [(set (v16i8 V128:$dst),
(int_wasm_relaxed_swizzle (v16i8 V128:$src), (v16i8 V128:$mask)))],
  "i8x16.relaxed_swizzle\t$dst, $src, $mask", "i8x16.relaxed_swizzle", 162>;
+
+//===--===//
+// Relaxed floating-point min and max.
+//===--===//
+
+multiclass SIMD_RELAXED_FMINMAX simdopMin, bits<32> simdopMax> {
+  defm RELAXED_FMIN_#vec :
+RELAXED_I<(outs V128:$dst), (ins V128:$a, V128:$b), (outs), (ins),
+  [(set (vec.vt V128:$dst), (int_wasm_relaxed_min
+(vec.vt V128:$a), (vec.vt V128:$b)))],
+  vec.prefix#".relaxed_min\t$dst, $a, $b", vec.prefix#".relaxed_min", simdopMin>;
+  defm RELAXED_FMAX_#vec :
+RELAXED_I<(outs V128:$dst), (ins V128:$a,

[PATCH] D112146: [WebAssembly] Add prototype relaxed float min max instructions

2021-10-20 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

LGTM!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112146/new/

https://reviews.llvm.org/D112146

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D111154: [WebAssembly] Implementation of table.get/set for reftypes in LLVM IR

2021-10-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

Nice! Just a couple nits but I think this is good to go.




Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1478-1479
+  } else {
+GA = dyn_cast(Base->getOperand(0));
+if (!GA) {
+  // This might be Case 1 above (or an error)

It would be nice to switch the `if` arms around so the `else` corresponds with 
the null case.



Comment at: llvm/lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp:72-75
+GlobalVT->getArrayElementType()->getPointerAddressSpace() ==
+WebAssembly::WasmAddressSpace::WASM_ADDRESS_SPACE_FUNCREF
+? MVT::funcref
+: MVT::externref;

It would be good to explicitly check the externref address space as well so we 
can error out if we get an unexpected address space.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54/new/

https://reviews.llvm.org/D54

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D112022: [WebAssembly] Add prototype relaxed swizzle instructions

2021-10-19 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG2542bfa43a97: [WebAssembly] Add prototype relaxed swizzle 
instructions (authored by ngzhian, committed by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112022/new/

https://reviews.llvm.org/D112022

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -803,4 +803,7 @@
 # CHECK: i64x2.laneselect # encoding: [0xfd,0xd3,0x01]
 i64x2.laneselect
 
+# CHECK: i8x16.relaxed_swizzle # encoding: [0xfd,0xa2,0x01]
+i8x16.relaxed_swizzle
+
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -192,6 +192,16 @@
   ret <16 x i8> %v
 }
 
+; CHECK-LABEL: relaxed_swizzle_v16i8:
+; CHECK-NEXT: .functype relaxed_swizzle_v16i8 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: i8x16.relaxed_swizzle $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <16 x i8> @llvm.wasm.relaxed.swizzle(<16 x i8>, <16 x i8>)
+define <16 x i8> @relaxed_swizzle_v16i8(<16 x i8> %x, <16 x i8> %y) {
+  %a = call <16 x i8> @llvm.wasm.relaxed.swizzle(<16 x i8> %x, <16 x i8> %y)
+  ret <16 x i8> %a
+}
+
 ; ==
 ; 8 x i16
 ; ==
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1361,3 +1361,14 @@
 defm "" : SIMDLANESELECT;
 defm "" : SIMDLANESELECT;
 defm "" : SIMDLANESELECT;
+
+
+//===--===//
+// Relaxed swizzle
+//===--===//
+
+defm RELAXED_SWIZZLE :
+  RELAXED_I<(outs V128:$dst), (ins V128:$src, V128:$mask), (outs), (ins),
+ [(set (v16i8 V128:$dst),
+   (int_wasm_relaxed_swizzle (v16i8 V128:$src), (v16i8 V128:$mask)))],
+ "i8x16.relaxed_swizzle\t$dst, $src, $mask", "i8x16.relaxed_swizzle", 162>;
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -200,6 +200,11 @@
 [LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>],
 [IntrNoMem, IntrSpeculatable]>;
 
+def int_wasm_relaxed_swizzle :
+  Intrinsic<[llvm_v16i8_ty],
+[llvm_v16i8_ty, llvm_v16i8_ty],
+[IntrNoMem, IntrSpeculatable]>;
+
 //===--===//
 // Thread-local storage intrinsics
 //===--===//
Index: clang/test/CodeGen/builtins-wasm.c
===
--- clang/test/CodeGen/builtins-wasm.c
+++ clang/test/CodeGen/builtins-wasm.c
@@ -732,3 +732,8 @@
   // WEBASSEMBLY-SAME: <2 x i64> %a, <2 x i64> %b, <2 x i64> %c)
   // WEBASSEMBLY-NEXT: ret
 }
+
+i8x16 relaxed_swizzle_i8x16(i8x16 x, i8x16 y) {
+  return __builtin_wasm_relaxed_swizzle_i8x16(x, y);
+  // WEBASSEMBLY: call <16 x i8> @llvm.wasm.relaxed.swizzle(<16 x i8> %x, <16 x i8> %y)
+}
Index: clang/lib/CodeGen/CGBuiltin.cpp
===
--- clang/lib/CodeGen/CGBuiltin.cpp
+++ clang/lib/CodeGen/CGBuiltin.cpp
@@ -18319,6 +18319,12 @@
 CGM.getIntrinsic(Intrinsic::wasm_laneselect, A->getType());
 return Builder.CreateCall(Callee, {A, B, C});
   }
+  case WebAssembly::BI__builtin_wasm_relaxed_swizzle_i8x16: {
+Value *Src = EmitScalarExpr(E->getArg(0));
+Value *Indices = EmitScalarExpr(E->getArg(1));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_relaxed_swizzle);
+return Builder.CreateCall(Callee, {Src, Indices});
+  }
   default:
 return nullptr;
   }
Index: clang/include/clang/Basic/BuiltinsWebAssembly.def
===
--- clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -172,5 +172,7 @@

[PATCH] D112022: [WebAssembly] Add prototype relaxed swizzle instructions

2021-10-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

Thanks! I'll go ahead and land this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112022/new/

https://reviews.llvm.org/D112022

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D112022: [WebAssembly] Add prototype relaxed swizzle instructions

2021-10-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.

Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td:1365
+
+def wasm_relaxed_swizzle : SDNode<"WebAssemblyISD::RELAXED_SWIZZLE", 
wasm_swizzle_t>;
+

ngzhian wrote:
> @tlively i'm not 100% sure if this is needed or the right thing to do, i 
> looked at what i8x16.swizzle currently does and just replaced the name and 
> opcode. LMK if this needs to be changed.
`HANDLE_NODETYPE(RELAXED_SWIZZLE)`

and

`def wasm_relaxed_swizzle : SDNode<"WebAssemblyISD::RELAXED_SWIZZLE", 
wasm_swizzle_t>;`

Are both necessary when we do more interesting codegen optimizations from the 
C++, but in this case we want to just directly select the intrinsic, so they 
aren't necessary. For the pattern inside the definition of `RELAXED_SWIZZLE` 
below, you can just use `int_wasm_relaxed_swizzle` rather than defining and 
using the separate `wasm_relaxed_swizzle`. That will let you get rid of the 
separate `Pat` below as well.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112022/new/

https://reviews.llvm.org/D112022

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D111154: [WebAssembly] Implementation of table.get/set for reftypes in LLVM IR

2021-10-17 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

This is looking good! I'll take a more thorough pass through tomorrow so we can 
get this landed.




Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1455
+  const SDValue ,
+  GlobalAddressSDNode **GA,
+  SDValue ) const {

Would it make sense to use `GlobalAddressSDNode *` here to match using a 
reference for the `Idx` out-param? That would slightly simplify the code below 
as well.



Comment at: llvm/test/CodeGen/WebAssembly/funcref-table_call.ll:25-27
+; CHECK-NEXT: i32.const   0
+; CHECK-NEXT: ref.nullfunc
+; CHECK-NEXT: table.set   __funcref_call_table

pmatos wrote:
> tlively wrote:
> > Do you think it would be a good idea to skip setting this table slot back 
> > to null? I know we discussed this before, but I don't remember what the 
> > pros and cons were.
> So the reason to do this was that if we leave the funcref in the slot, this 
> could create hidden GC roots. Comes from this comment by @wingo : 
> https://reviews.llvm.org/D95425#inline-929792
Ah right, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54/new/

https://reviews.llvm.org/D54

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D111914: [WebAssembly] Add prototype relaxed laneselect instructions

2021-10-15 Thread Thomas Lively via Phabricator via cfe-commits

tlively closed this revision.
tlively added a comment.

I forgot to include the revision URL in the commit message, but this has been 
landed as rGda07942834fe 



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111914/new/

https://reviews.llvm.org/D111914

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D111914: [WebAssembly] Add prototype relaxed laneselect instructions

2021-10-15 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

Awesome, thanks! I'll go ahead and land this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111914/new/

https://reviews.llvm.org/D111914

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D110295: [WebAssembly] Add prototype relaxed SIMD fma/fms instructions

2021-09-23 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2f519825ba56: [WebAssembly] Add prototype relaxed SIMD 
fma/fms instructions (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110295/new/

https://reviews.llvm.org/D110295

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -1,4 +1,4 @@
-# RUN: llvm-mc -no-type-check -show-encoding -triple=wasm32-unknown-unknown -mattr=+simd128 < %s | FileCheck %s
+# RUN: llvm-mc -no-type-check -show-encoding -triple=wasm32-unknown-unknown -mattr=+simd128,+relaxed-simd < %s | FileCheck %s
 
 main:
 .functype main () -> ()
@@ -779,4 +779,16 @@
 # CHECK: f64x2.convert_low_i32x4_u # encoding: [0xfd,0xff,0x01]
 f64x2.convert_low_i32x4_u
 
+# CHECK: f32x4.fma # encoding: [0xfd,0xaf,0x01]
+f32x4.fma
+
+# CHECK: f32x4.fms # encoding: [0xfd,0xb0,0x01]
+f32x4.fms
+
+# CHECK: f64x2.fma # encoding: [0xfd,0xcf,0x01]
+f64x2.fma
+
+# CHECK: f64x2.fms # encoding: [0xfd,0xd0,0x01]
+f64x2.fms
+
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -1,5 +1,5 @@
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s --check-prefixes=CHECK,SLOW
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 -fast-isel | FileCheck %s
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128,+relaxed-simd | FileCheck %s --check-prefixes=CHECK,SLOW
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128,+relaxed-simd -fast-isel | FileCheck %s
 
 ; Test that SIMD128 intrinsics lower as expected. These intrinsics are
 ; only expected to lower successfully if the simd128 attribute is
@@ -600,6 +600,30 @@
   ret <4 x float> %v
 }
 
+; CHECK-LABEL: fma_v4f32:
+; CHECK-NEXT: .functype fma_v4f32 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.fma $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.fma.v4f32(<4 x float>, <4 x float>, <4 x float>)
+define <4 x float> @fma_v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c) {
+  %v = call <4 x float> @llvm.wasm.fma.v4f32(
+<4 x float> %a, <4 x float> %b, <4 x float> %c
+  )
+  ret <4 x float> %v
+}
+
+; CHECK-LABEL: fms_v4f32:
+; CHECK-NEXT: .functype fms_v4f32 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.fms $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.fms.v4f32(<4 x float>, <4 x float>, <4 x float>)
+define <4 x float> @fms_v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c) {
+  %v = call <4 x float> @llvm.wasm.fms.v4f32(
+<4 x float> %a, <4 x float> %b, <4 x float> %c
+  )
+  ret <4 x float> %v
+}
+
 ; ==
 ; 2 x f64
 ; ==
@@ -674,3 +698,27 @@
   %v = call <2 x double> @llvm.nearbyint.v2f64(<2 x double> %a)
   ret <2 x double> %v
 }
+
+; CHECK-LABEL: fma_v2f64:
+; CHECK-NEXT: .functype fma_v2f64 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.fma $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.fma.v2f64(<2 x double>, <2 x double>, <2 x double>)
+define <2 x double> @fma_v2f64(<2 x double> %a, <2 x double> %b, <2 x double> %c) {
+  %v = call <2 x double> @llvm.wasm.fma.v2f64(
+<2 x double> %a, <2 x double> %b, <2 x double> %c
+  )
+  ret <2 x double> %v
+}
+
+; CHECK-LABEL: fms_v2f64:
+; CHECK-NEXT: .functype fms_v2f64 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.fms $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.fms.v2f64(<2 x double>, <2 x double>, <2 x double>)
+define <2

[PATCH] D110295: [WebAssembly] Add prototype relaxed SIMD fma/fms instructions

2021-09-23 Thread Thomas Lively via Phabricator via cfe-commits

tlively updated this revision to Diff 374606.
tlively added a comment.

- Fix comment


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110295/new/

https://reviews.llvm.org/D110295

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -1,4 +1,4 @@
-# RUN: llvm-mc -no-type-check -show-encoding -triple=wasm32-unknown-unknown -mattr=+simd128 < %s | FileCheck %s
+# RUN: llvm-mc -no-type-check -show-encoding -triple=wasm32-unknown-unknown -mattr=+simd128,+relaxed-simd < %s | FileCheck %s
 
 main:
 .functype main () -> ()
@@ -779,4 +779,16 @@
 # CHECK: f64x2.convert_low_i32x4_u # encoding: [0xfd,0xff,0x01]
 f64x2.convert_low_i32x4_u
 
+# CHECK: f32x4.fma # encoding: [0xfd,0xaf,0x01]
+f32x4.fma
+
+# CHECK: f32x4.fms # encoding: [0xfd,0xb0,0x01]
+f32x4.fms
+
+# CHECK: f64x2.fma # encoding: [0xfd,0xcf,0x01]
+f64x2.fma
+
+# CHECK: f64x2.fms # encoding: [0xfd,0xd0,0x01]
+f64x2.fms
+
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -1,5 +1,5 @@
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s --check-prefixes=CHECK,SLOW
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 -fast-isel | FileCheck %s
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128,+relaxed-simd | FileCheck %s --check-prefixes=CHECK,SLOW
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128,+relaxed-simd -fast-isel | FileCheck %s
 
 ; Test that SIMD128 intrinsics lower as expected. These intrinsics are
 ; only expected to lower successfully if the simd128 attribute is
@@ -600,6 +600,30 @@
   ret <4 x float> %v
 }
 
+; CHECK-LABEL: fma_v4f32:
+; CHECK-NEXT: .functype fma_v4f32 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.fma $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.fma.v4f32(<4 x float>, <4 x float>, <4 x float>)
+define <4 x float> @fma_v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c) {
+  %v = call <4 x float> @llvm.wasm.fma.v4f32(
+<4 x float> %a, <4 x float> %b, <4 x float> %c
+  )
+  ret <4 x float> %v
+}
+
+; CHECK-LABEL: fms_v4f32:
+; CHECK-NEXT: .functype fms_v4f32 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.fms $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.fms.v4f32(<4 x float>, <4 x float>, <4 x float>)
+define <4 x float> @fms_v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c) {
+  %v = call <4 x float> @llvm.wasm.fms.v4f32(
+<4 x float> %a, <4 x float> %b, <4 x float> %c
+  )
+  ret <4 x float> %v
+}
+
 ; ==
 ; 2 x f64
 ; ==
@@ -674,3 +698,27 @@
   %v = call <2 x double> @llvm.nearbyint.v2f64(<2 x double> %a)
   ret <2 x double> %v
 }
+
+; CHECK-LABEL: fma_v2f64:
+; CHECK-NEXT: .functype fma_v2f64 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.fma $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.fma.v2f64(<2 x double>, <2 x double>, <2 x double>)
+define <2 x double> @fma_v2f64(<2 x double> %a, <2 x double> %b, <2 x double> %c) {
+  %v = call <2 x double> @llvm.wasm.fma.v2f64(
+<2 x double> %a, <2 x double> %b, <2 x double> %c
+  )
+  ret <2 x double> %v
+}
+
+; CHECK-LABEL: fms_v2f64:
+; CHECK-NEXT: .functype fms_v2f64 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.fms $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.fms.v2f64(<2 x double>, <2 x double>, <2 x double>)
+define <2 x double> @fms_v2f64(<2 x double> %a, <2 x double> %b, <2 x double> %c) {
+  %v = call <2 x double> @llvm.wasm.fms.v2f64(
+<2 x double> %a, <2 x double> %b,

[PATCH] D110295: [WebAssembly] Add prototype relaxed SIMD fma/fms instructions

2021-09-22 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, ngzhian.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100, dschuff.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Add experimental clang builtins, LLVM intrinsics, and backend definitions for
the new {f32x4,f64x2}.{fma,fms} instructions in the relaxed SIMD proposal:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.
Do not allow these instructions to be selected without explicit user opt-in.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D110295

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/MC/WebAssembly/simd-encodings.s

Index: llvm/test/MC/WebAssembly/simd-encodings.s
===
--- llvm/test/MC/WebAssembly/simd-encodings.s
+++ llvm/test/MC/WebAssembly/simd-encodings.s
@@ -1,4 +1,4 @@
-# RUN: llvm-mc -no-type-check -show-encoding -triple=wasm32-unknown-unknown -mattr=+simd128 < %s | FileCheck %s
+# RUN: llvm-mc -no-type-check -show-encoding -triple=wasm32-unknown-unknown -mattr=+simd128,+relaxed-simd < %s | FileCheck %s
 
 main:
 .functype main () -> ()
@@ -779,4 +779,16 @@
 # CHECK: f64x2.convert_low_i32x4_u # encoding: [0xfd,0xff,0x01]
 f64x2.convert_low_i32x4_u
 
+# CHECK: f32x4.fma # encoding: [0xfd,0xaf,0x01]
+f32x4.fma
+
+# CHECK: f32x4.fms # encoding: [0xfd,0xb0,0x01]
+f32x4.fms
+
+# CHECK: f64x2.fma # encoding: [0xfd,0xcf,0x01]
+f64x2.fma
+
+# CHECK: f64x2.fms # encoding: [0xfd,0xd0,0x01]
+f64x2.fms
+
 end_function
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -1,5 +1,5 @@
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s --check-prefixes=CHECK,SLOW
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 -fast-isel | FileCheck %s
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128,+relaxed-simd | FileCheck %s --check-prefixes=CHECK,SLOW
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128,+relaxed-simd -fast-isel | FileCheck %s
 
 ; Test that SIMD128 intrinsics lower as expected. These intrinsics are
 ; only expected to lower successfully if the simd128 attribute is
@@ -600,6 +600,30 @@
   ret <4 x float> %v
 }
 
+; CHECK-LABEL: fma_v4f32:
+; CHECK-NEXT: .functype fma_v4f32 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.fma $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.fma.v4f32(<4 x float>, <4 x float>, <4 x float>)
+define <4 x float> @fma_v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c) {
+  %v = call <4 x float> @llvm.wasm.fma.v4f32(
+<4 x float> %a, <4 x float> %b, <4 x float> %c
+  )
+  ret <4 x float> %v
+}
+
+; CHECK-LABEL: fms_v4f32:
+; CHECK-NEXT: .functype fms_v4f32 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.fms $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.fms.v4f32(<4 x float>, <4 x float>, <4 x float>)
+define <4 x float> @fms_v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c) {
+  %v = call <4 x float> @llvm.wasm.fms.v4f32(
+<4 x float> %a, <4 x float> %b, <4 x float> %c
+  )
+  ret <4 x float> %v
+}
+
 ; ==
 ; 2 x f64
 ; ==
@@ -674,3 +698,27 @@
   %v = call <2 x double> @llvm.nearbyint.v2f64(<2 x double> %a)
   ret <2 x double> %v
 }
+
+; CHECK-LABEL: fma_v2f64:
+; CHECK-NEXT: .functype fma_v2f64 (v128, v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.fma $push[[R:[0-9]+]]=, $0, $1, $2{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.fma.v2f64(<2 x double>, <2 x double>, <2 x double>)
+define <2 x double> @fma_v2f64(<2 x double> %a, <2 x double> %b, <2 x double> %c) {
+  %v = call <2 x double> @llvm.wasm.fma.v2f64(
+<2 x double> %a, <2 x double> %b, <2 x double> %c
+  )
+  ret

[PATCH] D110111: [WebAssembly] Add relaxed-simd feature

2021-09-22 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1552179ac019: [WebAssembly] Add relaxed-simd feature 
(authored by ngzhian, committed by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110111/new/

https://reviews.llvm.org/D110111

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Basic/Targets/WebAssembly.cpp
  clang/lib/Basic/Targets/WebAssembly.h
  clang/test/Preprocessor/wasm-target-features.c
  llvm/lib/Target/WebAssembly/WebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h

Index: llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h
===
--- llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h
+++ llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h
@@ -36,6 +36,7 @@
   enum SIMDEnum {
 NoSIMD,
 SIMD128,
+RelaxedSIMD,
   } SIMDLevel = NoSIMD;
 
   bool HasAtomics = false;
@@ -89,6 +90,7 @@
   // Predicates used by WebAssemblyInstrInfo.td.
   bool hasAddr64() const { return TargetTriple.isArch64Bit(); }
   bool hasSIMD128() const { return SIMDLevel >= SIMD128; }
+  bool hasRelaxedSIMD() const { return SIMDLevel >= RelaxedSIMD; }
   bool hasAtomics() const { return HasAtomics; }
   bool hasNontrappingFPToInt() const { return HasNontrappingFPToInt; }
   bool hasSignExt() const { return HasSignExt; }
Index: llvm/lib/Target/WebAssembly/WebAssembly.td
===
--- llvm/lib/Target/WebAssembly/WebAssembly.td
+++ llvm/lib/Target/WebAssembly/WebAssembly.td
@@ -25,6 +25,9 @@
 def FeatureSIMD128 : SubtargetFeature<"simd128", "SIMDLevel", "SIMD128",
   "Enable 128-bit SIMD">;
 
+def FeatureRelaxedSIMD : SubtargetFeature<"relaxed-simd", "SIMDLevel", "RelaxedSIMD",
+  "Enable relaxed-simd instructions">;
+
 def FeatureAtomics : SubtargetFeature<"atomics", "HasAtomics", "true",
   "Enable Atomics">;
 
Index: clang/test/Preprocessor/wasm-target-features.c
===
--- clang/test/Preprocessor/wasm-target-features.c
+++ clang/test/Preprocessor/wasm-target-features.c
@@ -7,6 +7,15 @@
 //
 // SIMD128:#define __wasm_simd128__ 1{{$}}
 
+// RUN: %clang -E -dM %s -o - 2>&1 \
+// RUN: -target wasm32-unknown-unknown -mrelaxed-simd \
+// RUN:   | FileCheck %s -check-prefix=RELAXED-SIMD
+// RUN: %clang -E -dM %s -o - 2>&1 \
+// RUN: -target wasm64-unknown-unknown -mrelaxed-simd \
+// RUN:   | FileCheck %s -check-prefix=RELAXED-SIMD
+//
+// RELAXED-SIMD:#define __wasm_relaxed_simd__ 1{{$}}
+
 // RUN: %clang -E -dM %s -o - 2>&1 \
 // RUN: -target wasm32-unknown-unknown -mnontrapping-fptoint \
 // RUN:   | FileCheck %s -check-prefix=NONTRAPPING-FPTOINT
Index: clang/lib/Basic/Targets/WebAssembly.h
===
--- clang/lib/Basic/Targets/WebAssembly.h
+++ clang/lib/Basic/Targets/WebAssembly.h
@@ -27,6 +27,7 @@
   enum SIMDEnum {
 NoSIMD,
 SIMD128,
+RelaxedSIMD,
   } SIMDLevel = NoSIMD;
 
   bool HasNontrappingFPToInt = false;
Index: clang/lib/Basic/Targets/WebAssembly.cpp
===
--- clang/lib/Basic/Targets/WebAssembly.cpp
+++ clang/lib/Basic/Targets/WebAssembly.cpp
@@ -46,6 +46,7 @@
 bool WebAssemblyTargetInfo::hasFeature(StringRef Feature) const {
   return llvm::StringSwitch(Feature)
   .Case("simd128", SIMDLevel >= SIMD128)
+  .Case("relaxed-simd", SIMDLevel >= RelaxedSIMD)
   .Case("nontrapping-fptoint", HasNontrappingFPToInt)
   .Case("sign-ext", HasSignExt)
   .Case("exception-handling", HasExceptionHandling)
@@ -72,6 +73,8 @@
   defineCPUMacros(Builder, "wasm", /*Tuning=*/false);
   if (SIMDLevel >= SIMD128)
 Builder.defineMacro("__wasm_simd128__");
+  if (SIMDLevel >= RelaxedSIMD)
+Builder.defineMacro("__wasm_relaxed_simd__");
   if (HasNontrappingFPToInt)
 Builder.defineMacro("__wasm_nontrapping_fptoint__");
   if (HasSignExt)
@@ -96,6 +99,9 @@
  SIMDEnum Level, bool Enabled) {
   if (Enabled) {
 switch (Level) {
+case RelaxedSIMD:
+  Features["relaxed-simd"] = true;
+  LLVM_FALLTHROUGH;
 case SIMD128:
   Features["simd128"] = true;
   LLVM_FALLTHROUGH;
@@ -109,6 +115,9 @@
   case NoSIMD:
   case SIMD128:
 Features["simd128"] = false;
+LLVM_FALLTHROUGH;
+  case RelaxedSIMD:
+Features["relaxed-simd"] = false;
 break;
   }
 }
@@ -118,6 +127,8 @@
   bool Enabled) const {
   if (Name == "simd128")
 setSIMDLevel(Features, SIMD128, Enabled);
+  else if (Name == "relaxed-simd")
+setSIMDLevel(Features, RelaxedSIMD, Enabled);
   else
 Features[Name] =

[PATCH] D110111: [WebAssembly] Add relaxed-simd feature

2021-09-22 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

Great, thanks! I will take care of landing this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110111/new/

https://reviews.llvm.org/D110111

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D110111: [WebAssembly] Add relaxed-simd feature

2021-09-21 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

Nice! Thanks for writing this :D Do you know what happens when you actually try 
to compile some code with `-mrelaxed-simd`? I'm concerned that it will throw an 
error because the "relaxed-simd" target feature has not yet been defined in the 
backend (specifically in WebAssembly.td and in WebAssemblySubtarget.{h,cpp}).




Comment at: clang/lib/Basic/Targets/WebAssembly.cpp:105-107
+case RelaxedSIMD:
+  Features["relaxed-simd"] = true;
+  LLVM_FALLTHROUGH;

I believe this should be above the `SIMD128` case so that when `RelaxedSIMD` is 
enabled it falls through and marks `SIMD128` as enabled as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110111/new/

https://reviews.llvm.org/D110111

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D108464: [clang][CodeGen] Refactor CreateTempAlloca function nest. NFC.

2021-08-23 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

In D108464#2960623 , @rjmccall wrote:

> + JF, who knows something about Web Assembly, or can at least drag in the 
> right people
>
> In D108464#2959591 , @wingo wrote:
>
>> In D108464#2957276 , @wingo wrote:
>>
>>> So... besides the refactor, this is getting closer to where I'm going 
>>> in https://lists.llvm.org/pipermail/cfe-dev/2021-July/068559.html, though 
>>> still NFC.  I think you can see where I would replace 
>>> `getASTAllocaAddressSpace` with `getAllocaAddressSpace(QualType Ty)`, and 
>>> possibly (depending on the source language) avoid casting the resulting 
>>> alloca to `LangAS::Default`.  WDYT, is this sort of thing OK?
>>
>> Taking this patch as perhaps a better generic discussion point, @rjmccall 
>> graciously gave some general feedback on this approach (thank you!!!):
>>
>> In D108360#2957844 , @rjmccall 
>> wrote:
>>
>>> I'm not sure that I agree with your overall plan, though:
>>>
>>> - The WebAssembly operand stack is not a good match for an address space at 
>>> the language level because it's not addressable at all.  If you can't 
>>> meaningfully have a pointer into the address space, then you don't really 
>>> need this in the type system; it's more like a declaration modifier at best.
>>> - Allocating local variables on the operand stack ought to be a very 
>>> straightforward analysis in the backend.  There's not much optimization 
>>> value in trying to do it in the frontend, and it's going to be problematic 
>>> for things like coroutine lowering.
>>> - The security argument seems pretty weak, not because security isn't 
>>> important but because this is not really an adequate basis for getting the 
>>> tamper-proof guarantee you want.  For example, LLVM passes can and do 
>>> introduce its own allocas and store scalars into them sometimes.  Really 
>>> you need some sort of "tamper-proof" *type* which the compiler can make an 
>>> end-to-end guarantee of non-tamper-ability for the values of, and while 
>>> optimally this would be implemented by just keeping values on the operand 
>>> stack, in the general case you will need to have some sort of strategy for 
>>> keeping things in memory.
>>
>> Thanks for thinking about this!  Indeed I started out with the goal of not 
>> going deep into clang and if it's possible to avoid going too deeply, that 
>> would be better for everyone involved.  I am starting to think however that 
>> it may be unavoidable for me at least.
>>
>> So, I am focusing on WebAssembly global and local variables; the WebAssembly 
>> operand stack is an artifact of the IR-to-MC lowering and AFAICS doesn't 
>> have any bearing on what clang does -- though perhaps I am misunderstanding 
>> what you are getting at here.  The issue is not to allocate locals on the 
>> operand stack, but rather to allocate them as part of the "locals" of a 
>> WebAssembly function 
>> .  Cc 
>> @tlively on the WebAssembly side.
>
> By "operand stack" I mean the innate, unaddressable stack that the 
> WebAssembly VM maintains in order to make functions reentrant.  I don't know 
> what term the VM spec uses for it, but I believe "operand stack" is widely 
> accepted terminology for the unaddressable stack when you've got this kind of 
> dual-stack setup.  And yes, VM "locals" would go there.

@wingo, are there cases where it is useful to declare variables as living in 
WebAssembly locals and not in the VM stack? I'm having trouble coming up with a 
case where leaving that up to the backend is not enough. We clearly need a way 
to prevent values from being written to main memory (AS 0), but it's not clear 
to me that we need a way to specifically allocate locals for them.

>> The main motivator is the ability to have "reference type" 
>>  
>> (`externref`/`funcref`) locals and globals 
>>  at 
>> all.  Reference-typed values can't be stored to linear memory.  They have no 
>> size and no byte representation -- they are opaque values from the host.  
>> However, WebAssembly locals and globals can define storage locations of type 
>> `externref` or `funcref`.
>
> I see.  I think you need to think carefully about the best way to represent 
> values of these types in LLVM IR, because it probably cannot just be "treat 
> them as a normal value, emit code a certain way that we know how to lower, 
> and hope nothing goes wrong".  It seems to me that you probably need a new IR 
> type for it, since normal types aren't restricted from memory and tokens 
> can't be used as parameters or return values.
>
> Hopefully, someone had a plan for this when they introduced that WebAssembly

[PATCH] D108387: [WebAssembly] Restore builtins and intrinsics for pmin/pmax

2021-08-20 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG88962cea4680: [WebAssembly] Restore builtins and intrinsics 
for pmin/pmax (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108387/new/

https://reviews.llvm.org/D108387

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -540,6 +540,26 @@
   ret <4 x float> %a
 }
 
+; CHECK-LABEL: pmin_v4f32:
+; CHECK-NEXT: .functype pmin_v4f32 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.pmin.v4f32(<4 x float>, <4 x float>)
+define <4 x float> @pmin_v4f32(<4 x float> %a, <4 x float> %b) {
+  %v = call <4 x float> @llvm.wasm.pmin.v4f32(<4 x float> %a, <4 x float> %b)
+  ret <4 x float> %v
+}
+
+; CHECK-LABEL: pmax_v4f32:
+; CHECK-NEXT: .functype pmax_v4f32 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.pmax.v4f32(<4 x float>, <4 x float>)
+define <4 x float> @pmax_v4f32(<4 x float> %a, <4 x float> %b) {
+  %v = call <4 x float> @llvm.wasm.pmax.v4f32(<4 x float> %a, <4 x float> %b)
+  ret <4 x float> %v
+}
+
 ; CHECK-LABEL: ceil_v4f32:
 ; CHECK-NEXT: .functype ceil_v4f32 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f32x4.ceil $push[[R:[0-9]+]]=, $0{{$}}
@@ -595,6 +615,26 @@
   ret <2 x double> %a
 }
 
+; CHECK-LABEL: pmin_v2f64:
+; CHECK-NEXT: .functype pmin_v2f64 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.pmin.v2f64(<2 x double>, <2 x double>)
+define <2 x double> @pmin_v2f64(<2 x double> %a, <2 x double> %b) {
+  %v = call <2 x double> @llvm.wasm.pmin.v2f64(<2 x double> %a, <2 x double> %b)
+  ret <2 x double> %v
+}
+
+; CHECK-LABEL: pmax_v2f64:
+; CHECK-NEXT: .functype pmax_v2f64 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.pmax.v2f64(<2 x double>, <2 x double>)
+define <2 x double> @pmax_v2f64(<2 x double> %a, <2 x double> %b) {
+  %v = call <2 x double> @llvm.wasm.pmax.v2f64(<2 x double> %a, <2 x double> %b)
+  ret <2 x double> %v
+}
+
 ; CHECK-LABEL: ceil_v2f64:
 ; CHECK-NEXT: .functype ceil_v2f64 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f64x2.ceil $push[[R:[0-9]+]]=, $0{{$}}
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1175,6 +1175,16 @@
   (pmax $lhs, $rhs)>;
 }
 
+// And match the pmin/pmax LLVM intrinsics as well
+def : Pat<(v4f32 (int_wasm_pmin (v4f32 V128:$lhs), (v4f32 V128:$rhs))),
+  (PMIN_F32x4 V128:$lhs, V128:$rhs)>;
+def : Pat<(v4f32 (int_wasm_pmax (v4f32 V128:$lhs), (v4f32 V128:$rhs))),
+  (PMAX_F32x4 V128:$lhs, V128:$rhs)>;
+def : Pat<(v2f64 (int_wasm_pmin (v2f64 V128:$lhs), (v2f64 V128:$rhs))),
+  (PMIN_F64x2 V128:$lhs, V128:$rhs)>;
+def : Pat<(v2f64 (int_wasm_pmax (v2f64 V128:$lhs), (v2f64 V128:$rhs))),
+  (PMAX_F64x2 V128:$lhs, V128:$rhs)>;
+
 //===--===//
 // Conversions
 //===--===//
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -164,6 +164,15 @@
 [llvm_v8i16_ty, llvm_v8i16_ty],
 [IntrNoMem, IntrSpeculatable]>;
 
+def int_wasm_pmin :
+  Intrinsic<[llvm_anyvector_ty],
+[LLVMMatchType<0>, LLVMMatchType<0>],
+[IntrNoMem, IntrSpeculatable]>;
+def int_wasm_pmax :
+  Intrinsic<[llvm_anyvector_ty],
+[LLVMMatchType<0>, LLVMMatchType<0>],
+[IntrNoMem, IntrSpeculatable]>;
+
 def int_wasm_extadd_pairwise_signed :
   Intrinsic<[llvm_anyvector_ty],
 [LLVMSubdivide2VectorType<0>],
Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -2424,11 +2424,11 @@
 
 // CHECK-LABEL:

[PATCH] D108415: [WebAssembly] Make shift values unsigned in wasm_simd128.h

2021-08-20 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG64a9957bf7b6: [WebAssembly] Make shift values unsigned in 
wasm_simd128.h (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108415/new/

https://reviews.llvm.org/D108415

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -1603,7 +1603,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i8x16_shl(v128_t a, int32_t b) {
+v128_t test_i8x16_shl(v128_t a, uint32_t b) {
   return wasm_i8x16_shl(a, b);
 }
 
@@ -1617,7 +1617,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i8x16_shr(v128_t a, int32_t b) {
+v128_t test_i8x16_shr(v128_t a, uint32_t b) {
   return wasm_i8x16_shr(a, b);
 }
 
@@ -1631,7 +1631,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_u8x16_shr(v128_t a, int32_t b) {
+v128_t test_u8x16_shr(v128_t a, uint32_t b) {
   return wasm_u8x16_shr(a, b);
 }
 
@@ -1824,7 +1824,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i16x8_shl(v128_t a, int32_t b) {
+v128_t test_i16x8_shl(v128_t a, uint32_t b) {
   return wasm_i16x8_shl(a, b);
 }
 
@@ -1838,7 +1838,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i16x8_shr(v128_t a, int32_t b) {
+v128_t test_i16x8_shr(v128_t a, uint32_t b) {
   return wasm_i16x8_shr(a, b);
 }
 
@@ -1852,7 +1852,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_u16x8_shr(v128_t a, int32_t b) {
+v128_t test_u16x8_shr(v128_t a, uint32_t b) {
   return wasm_u16x8_shr(a, b);
 }
 
@@ -2048,7 +2048,7 @@
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHL_I]]
 //
-v128_t test_i32x4_shl(v128_t a, int32_t b) {
+v128_t test_i32x4_shl(v128_t a, uint32_t b) {
   return wasm_i32x4_shl(a, b);
 }
 
@@ -2059,7 +2059,7 @@
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHR_I]]
 //
-v128_t test_i32x4_shr(v128_t a, int32_t b) {
+v128_t test_i32x4_shr(v128_t a, uint32_t b) {
   return wasm_i32x4_shr(a, b);
 }
 
@@ -2070,7 +2070,7 @@
 // CHECK-NEXT:[[SHR_I:%.*]] = lshr <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHR_I]]
 //
-v128_t test_u32x4_shr(v128_t a, int32_t b) {
+v128_t test_u32x4_shr(v128_t a, uint32_t b) {
   return wasm_u32x4_shr(a, b);
 }
 
@@ -2198,42 +2198,42 @@
 // CHECK-LABEL: @test_i64x2_shl(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <2 x i64>
-// CHECK-NEXT:[[CONV_I:%.*]] = sext i32 [[B:%.*]] to i64
+// CHECK-NEXT:[[CONV_I:%.*]] = zext i32 [[B:%.*]] to i64
 // CHECK-NEXT:[[SPLAT_SPLATINSERT_I:%.*]] = insertelement <2 x i64> poison, i64 [[CONV_I]], i32 0
 // CHECK-NEXT:[[SPLAT_SPLAT_I:%.*]] = shufflevector <2 x i64> [[SPLAT_SPLATINSERT_I]], <2 x i64> poison, <2 x i32> zeroinitializer
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <2 x i64> [[TMP0]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x i64> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
-v128_t test_i64x2_shl(v128_t a, int32_t b) {
+v128_t test_i64x2_shl(v128_t a, uint32_t b) {
   return wasm_i64x2_shl(a, b);
 }
 
 // CHECK-LABEL: @test_i64x2_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <2 x i64>
-// CHECK-NEXT:[[CONV_I:%.*]] = sext i32 [[B:%.*]] to i64
+// CHECK-NEXT:[[CONV_I:%.*]] = zext i32 [[B:%.*]] to i64
 // CHECK-NEXT:[[SPLAT_SPLATINSERT_I:%.*]] = insertelement <2 x i64> poison, i64 [[CONV_I]], i32 0
 // CHECK-NEXT:[[SPLAT_SPLAT_I:%.*]] = shufflevector <2 x i64> [[SPLAT_SPLATINSERT_I]], <2 x i64> poison, <2 x i32> zeroinitializer
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <2 x i64> [[TMP0]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x i64> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
-v128_t test_i64x2_shr(v128_t a, int32_t b) {
+v128_t test_i64x2_shr(v128_t a, uint32_t b) {
   return wasm_i64x2_shr(a, b);
 }
 
 // CHECK-LABEL: @test_u64x2_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <2 x i64>
-// CHECK-NEXT:[[CONV_I:%.*]] = sext i32 [[B:%.*]] to i64
+// CHECK-NEXT:[[CONV_I:%.*]] = zext i32 [[B:%.*]] to i64
 // CHECK-NEXT:

[PATCH] D108412: [WebAssembly] Add SIMD intrinsics using unsigned integers

2021-08-20 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2456e11614c1: [WebAssembly] Add SIMD intrinsics using 
unsigned integers (authored by tlively).

Changed prior to commit:
  https://reviews.llvm.org/D108412?vs=367622=367798#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108412/new/

https://reviews.llvm.org/D108412

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -3,7 +3,7 @@
 
 // FIXME: This should not be using -O2 and implicitly testing the entire IR opt pipeline.
 
-// RUN: %clang %s -O2 -emit-llvm -S -o - -target wasm32-unknown-unknown -msimd128 -Wcast-qual -fno-lax-vector-conversions -Werror | FileCheck %s
+// RUN: %clang %s -O2 -emit-llvm -S -o - -target wasm32-unknown-unknown -msimd128 -Wall -Weverything -Wno-missing-prototypes -fno-lax-vector-conversions -Werror | FileCheck %s
 
 #include 
 
@@ -213,7 +213,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store(void *mem, v128_t a) {
-  return wasm_v128_store(mem, a);
+  wasm_v128_store(mem, a);
 }
 
 // CHECK-LABEL: @test_v128_store8_lane(
@@ -224,7 +224,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store8_lane(uint8_t *ptr, v128_t vec) {
-  return wasm_v128_store8_lane(ptr, vec, 15);
+  wasm_v128_store8_lane(ptr, vec, 15);
 }
 
 // CHECK-LABEL: @test_v128_store16_lane(
@@ -235,7 +235,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store16_lane(uint16_t *ptr, v128_t vec) {
-  return wasm_v128_store16_lane(ptr, vec, 7);
+  wasm_v128_store16_lane(ptr, vec, 7);
 }
 
 // CHECK-LABEL: @test_v128_store32_lane(
@@ -245,7 +245,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store32_lane(uint32_t *ptr, v128_t vec) {
-  return wasm_v128_store32_lane(ptr, vec, 3);
+  wasm_v128_store32_lane(ptr, vec, 3);
 }
 
 // CHECK-LABEL: @test_v128_store64_lane(
@@ -256,7 +256,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store64_lane(uint64_t *ptr, v128_t vec) {
-  return wasm_v128_store64_lane(ptr, vec, 1);
+  wasm_v128_store64_lane(ptr, vec, 1);
 }
 
 // CHECK-LABEL: @test_i8x16_make(
@@ -284,6 +284,31 @@
   return wasm_i8x16_make(c0, c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15);
 }
 
+// CHECK-LABEL: @test_u8x16_make(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <16 x i8> undef, i8 [[C0:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT1_I:%.*]] = insertelement <16 x i8> [[VECINIT_I]], i8 [[C1:%.*]], i32 1
+// CHECK-NEXT:[[VECINIT2_I:%.*]] = insertelement <16 x i8> [[VECINIT1_I]], i8 [[C2:%.*]], i32 2
+// CHECK-NEXT:[[VECINIT3_I:%.*]] = insertelement <16 x i8> [[VECINIT2_I]], i8 [[C3:%.*]], i32 3
+// CHECK-NEXT:[[VECINIT4_I:%.*]] = insertelement <16 x i8> [[VECINIT3_I]], i8 [[C4:%.*]], i32 4
+// CHECK-NEXT:[[VECINIT5_I:%.*]] = insertelement <16 x i8> [[VECINIT4_I]], i8 [[C5:%.*]], i32 5
+// CHECK-NEXT:[[VECINIT6_I:%.*]] = insertelement <16 x i8> [[VECINIT5_I]], i8 [[C6:%.*]], i32 6
+// CHECK-NEXT:[[VECINIT7_I:%.*]] = insertelement <16 x i8> [[VECINIT6_I]], i8 [[C7:%.*]], i32 7
+// CHECK-NEXT:[[VECINIT8_I:%.*]] = insertelement <16 x i8> [[VECINIT7_I]], i8 [[C8:%.*]], i32 8
+// CHECK-NEXT:[[VECINIT9_I:%.*]] = insertelement <16 x i8> [[VECINIT8_I]], i8 [[C9:%.*]], i32 9
+// CHECK-NEXT:[[VECINIT10_I:%.*]] = insertelement <16 x i8> [[VECINIT9_I]], i8 [[C10:%.*]], i32 10
+// CHECK-NEXT:[[VECINIT11_I:%.*]] = insertelement <16 x i8> [[VECINIT10_I]], i8 [[C11:%.*]], i32 11
+// CHECK-NEXT:[[VECINIT12_I:%.*]] = insertelement <16 x i8> [[VECINIT11_I]], i8 [[C12:%.*]], i32 12
+// CHECK-NEXT:[[VECINIT13_I:%.*]] = insertelement <16 x i8> [[VECINIT12_I]], i8 [[C13:%.*]], i32 13
+// CHECK-NEXT:[[VECINIT14_I:%.*]] = insertelement <16 x i8> [[VECINIT13_I]], i8 [[C14:%.*]], i32 14
+// CHECK-NEXT:[[VECINIT15_I:%.*]] = insertelement <16 x i8> [[VECINIT14_I]], i8 [[C15:%.*]], i32 15
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <16 x i8> [[VECINIT15_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP0]]
+//
+v128_t test_u8x16_make(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3, uint8_t c4, uint8_t c5, uint8_t c6, uint8_t c7, uint8_t c8, uint8_t c9, uint8_t c10, uint8_t c11, uint8_t c12, uint8_t c13, uint8_t c14, uint8_t c15) {
+  return wasm_u8x16_make(c0, c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15);
+}
+
 // CHECK-LABEL: @test_i16x8_make(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <8 x i16> undef, i16 [[C0:%.*]], i32 0
@@ -301,6 +326,23 @@
   return wasm_i16x8_make(c0, c1, c2, c3, c4, c5, c6, c7);
 }
 
+// CHECK-LABEL: @test_u16x8_make(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <8 x i16> undef, i16 [[C0:%.*]], i32 0
+// CHECK-NEXT:

[PATCH] D108401: [WebAssembly] Make bitmask instructions return unsigned ints

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGfd3bd63df26a: [WebAssembly] Make bitmask instructions return 
unsigned ints (authored by tlively).

Changed prior to commit:
  https://reviews.llvm.org/D108401?vs=367591=367653#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108401/new/

https://reviews.llvm.org/D108401

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c


Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -1345,7 +1345,7 @@
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v16i8(<16 x 
i8> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i8x16_bitmask(v128_t a) {
+uint32_t test_i8x16_bitmask(v128_t a) {
   return wasm_i8x16_bitmask(a);
 }
 
@@ -1577,7 +1577,7 @@
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v8i16(<8 x 
i16> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i16x8_bitmask(v128_t a) {
+uint32_t test_i16x8_bitmask(v128_t a) {
   return wasm_i16x8_bitmask(a);
 }
 
@@ -1804,7 +1804,7 @@
 // CHECK-NEXT:[[TMP0:%.*]] = tail call i32 @llvm.wasm.bitmask.v4i32(<4 x 
i32> [[A:%.*]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP0]]
 //
-int32_t test_i32x4_bitmask(v128_t a) {
+uint32_t test_i32x4_bitmask(v128_t a) {
   return wasm_i32x4_bitmask(a);
 }
 
@@ -1958,7 +1958,7 @@
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v2i64(<2 x 
i64> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i64x2_bitmask(v128_t a) {
+uint32_t test_i64x2_bitmask(v128_t a) {
   return wasm_i64x2_bitmask(a);
 }
 
Index: clang/lib/Headers/wasm_simd128.h
===
--- clang/lib/Headers/wasm_simd128.h
+++ clang/lib/Headers/wasm_simd128.h
@@ -804,7 +804,7 @@
   return __builtin_wasm_all_true_i8x16((__i8x16)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i8x16((__i8x16)__a);
 }
 
@@ -894,7 +894,7 @@
   return __builtin_wasm_all_true_i16x8((__i16x8)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i16x8((__i16x8)__a);
 }
 
@@ -985,7 +985,7 @@
   return __builtin_wasm_all_true_i32x4((__i32x4)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i32x4((__i32x4)__a);
 }
 
@@ -1056,7 +1056,7 @@
   return __builtin_wasm_all_true_i64x2((__i64x2)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i64x2((__i64x2)__a);
 }
 
Index: clang/include/clang/Basic/BuiltinsWebAssembly.def
===
--- clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -119,10 +119,10 @@
 TARGET_BUILTIN(__builtin_wasm_all_true_i32x4, "iV4i", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_all_true_i64x2, "iV2LLi", "nc", "simd128")
 
-TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "iV16Sc", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "iV8s", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "iV4i", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "iV2LLi", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "UiV16Sc", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "UiV8s", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "UiV4i", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "UiV2LLi", "nc", "simd128")
 
 TARGET_BUILTIN(__builtin_wasm_abs_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_abs_f64x2, "V2dV2d", "nc", "simd128")


Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -1345,7 +1345,7 @@
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v16i8(<16 x i8> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i8x16_bitmask(v128_t a) {
+uint32_t test_i8x16_bitmask(v128_t a) {
   return wasm_i8x16_bitmask(a);
 }
 
@@ -1577,7 +1577,7 @@
 // CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.wasm.bitmask.v8i16(<8 x i16> [[TMP0]]) #[[ATTR6]]
 // CHECK-NEXT:ret i32 [[TMP1]]
 //
-int32_t test_i16x8_bitmask(v128_t a) {

[PATCH] D108401: [WebAssembly] Make bitmask instructions return unsigned ints

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

Thanks! Will move the relevant test changes up from 
https://reviews.llvm.org/D108412 to here before landing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108401/new/

https://reviews.llvm.org/D108401

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D108412: [WebAssembly] Add SIMD intrinsics using unsigned integers

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

In D108412#2955996 , @craig.topper 
wrote:

> Did you read this twitter thread too or just coincidence? 
> https://twitter.com/rygorous/status/1428207170403725316?s=20

Yes I did :D


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108412/new/

https://reviews.llvm.org/D108412

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D108415: [WebAssembly] Make shift values unsigned in wasm_simd128.h

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

On some platforms, negative shift values mean to shift in the opposite
direction, but this is not true with WebAssembly. To avoid confusion, make the
shift values in the shift intrinsics unsigned.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108415

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -1603,7 +1603,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i8x16_shl(v128_t a, int32_t b) {
+v128_t test_i8x16_shl(v128_t a, uint32_t b) {
   return wasm_i8x16_shl(a, b);
 }
 
@@ -1617,7 +1617,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i8x16_shr(v128_t a, int32_t b) {
+v128_t test_i8x16_shr(v128_t a, uint32_t b) {
   return wasm_i8x16_shr(a, b);
 }
 
@@ -1631,7 +1631,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_u8x16_shr(v128_t a, int32_t b) {
+v128_t test_u8x16_shr(v128_t a, uint32_t b) {
   return wasm_u8x16_shr(a, b);
 }
 
@@ -1824,7 +1824,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i16x8_shl(v128_t a, int32_t b) {
+v128_t test_i16x8_shl(v128_t a, uint32_t b) {
   return wasm_i16x8_shl(a, b);
 }
 
@@ -1838,7 +1838,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_i16x8_shr(v128_t a, int32_t b) {
+v128_t test_i16x8_shr(v128_t a, uint32_t b) {
   return wasm_i16x8_shr(a, b);
 }
 
@@ -1852,7 +1852,7 @@
 // CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
-v128_t test_u16x8_shr(v128_t a, int32_t b) {
+v128_t test_u16x8_shr(v128_t a, uint32_t b) {
   return wasm_u16x8_shr(a, b);
 }
 
@@ -2048,7 +2048,7 @@
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHL_I]]
 //
-v128_t test_i32x4_shl(v128_t a, int32_t b) {
+v128_t test_i32x4_shl(v128_t a, uint32_t b) {
   return wasm_i32x4_shl(a, b);
 }
 
@@ -2059,7 +2059,7 @@
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHR_I]]
 //
-v128_t test_i32x4_shr(v128_t a, int32_t b) {
+v128_t test_i32x4_shr(v128_t a, uint32_t b) {
   return wasm_i32x4_shr(a, b);
 }
 
@@ -2070,7 +2070,7 @@
 // CHECK-NEXT:[[SHR_I:%.*]] = lshr <4 x i32> [[A:%.*]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:ret <4 x i32> [[SHR_I]]
 //
-v128_t test_u32x4_shr(v128_t a, int32_t b) {
+v128_t test_u32x4_shr(v128_t a, uint32_t b) {
   return wasm_u32x4_shr(a, b);
 }
 
@@ -2198,42 +2198,42 @@
 // CHECK-LABEL: @test_i64x2_shl(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <2 x i64>
-// CHECK-NEXT:[[CONV_I:%.*]] = sext i32 [[B:%.*]] to i64
+// CHECK-NEXT:[[CONV_I:%.*]] = zext i32 [[B:%.*]] to i64
 // CHECK-NEXT:[[SPLAT_SPLATINSERT_I:%.*]] = insertelement <2 x i64> poison, i64 [[CONV_I]], i32 0
 // CHECK-NEXT:[[SPLAT_SPLAT_I:%.*]] = shufflevector <2 x i64> [[SPLAT_SPLATINSERT_I]], <2 x i64> poison, <2 x i32> zeroinitializer
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <2 x i64> [[TMP0]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x i64> [[SHL_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
-v128_t test_i64x2_shl(v128_t a, int32_t b) {
+v128_t test_i64x2_shl(v128_t a, uint32_t b) {
   return wasm_i64x2_shl(a, b);
 }
 
 // CHECK-LABEL: @test_i64x2_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <2 x i64>
-// CHECK-NEXT:[[CONV_I:%.*]] = sext i32 [[B:%.*]] to i64
+// CHECK-NEXT:[[CONV_I:%.*]] = zext i32 [[B:%.*]] to i64
 // CHECK-NEXT:[[SPLAT_SPLATINSERT_I:%.*]] = insertelement <2 x i64> poison, i64 [[CONV_I]], i32 0
 // CHECK-NEXT:[[SPLAT_SPLAT_I:%.*]] = shufflevector <2 x i64> [[SPLAT_SPLATINSERT_I]], <2 x i64> poison, <2 x i32> zeroinitializer
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <2 x i64> [[TMP0]], [[SPLAT_SPLAT_I]]
 // CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x i64> [[SHR_I]] to <4 x i32>
 // CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
-v128_t test_i64x2_shr(v128_t a, int32_t b) {
+v128_t test_i64x2_shr(v128_t a, uint32_t b) {
   return wasm_i64x2_shr(a, b);
 }
 
 // CHECK-LABEL: @test_u64x2_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:

[PATCH] D108412: [WebAssembly] Add SIMD intrinsics using unsigned integers

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

For each SIMD intrinsic function that takes or returns a scalar signed integer
value, ensure there is a corresponding intrinsic that returns or an
unsigned value. This is a convenience for users who use -Wsign-conversion so
they don't have to insert explicit casts, especially when the intrinsic
arguments are integer literals that fit into the unsigned integer type but not
the signed type.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108412

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -3,7 +3,7 @@
 
 // FIXME: This should not be using -O2 and implicitly testing the entire IR opt pipeline.
 
-// RUN: %clang %s -O2 -emit-llvm -S -o - -target wasm32-unknown-unknown -msimd128 -Wcast-qual -fno-lax-vector-conversions -Werror | FileCheck %s
+// RUN: %clang %s -O2 -emit-llvm -S -o - -target wasm32-unknown-unknown -msimd128 -Wall -Weverything -Wno-missing-prototypes -fno-lax-vector-conversions -Werror | FileCheck %s
 
 #include 
 
@@ -213,7 +213,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store(void *mem, v128_t a) {
-  return wasm_v128_store(mem, a);
+  wasm_v128_store(mem, a);
 }
 
 // CHECK-LABEL: @test_v128_store8_lane(
@@ -224,7 +224,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store8_lane(uint8_t *ptr, v128_t vec) {
-  return wasm_v128_store8_lane(ptr, vec, 15);
+  wasm_v128_store8_lane(ptr, vec, 15);
 }
 
 // CHECK-LABEL: @test_v128_store16_lane(
@@ -235,7 +235,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store16_lane(uint16_t *ptr, v128_t vec) {
-  return wasm_v128_store16_lane(ptr, vec, 7);
+  wasm_v128_store16_lane(ptr, vec, 7);
 }
 
 // CHECK-LABEL: @test_v128_store32_lane(
@@ -245,7 +245,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store32_lane(uint32_t *ptr, v128_t vec) {
-  return wasm_v128_store32_lane(ptr, vec, 3);
+  wasm_v128_store32_lane(ptr, vec, 3);
 }
 
 // CHECK-LABEL: @test_v128_store64_lane(
@@ -256,7 +256,7 @@
 // CHECK-NEXT:ret void
 //
 void test_v128_store64_lane(uint64_t *ptr, v128_t vec) {
-  return wasm_v128_store64_lane(ptr, vec, 1);
+  wasm_v128_store64_lane(ptr, vec, 1);
 }
 
 // CHECK-LABEL: @test_i8x16_make(
@@ -284,6 +284,31 @@
   return wasm_i8x16_make(c0, c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15);
 }
 
+// CHECK-LABEL: @test_u8x16_make(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <16 x i8> undef, i8 [[C0:%.*]], i32 0
+// CHECK-NEXT:[[VECINIT1_I:%.*]] = insertelement <16 x i8> [[VECINIT_I]], i8 [[C1:%.*]], i32 1
+// CHECK-NEXT:[[VECINIT2_I:%.*]] = insertelement <16 x i8> [[VECINIT1_I]], i8 [[C2:%.*]], i32 2
+// CHECK-NEXT:[[VECINIT3_I:%.*]] = insertelement <16 x i8> [[VECINIT2_I]], i8 [[C3:%.*]], i32 3
+// CHECK-NEXT:[[VECINIT4_I:%.*]] = insertelement <16 x i8> [[VECINIT3_I]], i8 [[C4:%.*]], i32 4
+// CHECK-NEXT:[[VECINIT5_I:%.*]] = insertelement <16 x i8> [[VECINIT4_I]], i8 [[C5:%.*]], i32 5
+// CHECK-NEXT:[[VECINIT6_I:%.*]] = insertelement <16 x i8> [[VECINIT5_I]], i8 [[C6:%.*]], i32 6
+// CHECK-NEXT:[[VECINIT7_I:%.*]] = insertelement <16 x i8> [[VECINIT6_I]], i8 [[C7:%.*]], i32 7
+// CHECK-NEXT:[[VECINIT8_I:%.*]] = insertelement <16 x i8> [[VECINIT7_I]], i8 [[C8:%.*]], i32 8
+// CHECK-NEXT:[[VECINIT9_I:%.*]] = insertelement <16 x i8> [[VECINIT8_I]], i8 [[C9:%.*]], i32 9
+// CHECK-NEXT:[[VECINIT10_I:%.*]] = insertelement <16 x i8> [[VECINIT9_I]], i8 [[C10:%.*]], i32 10
+// CHECK-NEXT:[[VECINIT11_I:%.*]] = insertelement <16 x i8> [[VECINIT10_I]], i8 [[C11:%.*]], i32 11
+// CHECK-NEXT:[[VECINIT12_I:%.*]] = insertelement <16 x i8> [[VECINIT11_I]], i8 [[C12:%.*]], i32 12
+// CHECK-NEXT:[[VECINIT13_I:%.*]] = insertelement <16 x i8> [[VECINIT12_I]], i8 [[C13:%.*]], i32 13
+// CHECK-NEXT:[[VECINIT14_I:%.*]] = insertelement <16 x i8> [[VECINIT13_I]], i8 [[C14:%.*]], i32 14
+// CHECK-NEXT:[[VECINIT15_I:%.*]] = insertelement <16 x i8> [[VECINIT14_I]], i8 [[C15:%.*]], i32 15
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <16 x i8> [[VECINIT15_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP0]]
+//
+v128_t test_u8x16_make(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3, uint8_t c4, uint8_t c5, uint8_t c6, uint8_t c7, uint8_t c8, uint8_t c9, uint8_t c10, uint8_t c11, uint8_t c12, uint8_t c13, uint8_t c14, uint8_t c15) {
+  return wasm_u8x16_make(c0, c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15);
+}
+
 // CHECK-LABEL: @test_i16x8_make(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <8 x i16> undef, i16 [[C0:%.*]],

[PATCH] D108401: [WebAssembly] Make bitmask instructions return unsigned ints

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Since they are bitmasks, it will be more common for them to be used and
potentially extended to 64-bit integers as unsigned values rather than signed
values.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108401

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/Headers/wasm_simd128.h


Index: clang/lib/Headers/wasm_simd128.h
===
--- clang/lib/Headers/wasm_simd128.h
+++ clang/lib/Headers/wasm_simd128.h
@@ -804,7 +804,7 @@
   return __builtin_wasm_all_true_i8x16((__i8x16)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i8x16((__i8x16)__a);
 }
 
@@ -894,7 +894,7 @@
   return __builtin_wasm_all_true_i16x8((__i16x8)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i16x8((__i16x8)__a);
 }
 
@@ -985,7 +985,7 @@
   return __builtin_wasm_all_true_i32x4((__i32x4)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i32x4((__i32x4)__a);
 }
 
@@ -1056,7 +1056,7 @@
   return __builtin_wasm_all_true_i64x2((__i64x2)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i64x2((__i64x2)__a);
 }
 
Index: clang/include/clang/Basic/BuiltinsWebAssembly.def
===
--- clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -119,10 +119,10 @@
 TARGET_BUILTIN(__builtin_wasm_all_true_i32x4, "iV4i", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_all_true_i64x2, "iV2LLi", "nc", "simd128")
 
-TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "iV16Sc", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "iV8s", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "iV4i", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "iV2LLi", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "UiV16Sc", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "UiV8s", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "UiV4i", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "UiV2LLi", "nc", "simd128")
 
 TARGET_BUILTIN(__builtin_wasm_abs_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_abs_f64x2, "V2dV2d", "nc", "simd128")


Index: clang/lib/Headers/wasm_simd128.h
===
--- clang/lib/Headers/wasm_simd128.h
+++ clang/lib/Headers/wasm_simd128.h
@@ -804,7 +804,7 @@
   return __builtin_wasm_all_true_i8x16((__i8x16)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i8x16_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i8x16((__i8x16)__a);
 }
 
@@ -894,7 +894,7 @@
   return __builtin_wasm_all_true_i16x8((__i16x8)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i16x8_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i16x8((__i16x8)__a);
 }
 
@@ -985,7 +985,7 @@
   return __builtin_wasm_all_true_i32x4((__i32x4)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i32x4_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i32x4((__i32x4)__a);
 }
 
@@ -1056,7 +1056,7 @@
   return __builtin_wasm_all_true_i64x2((__i64x2)__a);
 }
 
-static __inline__ int32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
+static __inline__ uint32_t __DEFAULT_FN_ATTRS wasm_i64x2_bitmask(v128_t __a) {
   return __builtin_wasm_bitmask_i64x2((__i64x2)__a);
 }
 
Index: clang/include/clang/Basic/BuiltinsWebAssembly.def
===
--- clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -119,10 +119,10 @@
 TARGET_BUILTIN(__builtin_wasm_all_true_i32x4, "iV4i", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_all_true_i64x2, "iV2LLi", "nc", "simd128")
 
-TARGET_BUILTIN(__builtin_wasm_bitmask_i8x16, "iV16Sc", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8,

[PATCH] D108387: [WebAssembly] Restore builtins and intrinsics for pmin/pmax

2021-08-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Partially reverts 85157c007903 
, which 
had removed these builtins and intrinsics
in favor of normal codegen patterns. It turns out that it is possible for the
patterns to be split over multiple basic blocks, however, which means that DAG
ISel is not able to select them to the pmin/pmax instructions. To make sure the
SIMD intrinsics generate the correct instructions in these cases, reintroduce
the clang builtins and corresponding LLVM intrinsics, but also keep the normal
pattern matching as well.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D108387

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -540,6 +540,26 @@
   ret <4 x float> %a
 }
 
+; CHECK-LABEL: pmin_v4f32:
+; CHECK-NEXT: .functype pmin_v4f32 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.pmin.v4f32(<4 x float>, <4 x float>)
+define <4 x float> @pmin_v4f32(<4 x float> %a, <4 x float> %b) {
+  %v = call <4 x float> @llvm.wasm.pmin.v4f32(<4 x float> %a, <4 x float> %b)
+  ret <4 x float> %v
+}
+
+; CHECK-LABEL: pmax_v4f32:
+; CHECK-NEXT: .functype pmax_v4f32 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f32x4.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <4 x float> @llvm.wasm.pmax.v4f32(<4 x float>, <4 x float>)
+define <4 x float> @pmax_v4f32(<4 x float> %a, <4 x float> %b) {
+  %v = call <4 x float> @llvm.wasm.pmax.v4f32(<4 x float> %a, <4 x float> %b)
+  ret <4 x float> %v
+}
+
 ; CHECK-LABEL: ceil_v4f32:
 ; CHECK-NEXT: .functype ceil_v4f32 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f32x4.ceil $push[[R:[0-9]+]]=, $0{{$}}
@@ -595,6 +615,26 @@
   ret <2 x double> %a
 }
 
+; CHECK-LABEL: pmin_v2f64:
+; CHECK-NEXT: .functype pmin_v2f64 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.pmin.v2f64(<2 x double>, <2 x double>)
+define <2 x double> @pmin_v2f64(<2 x double> %a, <2 x double> %b) {
+  %v = call <2 x double> @llvm.wasm.pmin.v2f64(<2 x double> %a, <2 x double> %b)
+  ret <2 x double> %v
+}
+
+; CHECK-LABEL: pmax_v2f64:
+; CHECK-NEXT: .functype pmax_v2f64 (v128, v128) -> (v128){{$}}
+; CHECK-NEXT: f64x2.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; CHECK-NEXT: return $pop[[R]]{{$}}
+declare <2 x double> @llvm.wasm.pmax.v2f64(<2 x double>, <2 x double>)
+define <2 x double> @pmax_v2f64(<2 x double> %a, <2 x double> %b) {
+  %v = call <2 x double> @llvm.wasm.pmax.v2f64(<2 x double> %a, <2 x double> %b)
+  ret <2 x double> %v
+}
+
 ; CHECK-LABEL: ceil_v2f64:
 ; CHECK-NEXT: .functype ceil_v2f64 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f64x2.ceil $push[[R:[0-9]+]]=, $0{{$}}
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1165,6 +1165,16 @@
   (pmax $lhs, $rhs)>;
 }
 
+// And match the pmin/pmax LLVM intrinsics as well
+def : Pat<(v4f32 (int_wasm_pmin (v4f32 V128:$lhs), (v4f32 V128:$rhs))),
+  (PMIN_F32x4 V128:$lhs, V128:$rhs)>;
+def : Pat<(v4f32 (int_wasm_pmax (v4f32 V128:$lhs), (v4f32 V128:$rhs))),
+  (PMAX_F32x4 V128:$lhs, V128:$rhs)>;
+def : Pat<(v2f64 (int_wasm_pmin (v2f64 V128:$lhs), (v2f64 V128:$rhs))),
+  (PMIN_F64x2 V128:$lhs, V128:$rhs)>;
+def : Pat<(v2f64 (int_wasm_pmax (v2f64 V128:$lhs), (v2f64 V128:$rhs))),
+  (PMAX_F64x2 V128:$lhs, V128:$rhs)>;
+
 //===--===//
 // Conversions
 //===--===//
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -164,6 +164,15 @@
 [llvm_v8i16_ty, llvm_v8i16_ty],
 [IntrNoMem, IntrSpeculatable]>;
 
+def int_wasm_pmin :

[PATCH] D106724: [WebAssembly] Codegen for extmul SIMD instructions

2021-07-27 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG33786576fd3a: [WebAssembly] Codegen for extmul SIMD 
instructions (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106724/new/

https://reviews.llvm.org/D106724

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-arith.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -248,54 +248,6 @@
   ret <8 x i16> %a
 }
 
-; CHECK-LABEL: extmul_low_s_v8i16:
-; CHECK-NEXT: .functype extmul_low_s_v8i16 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.extmul_low_i8x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.extmul.low.signed.v8i16(<16 x i8>, <16 x i8>)
-define <8 x i16> @extmul_low_s_v8i16(<16 x i8> %x, <16 x i8> %y) {
-  %a = call <8 x i16> @llvm.wasm.extmul.low.signed.v8i16(
-<16 x i8> %x, <16 x i8> %y
-  )
-  ret <8 x i16> %a
-}
-
-; CHECK-LABEL: extmul_high_s_v8i16:
-; CHECK-NEXT: .functype extmul_high_s_v8i16 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.extmul_high_i8x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.extmul.high.signed.v8i16(<16 x i8>, <16 x i8>)
-define <8 x i16> @extmul_high_s_v8i16(<16 x i8> %x, <16 x i8> %y) {
-  %a = call <8 x i16> @llvm.wasm.extmul.high.signed.v8i16(
-<16 x i8> %x, <16 x i8> %y
-  )
-  ret <8 x i16> %a
-}
-
-; CHECK-LABEL: extmul_low_u_v8i16:
-; CHECK-NEXT: .functype extmul_low_u_v8i16 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.extmul_low_i8x16_u $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.extmul.low.unsigned.v8i16(<16 x i8>, <16 x i8>)
-define <8 x i16> @extmul_low_u_v8i16(<16 x i8> %x, <16 x i8> %y) {
-  %a = call <8 x i16> @llvm.wasm.extmul.low.unsigned.v8i16(
-<16 x i8> %x, <16 x i8> %y
-  )
-  ret <8 x i16> %a
-}
-
-; CHECK-LABEL: extmul_high_u_v8i16:
-; CHECK-NEXT: .functype extmul_high_u_v8i16 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.extmul_high_i8x16_u $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.extmul.high.unsigned.v8i16(<16 x i8>, <16 x i8>)
-define <8 x i16> @extmul_high_u_v8i16(<16 x i8> %x, <16 x i8> %y) {
-  %a = call <8 x i16> @llvm.wasm.extmul.high.unsigned.v8i16(
-<16 x i8> %x, <16 x i8> %y
-  )
-  ret <8 x i16> %a
-}
-
 ; CHECK-LABEL: extadd_pairwise_s_v8i16:
 ; CHECK-NEXT: .functype extadd_pairwise_s_v8i16 (v128) -> (v128){{$}}
 ; CHECK-NEXT: i16x8.extadd_pairwise_i8x16_s $push[[R:[0-9]+]]=, $0{{$}}
@@ -395,55 +347,6 @@
   ret <4 x i32> %a
 }
 
-
-; CHECK-LABEL: extmul_low_s_v4i32:
-; CHECK-NEXT: .functype extmul_low_s_v4i32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.extmul_low_i16x8_s $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.extmul.low.signed.v4i32(<8 x i16>, <8 x i16>)
-define <4 x i32> @extmul_low_s_v4i32(<8 x i16> %x, <8 x i16> %y) {
-  %a = call <4 x i32> @llvm.wasm.extmul.low.signed.v4i32(
-<8 x i16> %x, <8 x i16> %y
-  )
-  ret <4 x i32> %a
-}
-
-; CHECK-LABEL: extmul_high_s_v4i32:
-; CHECK-NEXT: .functype extmul_high_s_v4i32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.extmul_high_i16x8_s $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.extmul.high.signed.v4i32(<8 x i16>, <8 x i16>)
-define <4 x i32> @extmul_high_s_v4i32(<8 x i16> %x, <8 x i16> %y) {
-  %a = call <4 x i32> @llvm.wasm.extmul.high.signed.v4i32(
-<8 x i16> %x, <8 x i16> %y
-  )
-  ret <4 x i32> %a
-}
-
-; CHECK-LABEL: extmul_low_u_v4i32:
-; CHECK-NEXT: .functype extmul_low_u_v4i32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.extmul_low_i16x8_u $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.extmul.low.unsigned.v4i32(<8 x i16>, <8 x i16>)
-define <4 x i32> @extmul_low_u_v4i32(<8 x i16> %x, <8 x i16> %y) {
-  %a = call <4 x i32> @llvm.wasm.extmul.low.unsigned.v4i32(
-<8 x i16> %x, <8 x i16> %y
-  )
-  ret <4 x i32> %a
-}
-
-; CHECK-LABEL: extmul_high_u_v4i32:
-; CHECK-NEXT: .functype extmul_high_u_v4i32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.extmul_high_i16x8_u $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.extmul.high.unsigned.v4i32(<8 x i16>, <8 x i16>)
-define <4 x

[PATCH] D106724: [WebAssembly] Codegen for extmul SIMD instructions

2021-07-23 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions
with normal codegen patterns.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106724

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-arith.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -248,54 +248,6 @@
   ret <8 x i16> %a
 }
 
-; CHECK-LABEL: extmul_low_s_v8i16:
-; CHECK-NEXT: .functype extmul_low_s_v8i16 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.extmul_low_i8x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.extmul.low.signed.v8i16(<16 x i8>, <16 x i8>)
-define <8 x i16> @extmul_low_s_v8i16(<16 x i8> %x, <16 x i8> %y) {
-  %a = call <8 x i16> @llvm.wasm.extmul.low.signed.v8i16(
-<16 x i8> %x, <16 x i8> %y
-  )
-  ret <8 x i16> %a
-}
-
-; CHECK-LABEL: extmul_high_s_v8i16:
-; CHECK-NEXT: .functype extmul_high_s_v8i16 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.extmul_high_i8x16_s $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.extmul.high.signed.v8i16(<16 x i8>, <16 x i8>)
-define <8 x i16> @extmul_high_s_v8i16(<16 x i8> %x, <16 x i8> %y) {
-  %a = call <8 x i16> @llvm.wasm.extmul.high.signed.v8i16(
-<16 x i8> %x, <16 x i8> %y
-  )
-  ret <8 x i16> %a
-}
-
-; CHECK-LABEL: extmul_low_u_v8i16:
-; CHECK-NEXT: .functype extmul_low_u_v8i16 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.extmul_low_i8x16_u $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.extmul.low.unsigned.v8i16(<16 x i8>, <16 x i8>)
-define <8 x i16> @extmul_low_u_v8i16(<16 x i8> %x, <16 x i8> %y) {
-  %a = call <8 x i16> @llvm.wasm.extmul.low.unsigned.v8i16(
-<16 x i8> %x, <16 x i8> %y
-  )
-  ret <8 x i16> %a
-}
-
-; CHECK-LABEL: extmul_high_u_v8i16:
-; CHECK-NEXT: .functype extmul_high_u_v8i16 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i16x8.extmul_high_i8x16_u $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <8 x i16> @llvm.wasm.extmul.high.unsigned.v8i16(<16 x i8>, <16 x i8>)
-define <8 x i16> @extmul_high_u_v8i16(<16 x i8> %x, <16 x i8> %y) {
-  %a = call <8 x i16> @llvm.wasm.extmul.high.unsigned.v8i16(
-<16 x i8> %x, <16 x i8> %y
-  )
-  ret <8 x i16> %a
-}
-
 ; CHECK-LABEL: extadd_pairwise_s_v8i16:
 ; CHECK-NEXT: .functype extadd_pairwise_s_v8i16 (v128) -> (v128){{$}}
 ; CHECK-NEXT: i16x8.extadd_pairwise_i8x16_s $push[[R:[0-9]+]]=, $0{{$}}
@@ -395,55 +347,6 @@
   ret <4 x i32> %a
 }
 
-
-; CHECK-LABEL: extmul_low_s_v4i32:
-; CHECK-NEXT: .functype extmul_low_s_v4i32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.extmul_low_i16x8_s $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.extmul.low.signed.v4i32(<8 x i16>, <8 x i16>)
-define <4 x i32> @extmul_low_s_v4i32(<8 x i16> %x, <8 x i16> %y) {
-  %a = call <4 x i32> @llvm.wasm.extmul.low.signed.v4i32(
-<8 x i16> %x, <8 x i16> %y
-  )
-  ret <4 x i32> %a
-}
-
-; CHECK-LABEL: extmul_high_s_v4i32:
-; CHECK-NEXT: .functype extmul_high_s_v4i32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.extmul_high_i16x8_s $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.extmul.high.signed.v4i32(<8 x i16>, <8 x i16>)
-define <4 x i32> @extmul_high_s_v4i32(<8 x i16> %x, <8 x i16> %y) {
-  %a = call <4 x i32> @llvm.wasm.extmul.high.signed.v4i32(
-<8 x i16> %x, <8 x i16> %y
-  )
-  ret <4 x i32> %a
-}
-
-; CHECK-LABEL: extmul_low_u_v4i32:
-; CHECK-NEXT: .functype extmul_low_u_v4i32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.extmul_low_i16x8_u $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x i32> @llvm.wasm.extmul.low.unsigned.v4i32(<8 x i16>, <8 x i16>)
-define <4 x i32> @extmul_low_u_v4i32(<8 x i16> %x, <8 x i16> %y) {
-  %a = call <4 x i32> @llvm.wasm.extmul.low.unsigned.v4i32(
-<8 x i16> %x, <8 x i16> %y
-  )
-  ret <4 x i32> %a
-}
-
-; CHECK-LABEL: extmul_high_u_v4i32:
-; CHECK-NEXT: .functype extmul_high_u_v4i32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: i32x4.extmul_high_i16x8_u $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return

[PATCH] D106612: [WebAssembly] Codegen for pmin and pmax

2021-07-23 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG85157c007903: [WebAssembly] Codegen for pmin and pmax 
(authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106612/new/

https://reviews.llvm.org/D106612

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-arith.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -685,26 +685,6 @@
   ret <4 x float> %a
 }
 
-; CHECK-LABEL: pmin_v4f32:
-; CHECK-NEXT: .functype pmin_v4f32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x float> @llvm.wasm.pmin.v4f32(<4 x float>, <4 x float>)
-define <4 x float> @pmin_v4f32(<4 x float> %a, <4 x float> %b) {
-  %v = call <4 x float> @llvm.wasm.pmin.v4f32(<4 x float> %a, <4 x float> %b)
-  ret <4 x float> %v
-}
-
-; CHECK-LABEL: pmax_v4f32:
-; CHECK-NEXT: .functype pmax_v4f32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: f32x4.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x float> @llvm.wasm.pmax.v4f32(<4 x float>, <4 x float>)
-define <4 x float> @pmax_v4f32(<4 x float> %a, <4 x float> %b) {
-  %v = call <4 x float> @llvm.wasm.pmax.v4f32(<4 x float> %a, <4 x float> %b)
-  ret <4 x float> %v
-}
-
 ; CHECK-LABEL: ceil_v4f32:
 ; CHECK-NEXT: .functype ceil_v4f32 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f32x4.ceil $push[[R:[0-9]+]]=, $0{{$}}
@@ -760,26 +740,6 @@
   ret <2 x double> %a
 }
 
-; CHECK-LABEL: pmin_v2f64:
-; CHECK-NEXT: .functype pmin_v2f64 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: f64x2.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <2 x double> @llvm.wasm.pmin.v2f64(<2 x double>, <2 x double>)
-define <2 x double> @pmin_v2f64(<2 x double> %a, <2 x double> %b) {
-  %v = call <2 x double> @llvm.wasm.pmin.v2f64(<2 x double> %a, <2 x double> %b)
-  ret <2 x double> %v
-}
-
-; CHECK-LABEL: pmax_v2f64:
-; CHECK-NEXT: .functype pmax_v2f64 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: f64x2.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <2 x double> @llvm.wasm.pmax.v2f64(<2 x double>, <2 x double>)
-define <2 x double> @pmax_v2f64(<2 x double> %a, <2 x double> %b) {
-  %v = call <2 x double> @llvm.wasm.pmax.v2f64(<2 x double> %a, <2 x double> %b)
-  ret <2 x double> %v
-}
-
 ; CHECK-LABEL: ceil_v2f64:
 ; CHECK-NEXT: .functype ceil_v2f64 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f64x2.ceil $push[[R:[0-9]+]]=, $0{{$}}
Index: llvm/test/CodeGen/WebAssembly/simd-arith.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-arith.ll
+++ llvm/test/CodeGen/WebAssembly/simd-arith.ll
@@ -1409,6 +1409,54 @@
   ret <4 x float> %a
 }
 
+; CHECK-LABEL: pmin_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype pmin_v4f32 (v128, v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; SIMD128-NEXT: return $pop[[R]]{{$}}
+define <4 x float> @pmin_v4f32(<4 x float> %x, <4 x float> %y) {
+  %c = fcmp olt <4 x float> %y, %x
+  %a = select <4 x i1> %c, <4 x float> %y, <4 x float> %x
+  ret <4 x float> %a
+}
+
+; CHECK-LABEL: pmin_int_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype pmin_int_v4f32 (v128, v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; SIMD128-NEXT: return $pop[[R]]{{$}}
+define <4 x i32> @pmin_int_v4f32(<4 x i32> %x, <4 x i32> %y) {
+  %fx = bitcast <4 x i32> %x to <4 x float>
+  %fy = bitcast <4 x i32> %y to <4 x float>
+  %c = fcmp olt <4 x float> %fy, %fx
+  %a = select <4 x i1> %c, <4 x i32> %y, <4 x i32> %x
+  ret <4 x i32> %a
+}
+
+; CHECK-LABEL: pmax_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype pmax_v4f32 (v128, v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; SIMD128-NEXT: return $pop[[R]]{{$}}
+define <4 x float> @pmax_v4f32(<4 x float> %x, <4 x float> %y) {
+  %c = fcmp olt <4 x float> %x, %y
+  %a = select <4 x i1> %c, <4 x float> %y, <4 x float> %x
+  ret <4 x float> %a
+}
+
+; CHECK-LABEL: pmax_int_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype pmax_int_v4f32 (v128, v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; SIMD128-NEXT: return $pop[[R]]{{$}}
+define <4 x i32> @pmax_int_v4f32(<4 x i32> %x, <4 x i32> %y) {
+  %fx = bitcast <4 x i32> %x to

[PATCH] D106612: [WebAssembly] Codegen for pmin and pmax

2021-07-23 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td:1141
+def : Pat<(vec.int_vt (vselect
+(setolt (vec.vt (bitconvert V128:$rhs)),
+(vec.vt (bitconvert V128:$lhs))),

aheejin wrote:
> Sorry I asked this in person yesterday, but I don't think I quiet got it; why 
> is this only for the ordered comparison? And does pseudo-min/max mean 
> non-NaN-propagating?
pseudo-min is defined as `b < a ? b : a` and pseudo-max is defined as `a < b ? 
b : a`. Both of these definitions use `<` as the float comparison operator, 
which for WebAssembly means OLT.

For both pseudo-min and pseudo-max, if `a` or `b` is NaN, the comparison will 
be `false` and the result will be `a`. So when `a` is NaN but `b` is not, these 
instructions are NaN-propagating. But when `b` is NaN and `a` is not, these 
instructions are not NaN-propagating. In contrast, the normal min and max 
operations propagate NaNs in either operand position.



Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td:1145-1149
+def : Pat<(vec.int_vt (vselect
+(setolt (vec.vt (bitconvert V128:$lhs)),
+(vec.vt (bitconvert V128:$rhs))),
+V128:$rhs, V128:$lhs)),
+  (pmax $lhs, $rhs)>;

aheejin wrote:
> Is it OK to change the return type? The source pattern's return type is an 
> int vector but the resulting instruction's type is a float vector.
Yes, this is the final step of DAG selection and we are lowering the SDNodes to 
TargetSDNodes here, so typing no longer applies.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106612/new/

https://reviews.llvm.org/D106612

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D106611: [WebAssembly][NFC] Update test expectations labels after db7efcab7dd9

2021-07-22 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG481084f669e1: [WebAssembly][NFC] Update test expectations 
labels after db7efcab7dd9 (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106611/new/

https://reviews.llvm.org/D106611

Files:
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -458,8 +458,8 @@
 // CHECK-LABEL: @test_i8x16_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <16 x i8> [[TMP0]], i32 15
-// CHECK-NEXT:ret i8 [[TMP1]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <16 x i8> [[TMP0]], i32 15
+// CHECK-NEXT:ret i8 [[VECEXT_I]]
 //
 int8_t test_i8x16_extract_lane(v128_t a) {
   return wasm_i8x16_extract_lane(a, 15);
@@ -468,8 +468,8 @@
 // CHECK-LABEL: @test_u8x16_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <16 x i8> [[TMP0]], i32 15
-// CHECK-NEXT:ret i8 [[TMP1]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <16 x i8> [[TMP0]], i32 15
+// CHECK-NEXT:ret i8 [[VECEXT_I]]
 //
 uint8_t test_u8x16_extract_lane(v128_t a) {
   return wasm_u8x16_extract_lane(a, 15);
@@ -478,9 +478,9 @@
 // CHECK-LABEL: @test_i8x16_replace_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[TMP1:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[B:%.*]], i32 15
-// CHECK-NEXT:[[TMP2:%.*]] = bitcast <16 x i8> [[TMP1]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP2]]
+// CHECK-NEXT:[[VECINS_I:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[B:%.*]], i32 15
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <16 x i8> [[VECINS_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
 v128_t test_i8x16_replace_lane(v128_t a, int8_t b) {
   return wasm_i8x16_replace_lane(a, 15, b);
@@ -500,8 +500,8 @@
 // CHECK-LABEL: @test_i16x8_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <8 x i16>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <8 x i16> [[TMP0]], i32 7
-// CHECK-NEXT:ret i16 [[TMP1]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <8 x i16> [[TMP0]], i32 7
+// CHECK-NEXT:ret i16 [[VECEXT_I]]
 //
 int16_t test_i16x8_extract_lane(v128_t a) {
   return wasm_i16x8_extract_lane(a, 7);
@@ -510,8 +510,8 @@
 // CHECK-LABEL: @test_u16x8_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <8 x i16>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <8 x i16> [[TMP0]], i32 7
-// CHECK-NEXT:ret i16 [[TMP1]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <8 x i16> [[TMP0]], i32 7
+// CHECK-NEXT:ret i16 [[VECEXT_I]]
 //
 uint16_t test_u16x8_extract_lane(v128_t a) {
   return wasm_u16x8_extract_lane(a, 7);
@@ -520,9 +520,9 @@
 // CHECK-LABEL: @test_i16x8_replace_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <8 x i16>
-// CHECK-NEXT:[[TMP1:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[B:%.*]], i32 7
-// CHECK-NEXT:[[TMP2:%.*]] = bitcast <8 x i16> [[TMP1]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP2]]
+// CHECK-NEXT:[[VECINS_I:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[B:%.*]], i32 7
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <8 x i16> [[VECINS_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
 v128_t test_i16x8_replace_lane(v128_t a, int16_t b) {
   return wasm_i16x8_replace_lane(a, 7, b);
@@ -540,8 +540,8 @@
 
 // CHECK-LABEL: @test_i32x4_extract_lane(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = extractelement <4 x i32> [[A:%.*]], i32 3
-// CHECK-NEXT:ret i32 [[TMP0]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <4 x i32> [[A:%.*]], i32 3
+// CHECK-NEXT:ret i32 [[VECEXT_I]]
 //
 int32_t test_i32x4_extract_lane(v128_t a) {
   return wasm_i32x4_extract_lane(a, 3);
@@ -549,8 +549,8 @@
 
 // CHECK-LABEL: @test_i32x4_replace_lane(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x i32> [[A:%.*]], i32 [[B:%.*]], i32 3
-// CHECK-NEXT:ret <4 x i32> [[TMP0]]
+// CHECK-NEXT:[[VECINS_I:%.*]] = insertelement <4 x i32> [[A:%.*]], i32 [[B:%.*]], i32 3
+// CHECK-NEXT:ret <4 x i32> [[VECINS_I]]
 //
 v128_t test_i32x4_replace_lane(v128_t a, int32_t b) {
   return wasm_i32x4_replace_lane(a, 3, b);
@@ -570,8 +570,8 @@
 // CHECK-LABEL: @test_i64x2_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <2 x i64>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <2 x i64> [[TMP0]],

[PATCH] D106612: [WebAssembly] Codegen for pmin and pmax

2021-07-22 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Replace the clang builtins and LLVM intrinsics for {f32x4,f64x2}.{pmin,pmax}
with standard codegen patterns. Since wasm_simd128.h uses an integer vector as
the standard single vector type, the IR for the pmin and pmax intrinsic
functions contains bitcasts that would not be there otherwise. Add extra codegen
patterns that can still select the pmin and pmax instructions in the presence of
these bitcasts.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106612

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-arith.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -685,26 +685,6 @@
   ret <4 x float> %a
 }
 
-; CHECK-LABEL: pmin_v4f32:
-; CHECK-NEXT: .functype pmin_v4f32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x float> @llvm.wasm.pmin.v4f32(<4 x float>, <4 x float>)
-define <4 x float> @pmin_v4f32(<4 x float> %a, <4 x float> %b) {
-  %v = call <4 x float> @llvm.wasm.pmin.v4f32(<4 x float> %a, <4 x float> %b)
-  ret <4 x float> %v
-}
-
-; CHECK-LABEL: pmax_v4f32:
-; CHECK-NEXT: .functype pmax_v4f32 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: f32x4.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x float> @llvm.wasm.pmax.v4f32(<4 x float>, <4 x float>)
-define <4 x float> @pmax_v4f32(<4 x float> %a, <4 x float> %b) {
-  %v = call <4 x float> @llvm.wasm.pmax.v4f32(<4 x float> %a, <4 x float> %b)
-  ret <4 x float> %v
-}
-
 ; CHECK-LABEL: ceil_v4f32:
 ; CHECK-NEXT: .functype ceil_v4f32 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f32x4.ceil $push[[R:[0-9]+]]=, $0{{$}}
@@ -760,26 +740,6 @@
   ret <2 x double> %a
 }
 
-; CHECK-LABEL: pmin_v2f64:
-; CHECK-NEXT: .functype pmin_v2f64 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: f64x2.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <2 x double> @llvm.wasm.pmin.v2f64(<2 x double>, <2 x double>)
-define <2 x double> @pmin_v2f64(<2 x double> %a, <2 x double> %b) {
-  %v = call <2 x double> @llvm.wasm.pmin.v2f64(<2 x double> %a, <2 x double> %b)
-  ret <2 x double> %v
-}
-
-; CHECK-LABEL: pmax_v2f64:
-; CHECK-NEXT: .functype pmax_v2f64 (v128, v128) -> (v128){{$}}
-; CHECK-NEXT: f64x2.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <2 x double> @llvm.wasm.pmax.v2f64(<2 x double>, <2 x double>)
-define <2 x double> @pmax_v2f64(<2 x double> %a, <2 x double> %b) {
-  %v = call <2 x double> @llvm.wasm.pmax.v2f64(<2 x double> %a, <2 x double> %b)
-  ret <2 x double> %v
-}
-
 ; CHECK-LABEL: ceil_v2f64:
 ; CHECK-NEXT: .functype ceil_v2f64 (v128) -> (v128){{$}}
 ; CHECK-NEXT: f64x2.ceil $push[[R:[0-9]+]]=, $0{{$}}
Index: llvm/test/CodeGen/WebAssembly/simd-arith.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-arith.ll
+++ llvm/test/CodeGen/WebAssembly/simd-arith.ll
@@ -1409,6 +1409,54 @@
   ret <4 x float> %a
 }
 
+; CHECK-LABEL: pmin_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype pmin_v4f32 (v128, v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; SIMD128-NEXT: return $pop[[R]]{{$}}
+define <4 x float> @pmin_v4f32(<4 x float> %x, <4 x float> %y) {
+  %c = fcmp olt <4 x float> %y, %x
+  %a = select <4 x i1> %c, <4 x float> %y, <4 x float> %x
+  ret <4 x float> %a
+}
+
+; CHECK-LABEL: pmin_int_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype pmin_int_v4f32 (v128, v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.pmin $push[[R:[0-9]+]]=, $0, $1{{$}}
+; SIMD128-NEXT: return $pop[[R]]{{$}}
+define <4 x i32> @pmin_int_v4f32(<4 x i32> %x, <4 x i32> %y) {
+  %fx = bitcast <4 x i32> %x to <4 x float>
+  %fy = bitcast <4 x i32> %y to <4 x float>
+  %c = fcmp olt <4 x float> %fy, %fx
+  %a = select <4 x i1> %c, <4 x i32> %y, <4 x i32> %x
+  ret <4 x i32> %a
+}
+
+; CHECK-LABEL: pmax_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype pmax_v4f32 (v128, v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.pmax $push[[R:[0-9]+]]=, $0, $1{{$}}
+; SIMD128-NEXT: return $pop[[R]]{{$}}
+define <4 x float> @pmax_v4f32(<4 x float> %x, <4 x float> %y) {
+  %c = fcmp olt <4 x float> %x,

[PATCH] D106611: [WebAssembly][NFC] Update test expectations labels after db7efcab7dd9

2021-07-22 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added a reviewer: aheejin.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, sbc100, 
dschuff.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Commit db7efcab7dd9 
 changed 
the implementations of the wasm_*_extract_lane and
wasm_*_replace_lane intrinsics from using builtin functions to using the
standard vector extensions. This did not change the resulting IR, but it changes
how update_cc_test_checks.py labels values in the IR. This commit simply updates
those labels.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106611

Files:
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -458,8 +458,8 @@
 // CHECK-LABEL: @test_i8x16_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <16 x i8> [[TMP0]], i32 15
-// CHECK-NEXT:ret i8 [[TMP1]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <16 x i8> [[TMP0]], i32 15
+// CHECK-NEXT:ret i8 [[VECEXT_I]]
 //
 int8_t test_i8x16_extract_lane(v128_t a) {
   return wasm_i8x16_extract_lane(a, 15);
@@ -468,8 +468,8 @@
 // CHECK-LABEL: @test_u8x16_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <16 x i8> [[TMP0]], i32 15
-// CHECK-NEXT:ret i8 [[TMP1]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <16 x i8> [[TMP0]], i32 15
+// CHECK-NEXT:ret i8 [[VECEXT_I]]
 //
 uint8_t test_u8x16_extract_lane(v128_t a) {
   return wasm_u8x16_extract_lane(a, 15);
@@ -478,9 +478,9 @@
 // CHECK-LABEL: @test_i8x16_replace_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[TMP1:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[B:%.*]], i32 15
-// CHECK-NEXT:[[TMP2:%.*]] = bitcast <16 x i8> [[TMP1]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP2]]
+// CHECK-NEXT:[[VECINS_I:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[B:%.*]], i32 15
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <16 x i8> [[VECINS_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
 v128_t test_i8x16_replace_lane(v128_t a, int8_t b) {
   return wasm_i8x16_replace_lane(a, 15, b);
@@ -500,8 +500,8 @@
 // CHECK-LABEL: @test_i16x8_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <8 x i16>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <8 x i16> [[TMP0]], i32 7
-// CHECK-NEXT:ret i16 [[TMP1]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <8 x i16> [[TMP0]], i32 7
+// CHECK-NEXT:ret i16 [[VECEXT_I]]
 //
 int16_t test_i16x8_extract_lane(v128_t a) {
   return wasm_i16x8_extract_lane(a, 7);
@@ -510,8 +510,8 @@
 // CHECK-LABEL: @test_u16x8_extract_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <8 x i16>
-// CHECK-NEXT:[[TMP1:%.*]] = extractelement <8 x i16> [[TMP0]], i32 7
-// CHECK-NEXT:ret i16 [[TMP1]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <8 x i16> [[TMP0]], i32 7
+// CHECK-NEXT:ret i16 [[VECEXT_I]]
 //
 uint16_t test_u16x8_extract_lane(v128_t a) {
   return wasm_u16x8_extract_lane(a, 7);
@@ -520,9 +520,9 @@
 // CHECK-LABEL: @test_i16x8_replace_lane(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <8 x i16>
-// CHECK-NEXT:[[TMP1:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[B:%.*]], i32 7
-// CHECK-NEXT:[[TMP2:%.*]] = bitcast <8 x i16> [[TMP1]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP2]]
+// CHECK-NEXT:[[VECINS_I:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[B:%.*]], i32 7
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <8 x i16> [[VECINS_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
 v128_t test_i16x8_replace_lane(v128_t a, int16_t b) {
   return wasm_i16x8_replace_lane(a, 7, b);
@@ -540,8 +540,8 @@
 
 // CHECK-LABEL: @test_i32x4_extract_lane(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = extractelement <4 x i32> [[A:%.*]], i32 3
-// CHECK-NEXT:ret i32 [[TMP0]]
+// CHECK-NEXT:[[VECEXT_I:%.*]] = extractelement <4 x i32> [[A:%.*]], i32 3
+// CHECK-NEXT:ret i32 [[VECEXT_I]]
 //
 int32_t test_i32x4_extract_lane(v128_t a) {
   return wasm_i32x4_extract_lane(a, 3);
@@ -549,8 +549,8 @@
 
 // CHECK-LABEL: @test_i32x4_replace_lane(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x i32> [[A:%.*]], i32 [[B:%.*]], i32 3
-// CHECK-NEXT:ret <4 x i32> [[TMP0]]
+// CHECK-NEXT:[[VECINS_I:%.*]] = insertelement <4 x i32> [[A:%.*]], i32 [[B:%.*]], i32 3
+// CHECK-NEXT:ret <4 x i32>

[PATCH] D106506: [WebAssembly] Replace @llvm.wasm.popcnt with @llvm.ctpop.v16i8

2021-07-21 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8af333cf1a77: [WebAssembly] Replace @llvm.wasm.popcnt with 
@llvm.ctpop.v16i8 (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106506/new/

https://reviews.llvm.org/D106506

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/CodeGen/WebAssembly/simd-unsupported.ll

Index: llvm/test/CodeGen/WebAssembly/simd-unsupported.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-unsupported.ll
+++ llvm/test/CodeGen/WebAssembly/simd-unsupported.ll
@@ -10,7 +10,7 @@
 ; ==
 
 ; CHECK-LABEL: ctlz_v16i8:
-; CHECK: i32.clz
+; CHECK: i8x16.popcnt
 declare <16 x i8> @llvm.ctlz.v16i8(<16 x i8>, i1)
 define <16 x i8> @ctlz_v16i8(<16 x i8> %x) {
   %v = call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> %x, i1 false)
@@ -18,14 +18,14 @@
 }
 
 ; CHECK-LABEL: ctlz_v16i8_undef:
-; CHECK: i32.clz
+; CHECK: i8x16.popcnt
 define <16 x i8> @ctlz_v16i8_undef(<16 x i8> %x) {
   %v = call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> %x, i1 true)
   ret <16 x i8> %v
 }
 
 ; CHECK-LABEL: cttz_v16i8:
-; CHECK: i32.ctz
+; CHECK: i8x16.popcnt
 declare <16 x i8> @llvm.cttz.v16i8(<16 x i8>, i1)
 define <16 x i8> @cttz_v16i8(<16 x i8> %x) {
   %v = call <16 x i8> @llvm.cttz.v16i8(<16 x i8> %x, i1 false)
@@ -33,21 +33,12 @@
 }
 
 ; CHECK-LABEL: cttz_v16i8_undef:
-; CHECK: i32.ctz
+; CHECK: i8x16.popcnt
 define <16 x i8> @cttz_v16i8_undef(<16 x i8> %x) {
   %v = call <16 x i8> @llvm.cttz.v16i8(<16 x i8> %x, i1 true)
   ret <16 x i8> %v
 }
 
-; CHECK-LABEL: ctpop_v16i8:
-; Note: expansion does not use i32.popcnt
-; CHECK: v128.and
-declare <16 x i8> @llvm.ctpop.v16i8(<16 x i8>)
-define <16 x i8> @ctpop_v16i8(<16 x i8> %x) {
-  %v = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> %x)
-  ret <16 x i8> %v
-}
-
 ; CHECK-LABEL: sdiv_v16i8:
 ; CHECK: i32.div_s
 define <16 x i8> @sdiv_v16i8(<16 x i8> %x, <16 x i8> %y) {
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -78,9 +78,9 @@
 ; CHECK-NEXT: .functype popcnt_v16i8 (v128) -> (v128){{$}}
 ; CHECK-NEXT: i8x16.popcnt $push[[R:[0-9]+]]=, $0{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <16 x i8> @llvm.wasm.popcnt(<16 x i8>)
+declare <16 x i8> @llvm.ctpop.v16i8(<16 x i8>)
 define <16 x i8> @popcnt_v16i8(<16 x i8> %x) {
- %a = call <16 x i8> @llvm.wasm.popcnt(<16 x i8> %x)
+ %a = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> %x)
  ret <16 x i8> %a
 }
 
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -833,7 +833,7 @@
 defm NEG : SIMDUnaryInt;
 
 // Population count: popcnt
-defm POPCNT : SIMDUnary;
+defm POPCNT : SIMDUnary;
 
 // Any lane true: any_true
 defm ANYTRUE : SIMD_I<(outs I32:$dst), (ins V128:$vec), (outs), (ins), [],
Index: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
===
--- llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -212,6 +212,9 @@
   for (auto T : {MVT::v16i8, MVT::v8i16, MVT::v4i32})
 setOperationAction(Op, T, Legal);
 
+// And we have popcnt for i8x16
+setOperationAction(ISD::CTPOP, MVT::v16i8, Legal);
+
 // Expand float operations supported for scalars but not SIMD
 for (auto Op : {ISD::FCOPYSIGN, ISD::FLOG, ISD::FLOG2, ISD::FLOG10,
 ISD::FEXP, ISD::FEXP2, ISD::FRINT})
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -172,11 +172,6 @@
 [LLVMMatchType<0>, LLVMMatchType<0>],
 [IntrNoMem, IntrSpeculatable]>;
 
-// TODO: Replace this intrinsic with normal ISel patterns once popcnt is merged
-// to the proposal.
-def int_wasm_popcnt :
-  Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty], [IntrNoMem, IntrSpeculatable]>;
-
 def int_wasm_extmul_low_signed :
   Intrinsic<[llvm_anyvector_ty],
 [LLVMSubdivide2VectorType<0>, LLVMSubdivide2VectorType<0>],
Index: clang/test/Headers/wasm.c

[PATCH] D106506: [WebAssembly] Replace @llvm.wasm.popcnt with @llvm.ctpop.v16i8

2021-07-21 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: llvm/test/CodeGen/WebAssembly/simd-unsupported.ll:13
 ; CHECK-LABEL: ctlz_v16i8:
-; CHECK: i32.clz
+; CHECK: i8x16.popcnt
 declare <16 x i8> @llvm.ctlz.v16i8(<16 x i8>, i1)

aheejin wrote:
> Aren't these now... "supported"? (This file's name is simd-unsupported.ll, 
> so...)
No, I would say these are still unsupported because they don't map down to a 
single SIMD instruction. The i8x16.popcnt is still just part of a rather large 
expansion of the operation. Having i8x16.popcnt available makes the expansion 
better than before, though.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106506/new/

https://reviews.llvm.org/D106506

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D106500: [WebAssembly] Remove clang builtins for extract_lane and replace_lane

2021-07-21 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGdb7efcab7dd9: [WebAssembly] Remove clang builtins for 
extract_lane and replace_lane (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106500/new/

https://reviews.llvm.org/D106500

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c

Index: clang/test/CodeGen/builtins-wasm.c
===
--- clang/test/CodeGen/builtins-wasm.c
+++ clang/test/CodeGen/builtins-wasm.c
@@ -193,99 +193,9 @@
   // WEBASSEMBLY-NEXT: ret
 }
 
-int extract_lane_s_i8x16(i8x16 v) {
-  return __builtin_wasm_extract_lane_s_i8x16(v, 13);
-  // MISSING-SIMD: error: '__builtin_wasm_extract_lane_s_i8x16' needs target feature simd128
-  // WEBASSEMBLY: extractelement <16 x i8> %v, i32 13
-  // WEBASSEMBLY-NEXT: sext
-  // WEBASSEMBLY-NEXT: ret
-}
-
-int extract_lane_u_i8x16(u8x16 v) {
-  return __builtin_wasm_extract_lane_u_i8x16(v, 13);
-  // WEBASSEMBLY: extractelement <16 x i8> %v, i32 13
-  // WEBASSEMBLY-NEXT: zext
-  // WEBASSEMBLY-NEXT: ret
-}
-
-int extract_lane_s_i16x8(i16x8 v) {
-  return __builtin_wasm_extract_lane_s_i16x8(v, 7);
-  // WEBASSEMBLY: extractelement <8 x i16> %v, i32 7
-  // WEBASSEMBLY-NEXT: sext
-  // WEBASSEMBLY-NEXT: ret
-}
-
-int extract_lane_u_i16x8(u16x8 v) {
-  return __builtin_wasm_extract_lane_u_i16x8(v, 7);
-  // WEBASSEMBLY: extractelement <8 x i16> %v, i32 7
-  // WEBASSEMBLY-NEXT: zext
-  // WEBASSEMBLY-NEXT: ret
-}
-
-int extract_lane_i32x4(i32x4 v) {
-  return __builtin_wasm_extract_lane_i32x4(v, 3);
-  // WEBASSEMBLY: extractelement <4 x i32> %v, i32 3
-  // WEBASSEMBLY-NEXT: ret
-}
-
-long long extract_lane_i64x2(i64x2 v) {
-  return __builtin_wasm_extract_lane_i64x2(v, 1);
-  // WEBASSEMBLY: extractelement <2 x i64> %v, i32 1
-  // WEBASSEMBLY-NEXT: ret
-}
-
-float extract_lane_f32x4(f32x4 v) {
-  return __builtin_wasm_extract_lane_f32x4(v, 3);
-  // WEBASSEMBLY: extractelement <4 x float> %v, i32 3
-  // WEBASSEMBLY-NEXT: ret
-}
-
-double extract_lane_f64x2(f64x2 v) {
-  return __builtin_wasm_extract_lane_f64x2(v, 1);
-  // WEBASSEMBLY: extractelement <2 x double> %v, i32 1
-  // WEBASSEMBLY-NEXT: ret
-}
-
-i8x16 replace_lane_i8x16(i8x16 v, int x) {
-  return __builtin_wasm_replace_lane_i8x16(v, 13, x);
-  // WEBASSEMBLY: trunc i32 %x to i8
-  // WEBASSEMBLY-NEXT: insertelement <16 x i8> %v, i8 %{{.*}}, i32 13
-  // WEBASSEMBLY-NEXT: ret
-}
-
-i16x8 replace_lane_i16x8(i16x8 v, int x) {
-  return __builtin_wasm_replace_lane_i16x8(v, 7, x);
-  // WEBASSEMBLY: trunc i32 %x to i16
-  // WEBASSEMBLY-NEXT: insertelement <8 x i16> %v, i16 %{{.*}}, i32 7
-  // WEBASSEMBLY-NEXT: ret
-}
-
-i32x4 replace_lane_i32x4(i32x4 v, int x) {
-  return __builtin_wasm_replace_lane_i32x4(v, 3, x);
-  // WEBASSEMBLY: insertelement <4 x i32> %v, i32 %x, i32 3
-  // WEBASSEMBLY-NEXT: ret
-}
-
-i64x2 replace_lane_i64x2(i64x2 v, long long x) {
-  return __builtin_wasm_replace_lane_i64x2(v, 1, x);
-  // WEBASSEMBLY: insertelement <2 x i64> %v, i64 %x, i32 1
-  // WEBASSEMBLY-NEXT: ret
-}
-
-f32x4 replace_lane_f32x4(f32x4 v, float x) {
-  return __builtin_wasm_replace_lane_f32x4(v, 3, x);
-  // WEBASSEMBLY: insertelement <4 x float> %v, float %x, i32 3
-  // WEBASSEMBLY-NEXT: ret
-}
-
-f64x2 replace_lane_f64x2(f64x2 v, double x) {
-  return __builtin_wasm_replace_lane_f64x2(v, 1, x);
-  // WEBASSEMBLY: insertelement <2 x double> %v, double %x, i32 1
-  // WEBASSEMBLY-NEXT: ret
-}
-
 i8x16 add_sat_s_i8x16(i8x16 x, i8x16 y) {
   return __builtin_wasm_add_sat_s_i8x16(x, y);
+  // MISSING-SIMD: error: '__builtin_wasm_add_sat_s_i8x16' needs target feature simd128
   // WEBASSEMBLY: call <16 x i8> @llvm.sadd.sat.v16i8(
   // WEBASSEMBLY-SAME: <16 x i8> %x, <16 x i8> %y)
   // WEBASSEMBLY-NEXT: ret
Index: clang/lib/Headers/wasm_simd128.h
===
--- clang/lib/Headers/wasm_simd128.h
+++ clang/lib/Headers/wasm_simd128.h
@@ -396,67 +396,126 @@
__a, __a, __a, __a, __a, __a, __a, __a};
 }
 
-#define wasm_i8x16_extract_lane(__a, __i)  \
-  (__builtin_wasm_extract_lane_s_i8x16((__i8x16)(__a), __i))
+static __inline__ int8_t __DEFAULT_FN_ATTRS wasm_i8x16_extract_lane(v128_t __a,
+int __i)
+__REQUIRE_CONSTANT(__i) {
+  return ((__i8x16)__a)[__i];
+}
 
-#define wasm_u8x16_extract_lane(__a, __i)  \
-  (__builtin_wasm_extract_lane_u_i8x16((__u8x16)(__a), __i))
+static __inline__ uint8_t __DEFAULT_FN_ATTRS wasm_u8x16_extract_lane(v128_t __a,
+ int __i)
+

[PATCH] D106506: [WebAssembly] Replace @llvm.wasm.popcnt with @llvm.ctpop.v16i8

2021-07-21 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Use the standard target-independent intrinsic to take advantage of standard
optimizations.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106506

Files:
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
  llvm/test/CodeGen/WebAssembly/simd-unsupported.ll

Index: llvm/test/CodeGen/WebAssembly/simd-unsupported.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-unsupported.ll
+++ llvm/test/CodeGen/WebAssembly/simd-unsupported.ll
@@ -10,7 +10,7 @@
 ; ==
 
 ; CHECK-LABEL: ctlz_v16i8:
-; CHECK: i32.clz
+; CHECK: i8x16.popcnt
 declare <16 x i8> @llvm.ctlz.v16i8(<16 x i8>, i1)
 define <16 x i8> @ctlz_v16i8(<16 x i8> %x) {
   %v = call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> %x, i1 false)
@@ -18,14 +18,14 @@
 }
 
 ; CHECK-LABEL: ctlz_v16i8_undef:
-; CHECK: i32.clz
+; CHECK: i8x16.popcnt
 define <16 x i8> @ctlz_v16i8_undef(<16 x i8> %x) {
   %v = call <16 x i8> @llvm.ctlz.v16i8(<16 x i8> %x, i1 true)
   ret <16 x i8> %v
 }
 
 ; CHECK-LABEL: cttz_v16i8:
-; CHECK: i32.ctz
+; CHECK: i8x16.popcnt
 declare <16 x i8> @llvm.cttz.v16i8(<16 x i8>, i1)
 define <16 x i8> @cttz_v16i8(<16 x i8> %x) {
   %v = call <16 x i8> @llvm.cttz.v16i8(<16 x i8> %x, i1 false)
@@ -33,21 +33,12 @@
 }
 
 ; CHECK-LABEL: cttz_v16i8_undef:
-; CHECK: i32.ctz
+; CHECK: i8x16.popcnt
 define <16 x i8> @cttz_v16i8_undef(<16 x i8> %x) {
   %v = call <16 x i8> @llvm.cttz.v16i8(<16 x i8> %x, i1 true)
   ret <16 x i8> %v
 }
 
-; CHECK-LABEL: ctpop_v16i8:
-; Note: expansion does not use i32.popcnt
-; CHECK: v128.and
-declare <16 x i8> @llvm.ctpop.v16i8(<16 x i8>)
-define <16 x i8> @ctpop_v16i8(<16 x i8> %x) {
-  %v = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> %x)
-  ret <16 x i8> %v
-}
-
 ; CHECK-LABEL: sdiv_v16i8:
 ; CHECK: i32.div_s
 define <16 x i8> @sdiv_v16i8(<16 x i8> %x, <16 x i8> %y) {
Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -78,9 +78,9 @@
 ; CHECK-NEXT: .functype popcnt_v16i8 (v128) -> (v128){{$}}
 ; CHECK-NEXT: i8x16.popcnt $push[[R:[0-9]+]]=, $0{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <16 x i8> @llvm.wasm.popcnt(<16 x i8>)
+declare <16 x i8> @llvm.ctpop.v16i8(<16 x i8>)
 define <16 x i8> @popcnt_v16i8(<16 x i8> %x) {
- %a = call <16 x i8> @llvm.wasm.popcnt(<16 x i8> %x)
+ %a = call <16 x i8> @llvm.ctpop.v16i8(<16 x i8> %x)
  ret <16 x i8> %a
 }
 
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -833,7 +833,7 @@
 defm NEG : SIMDUnaryInt;
 
 // Population count: popcnt
-defm POPCNT : SIMDUnary;
+defm POPCNT : SIMDUnary;
 
 // Any lane true: any_true
 defm ANYTRUE : SIMD_I<(outs I32:$dst), (ins V128:$vec), (outs), (ins), [],
Index: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
===
--- llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -212,6 +212,9 @@
   for (auto T : {MVT::v16i8, MVT::v8i16, MVT::v4i32})
 setOperationAction(Op, T, Legal);
 
+// And we have popcnt for i8x16
+setOperationAction(ISD::CTPOP, MVT::v16i8, Legal);
+
 // Expand float operations supported for scalars but not SIMD
 for (auto Op : {ISD::FCOPYSIGN, ISD::FLOG, ISD::FLOG2, ISD::FLOG10,
 ISD::FEXP, ISD::FEXP2, ISD::FRINT})
Index: llvm/include/llvm/IR/IntrinsicsWebAssembly.td
===
--- llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -172,11 +172,6 @@
 [LLVMMatchType<0>, LLVMMatchType<0>],
 [IntrNoMem, IntrSpeculatable]>;
 
-// TODO: Replace this intrinsic with normal ISel patterns once popcnt is merged
-// to the proposal.
-def int_wasm_popcnt :
-  Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty], [IntrNoMem, IntrSpeculatable]>;
-
 def int_wasm_extmul_low_signed :
   Intrinsic<[llvm_anyvector_ty],
 [LLVMSubdivide2VectorType<0>, LLVMSubdivide2VectorType<0>],
Index:

[PATCH] D106500: [WebAssembly] Remove clang builtins for extract_lane and replace_lane

2021-07-21 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

These builtins were added to capture the fact that the underlying Wasm
instructions return i32s and implicitly sign or zero extend the extracted lanes
in the case of the i8x16 and i16x8 variants. But we do sufficient optimizations
during code gen that these low-level details do not need to be exposed to users.

This commit replaces the use of the builtins in wasm_simd128.h with normal
target-independent vector code. As a result, we can switch the relevant
intrinsics to use functions rather than macros and can use more user-friendly
return types rather than trying to precisely expose the underlying Wasm types.
Note, however, that the generated LLVM IR is no different after this change.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106500

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c

Index: clang/test/CodeGen/builtins-wasm.c
===
--- clang/test/CodeGen/builtins-wasm.c
+++ clang/test/CodeGen/builtins-wasm.c
@@ -193,99 +193,9 @@
   // WEBASSEMBLY-NEXT: ret
 }
 
-int extract_lane_s_i8x16(i8x16 v) {
-  return __builtin_wasm_extract_lane_s_i8x16(v, 13);
-  // MISSING-SIMD: error: '__builtin_wasm_extract_lane_s_i8x16' needs target feature simd128
-  // WEBASSEMBLY: extractelement <16 x i8> %v, i32 13
-  // WEBASSEMBLY-NEXT: sext
-  // WEBASSEMBLY-NEXT: ret
-}
-
-int extract_lane_u_i8x16(u8x16 v) {
-  return __builtin_wasm_extract_lane_u_i8x16(v, 13);
-  // WEBASSEMBLY: extractelement <16 x i8> %v, i32 13
-  // WEBASSEMBLY-NEXT: zext
-  // WEBASSEMBLY-NEXT: ret
-}
-
-int extract_lane_s_i16x8(i16x8 v) {
-  return __builtin_wasm_extract_lane_s_i16x8(v, 7);
-  // WEBASSEMBLY: extractelement <8 x i16> %v, i32 7
-  // WEBASSEMBLY-NEXT: sext
-  // WEBASSEMBLY-NEXT: ret
-}
-
-int extract_lane_u_i16x8(u16x8 v) {
-  return __builtin_wasm_extract_lane_u_i16x8(v, 7);
-  // WEBASSEMBLY: extractelement <8 x i16> %v, i32 7
-  // WEBASSEMBLY-NEXT: zext
-  // WEBASSEMBLY-NEXT: ret
-}
-
-int extract_lane_i32x4(i32x4 v) {
-  return __builtin_wasm_extract_lane_i32x4(v, 3);
-  // WEBASSEMBLY: extractelement <4 x i32> %v, i32 3
-  // WEBASSEMBLY-NEXT: ret
-}
-
-long long extract_lane_i64x2(i64x2 v) {
-  return __builtin_wasm_extract_lane_i64x2(v, 1);
-  // WEBASSEMBLY: extractelement <2 x i64> %v, i32 1
-  // WEBASSEMBLY-NEXT: ret
-}
-
-float extract_lane_f32x4(f32x4 v) {
-  return __builtin_wasm_extract_lane_f32x4(v, 3);
-  // WEBASSEMBLY: extractelement <4 x float> %v, i32 3
-  // WEBASSEMBLY-NEXT: ret
-}
-
-double extract_lane_f64x2(f64x2 v) {
-  return __builtin_wasm_extract_lane_f64x2(v, 1);
-  // WEBASSEMBLY: extractelement <2 x double> %v, i32 1
-  // WEBASSEMBLY-NEXT: ret
-}
-
-i8x16 replace_lane_i8x16(i8x16 v, int x) {
-  return __builtin_wasm_replace_lane_i8x16(v, 13, x);
-  // WEBASSEMBLY: trunc i32 %x to i8
-  // WEBASSEMBLY-NEXT: insertelement <16 x i8> %v, i8 %{{.*}}, i32 13
-  // WEBASSEMBLY-NEXT: ret
-}
-
-i16x8 replace_lane_i16x8(i16x8 v, int x) {
-  return __builtin_wasm_replace_lane_i16x8(v, 7, x);
-  // WEBASSEMBLY: trunc i32 %x to i16
-  // WEBASSEMBLY-NEXT: insertelement <8 x i16> %v, i16 %{{.*}}, i32 7
-  // WEBASSEMBLY-NEXT: ret
-}
-
-i32x4 replace_lane_i32x4(i32x4 v, int x) {
-  return __builtin_wasm_replace_lane_i32x4(v, 3, x);
-  // WEBASSEMBLY: insertelement <4 x i32> %v, i32 %x, i32 3
-  // WEBASSEMBLY-NEXT: ret
-}
-
-i64x2 replace_lane_i64x2(i64x2 v, long long x) {
-  return __builtin_wasm_replace_lane_i64x2(v, 1, x);
-  // WEBASSEMBLY: insertelement <2 x i64> %v, i64 %x, i32 1
-  // WEBASSEMBLY-NEXT: ret
-}
-
-f32x4 replace_lane_f32x4(f32x4 v, float x) {
-  return __builtin_wasm_replace_lane_f32x4(v, 3, x);
-  // WEBASSEMBLY: insertelement <4 x float> %v, float %x, i32 3
-  // WEBASSEMBLY-NEXT: ret
-}
-
-f64x2 replace_lane_f64x2(f64x2 v, double x) {
-  return __builtin_wasm_replace_lane_f64x2(v, 1, x);
-  // WEBASSEMBLY: insertelement <2 x double> %v, double %x, i32 1
-  // WEBASSEMBLY-NEXT: ret
-}
-
 i8x16 add_sat_s_i8x16(i8x16 x, i8x16 y) {
   return __builtin_wasm_add_sat_s_i8x16(x, y);
+  // MISSING-SIMD: error: '__builtin_wasm_add_sat_s_i8x16' needs target feature simd128
   // WEBASSEMBLY: call <16 x i8> @llvm.sadd.sat.v16i8(
   // WEBASSEMBLY-SAME: <16 x i8> %x, <16 x i8> %y)
   // WEBASSEMBLY-NEXT: ret
Index: clang/lib/Headers/wasm_simd128.h
===
--- clang/lib/Headers/wasm_simd128.h
+++ clang/lib/Headers/wasm_simd128.h
@@ -396,67 +396,126 @@
__a, __a, __a, __a, __a, __a, __a, __a};
 }
 
-#define wasm_i8x16_extract_lane(__a, __i)

[PATCH] D106400: [WebAssembly] Codegen for v128.load{32,64}_zero

2021-07-21 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1a57ee1276ed: [WebAssembly] Codegen for 
v128.load{32,64}_zero (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106400/new/

https://reviews.llvm.org/D106400

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
  llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll

Index: llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
+++ llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
@@ -5,9 +5,6 @@
 
 target triple = "wasm32-unknown-unknown"
 
-declare <4 x i32> @llvm.wasm.load32.zero(i32*)
-declare <2 x i64> @llvm.wasm.load64.zero(i64*)
-
 ;===
 ; v128.load32_zero
 ;===
@@ -17,9 +14,10 @@
 ; CHECK: .functype load_zero_i32_no_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
-  %v = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %p)
+  %x = load i32, i32* %p
+  %v = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %v
 }
 
@@ -28,12 +26,13 @@
 ; CHECK: .functype load_zero_i32_with_folded_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 24:p2align=0
+; CHECK-NEXT:v128.load32_zero 24
 ; CHECK-NEXT:# fallthrough-return
   %q = ptrtoint i32* %p to i32
   %r = add nuw i32 %q, 24
   %s = inttoptr i32 %r to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -42,10 +41,11 @@
 ; CHECK: .functype load_zero_i32_with_folded_gep_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 24:p2align=0
+; CHECK-NEXT:v128.load32_zero 24
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr inbounds i32, i32* %p, i32 6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -56,10 +56,11 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const -24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr inbounds i32, i32* %p, i32 -6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -70,12 +71,13 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const 24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %q = ptrtoint i32* %p to i32
   %r = add nsw i32 %q, 24
   %s = inttoptr i32 %r to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -86,10 +88,11 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const 24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr i32, i32* %p, i32 6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -98,10 +101,11 @@
 ; CHECK: .functype load_zero_i32_from_numeric_address () -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:i32.const 0
-; CHECK-NEXT:v128.load32_zero 42:p2align=0
+; CHECK-NEXT:v128.load32_zero 42
 ; CHECK-NEXT:# fallthrough-return
   %s = inttoptr i32 42 to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -111,9 +115,10 @@
 ; CHECK: .functype load_zero_i32_from_global_address () -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:i32.const 0
-; CHECK-NEXT:v128.load32_zero gv_i32:p2align=0
+; CHECK-NEXT:v128.load32_zero gv_i32
 ;

[PATCH] D106400: [WebAssembly] Codegen for v128.load{32,64}_zero

2021-07-21 Thread Thomas Lively via Phabricator via cfe-commits

tlively updated this revision to Diff 360485.
tlively added a comment.

- Add aligment tests


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106400/new/

https://reviews.llvm.org/D106400

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
  llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll

Index: llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
+++ llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
@@ -5,9 +5,6 @@
 
 target triple = "wasm32-unknown-unknown"
 
-declare <4 x i32> @llvm.wasm.load32.zero(i32*)
-declare <2 x i64> @llvm.wasm.load64.zero(i64*)
-
 ;===
 ; v128.load32_zero
 ;===
@@ -17,9 +14,10 @@
 ; CHECK: .functype load_zero_i32_no_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
-  %v = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %p)
+  %x = load i32, i32* %p
+  %v = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %v
 }
 
@@ -28,12 +26,13 @@
 ; CHECK: .functype load_zero_i32_with_folded_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 24:p2align=0
+; CHECK-NEXT:v128.load32_zero 24
 ; CHECK-NEXT:# fallthrough-return
   %q = ptrtoint i32* %p to i32
   %r = add nuw i32 %q, 24
   %s = inttoptr i32 %r to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -42,10 +41,11 @@
 ; CHECK: .functype load_zero_i32_with_folded_gep_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 24:p2align=0
+; CHECK-NEXT:v128.load32_zero 24
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr inbounds i32, i32* %p, i32 6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -56,10 +56,11 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const -24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr inbounds i32, i32* %p, i32 -6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -70,12 +71,13 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const 24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %q = ptrtoint i32* %p to i32
   %r = add nsw i32 %q, 24
   %s = inttoptr i32 %r to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -86,10 +88,11 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const 24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr i32, i32* %p, i32 6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -98,10 +101,11 @@
 ; CHECK: .functype load_zero_i32_from_numeric_address () -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:i32.const 0
-; CHECK-NEXT:v128.load32_zero 42:p2align=0
+; CHECK-NEXT:v128.load32_zero 42
 ; CHECK-NEXT:# fallthrough-return
   %s = inttoptr i32 42 to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -111,9 +115,10 @@
 ; CHECK: .functype load_zero_i32_from_global_address () -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:i32.const 0
-; CHECK-NEXT:v128.load32_zero gv_i32:p2align=0
+; CHECK-NEXT:v128.load32_zero gv_i32
 ; CHECK-NEXT:# fallthrough-return
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* @gv_i32)
+  %x = load i32, i32* @gv_i32
+  %t =

[PATCH] D106400: [WebAssembly] Codegen for v128.load{32,64}_zero

2021-07-20 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal instruction selection patterns. The wasm_simd128.h
intrinsics header was already using portable code for the corresponding
intrinsics, so now it produces the correct instructions.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106400

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/builtins-wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll

Index: llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
+++ llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
@@ -5,9 +5,6 @@
 
 target triple = "wasm32-unknown-unknown"
 
-declare <4 x i32> @llvm.wasm.load32.zero(i32*)
-declare <2 x i64> @llvm.wasm.load64.zero(i64*)
-
 ;===
 ; v128.load32_zero
 ;===
@@ -17,9 +14,10 @@
 ; CHECK: .functype load_zero_i32_no_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
-  %v = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %p)
+  %x = load i32, i32* %p
+  %v = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %v
 }
 
@@ -28,12 +26,13 @@
 ; CHECK: .functype load_zero_i32_with_folded_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 24:p2align=0
+; CHECK-NEXT:v128.load32_zero 24
 ; CHECK-NEXT:# fallthrough-return
   %q = ptrtoint i32* %p to i32
   %r = add nuw i32 %q, 24
   %s = inttoptr i32 %r to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -42,10 +41,11 @@
 ; CHECK: .functype load_zero_i32_with_folded_gep_offset (i32) -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:local.get 0
-; CHECK-NEXT:v128.load32_zero 24:p2align=0
+; CHECK-NEXT:v128.load32_zero 24
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr inbounds i32, i32* %p, i32 6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -56,10 +56,11 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const -24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr inbounds i32, i32* %p, i32 -6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -70,12 +71,13 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const 24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %q = ptrtoint i32* %p to i32
   %r = add nsw i32 %q, 24
   %s = inttoptr i32 %r to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -86,10 +88,11 @@
 ; CHECK-NEXT:local.get 0
 ; CHECK-NEXT:i32.const 24
 ; CHECK-NEXT:i32.add
-; CHECK-NEXT:v128.load32_zero 0:p2align=0
+; CHECK-NEXT:v128.load32_zero 0
 ; CHECK-NEXT:# fallthrough-return
   %s = getelementptr i32, i32* %p, i32 6
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -98,10 +101,11 @@
 ; CHECK: .functype load_zero_i32_from_numeric_address () -> (v128)
 ; CHECK-NEXT:  # %bb.0:
 ; CHECK-NEXT:i32.const 0
-; CHECK-NEXT:v128.load32_zero 42:p2align=0
+; CHECK-NEXT:v128.load32_zero 42
 ; CHECK-NEXT:# fallthrough-return
   %s = inttoptr i32 42 to i32*
-  %t = tail call <4 x i32> @llvm.wasm.load32.zero(i32* %s)
+  %x = load i32, i32* %s
+  %t = insertelement <4 x i32> zeroinitializer, i32 %x, i32 0
   ret <4 x i32> %t
 }
 
@@ -111,9 +115,10 @@
 ; CHECK:

[PATCH] D104797: [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR

2021-07-19 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

In D104797#2879855 , @pmatos wrote:

> @tlively once D105423  lands, is it enough 
> to test and reland it under this revision or shall i open a new one?

You can just reland it under this revision.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104797/new/

https://reviews.llvm.org/D104797

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D106019: [WebAssembly] Codegen for v128.storeX_lane instructions

2021-07-14 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG4a4229f70f81: [WebAssembly] Codegen for v128.storeX_lane 
instructions (authored by tlively).

Changed prior to commit:
  https://reviews.llvm.org/D106019?vs=358747=358774#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106019/new/

https://reviews.llvm.org/D106019

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-build-pair.ll
  llvm/test/CodeGen/WebAssembly/simd-load-lane-offset.ll
  llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll

Index: llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
+++ llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
@@ -161,6 +161,34 @@
   ret <16 x i8> %v1
 }
 
+; 1 is the default alignment for v128.store8_lane so no attribute is needed.
+define void @store_lane_i8_a1(<16 x i8> %v, i8* %p) {
+; CHECK-LABEL: store_lane_i8_a1:
+; CHECK: .functype store_lane_i8_a1 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <16 x i8> %v, i32 0
+  store i8 %x, i8* %p, align 1
+  ret void
+}
+
+; 2 is greater than the default alignment so it is ignored.
+define void @store_lane_i8_a2(<16 x i8> %v, i8* %p) {
+; CHECK-LABEL: store_lane_i8_a2:
+; CHECK: .functype store_lane_i8_a2 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <16 x i8> %v, i32 0
+  store i8 %x, i8* %p, align 2
+  ret void
+}
+
 ; ==
 ; 8 x i16
 ; ==
@@ -462,6 +490,47 @@
   ret <8 x i16> %v1
 }
 
+define void @store_lane_i16_a1(<8 x i16> %v, i16* %p) {
+; CHECK-LABEL: store_lane_i16_a1:
+; CHECK: .functype store_lane_i16_a1 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store16_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <8 x i16> %v, i32 0
+  store i16 %x, i16* %p, align 1
+  ret void
+}
+
+; 2 is the default alignment for v128.store16_lane so no attribute is needed.
+define void @store_lane_i16_a2(<8 x i16> %v, i16* %p) {
+; CHECK-LABEL: store_lane_i16_a2:
+; CHECK: .functype store_lane_i16_a2 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <8 x i16> %v, i32 0
+  store i16 %x, i16* %p, align 2
+  ret void
+}
+
+; 4 is greater than the default alignment so it is ignored.
+define void @store_lane_i16_a4(<8 x i16> %v, i16* %p) {
+; CHECK-LABEL: store_lane_i16_a4:
+; CHECK: .functype store_lane_i16_a4 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <8 x i16> %v, i32 0
+  store i16 %x, i16* %p, align 4
+  ret void
+}
+
 ; ==
 ; 4 x i32
 ; ==
@@ -789,6 +858,60 @@
   ret <4 x i32> %v1
 }
 
+define void @store_lane_i32_a1(<4 x i32> %v, i32* %p) {
+; CHECK-LABEL: store_lane_i32_a1:
+; CHECK: .functype store_lane_i32_a1 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store32_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <4 x i32> %v, i32 0
+  store i32 %x, i32* %p, align 1
+  ret void
+}
+
+define void @store_lane_i32_a2(<4 x i32> %v, i32* %p) {
+; CHECK-LABEL: store_lane_i32_a2:
+; CHECK: .functype store_lane_i32_a2 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store32_lane 0:p2align=1, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <4 x i32> %v, i32 0
+  store i32 %x, i32* %p, align 2
+  ret void
+}
+
+; 4 is the default alignment for v128.store32_lane so no

[PATCH] D106019: [WebAssembly] Codegen for v128.storeX_lane instructions

2021-07-14 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50435.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106019

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-build-pair.ll
  llvm/test/CodeGen/WebAssembly/simd-load-lane-offset.ll
  llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll

Index: llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
+++ llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
@@ -162,6 +162,34 @@
   ret <16 x i8> %v1
 }
 
+; 1 is the default alignment for v128.store8_lane so no attribute is needed.
+define void @store_lane_i8_a1(<16 x i8> %v, i8* %p) {
+; CHECK-LABEL: store_lane_i8_a1:
+; CHECK: .functype store_lane_i8_a1 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <16 x i8> %v, i32 0
+  store i8 %x, i8* %p, align 1
+  ret void
+}
+
+; 2 is greater than the default alignment so it is ignored.
+define void @store_lane_i8_a2(<16 x i8> %v, i8* %p) {
+; CHECK-LABEL: store_lane_i8_a2:
+; CHECK: .functype store_lane_i8_a2 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <16 x i8> %v, i32 0
+  store i8 %x, i8* %p, align 2
+  ret void
+}
+
 ; ==
 ; 8 x i16
 ; ==
@@ -463,6 +491,47 @@
   ret <8 x i16> %v1
 }
 
+define void @store_lane_i16_a1(<8 x i16> %v, i16* %p) {
+; CHECK-LABEL: store_lane_i16_a1:
+; CHECK: .functype store_lane_i16_a1 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store16_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <8 x i16> %v, i32 0
+  store i16 %x, i16* %p, align 1
+  ret void
+}
+
+; 2 is the default alignment for v128.store16_lane so no attribute is needed.
+define void @store_lane_i16_a2(<8 x i16> %v, i16* %p) {
+; CHECK-LABEL: store_lane_i16_a2:
+; CHECK: .functype store_lane_i16_a2 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <8 x i16> %v, i32 0
+  store i16 %x, i16* %p, align 2
+  ret void
+}
+
+; 4 is greater than the default alignment so it is ignored.
+define void @store_lane_i16_a4(<8 x i16> %v, i16* %p) {
+; CHECK-LABEL: store_lane_i16_a4:
+; CHECK: .functype store_lane_i16_a4 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <8 x i16> %v, i32 0
+  store i16 %x, i16* %p, align 4
+  ret void
+}
+
 ; ==
 ; 4 x i32
 ; ==
@@ -790,6 +859,60 @@
   ret <4 x i32> %v1
 }
 
+define void @store_lane_i32_a1(<4 x i32> %v, i32* %p) {
+; CHECK-LABEL: store_lane_i32_a1:
+; CHECK: .functype store_lane_i32_a1 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store32_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <4 x i32> %v, i32 0
+  store i32 %x, i32* %p, align 1
+  ret void
+}
+
+define void @store_lane_i32_a2(<4 x i32> %v, i32* %p) {
+; CHECK-LABEL: store_lane_i32_a2:
+; CHECK: .functype store_lane_i32_a2 (v128, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:v128.store32_lane 0:p2align=1, 0
+; CHECK-NEXT:# fallthrough-return
+  %x = extractelement <4 x i32> %v, i32 0
+  store i32 %x, i32* %p, align 2
+  ret void
+}
+
+; 4 is the default

[PATCH] D105950: [WebAssembly] Codegen for v128.loadX_lane instructions

2021-07-14 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG970e0900104d: [WebAssembly] Codegen for v128.loadX_lane 
instructions (authored by tlively).

Changed prior to commit:
  https://reviews.llvm.org/D105950?vs=358668=358676#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105950/new/

https://reviews.llvm.org/D105950

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-build-vector.ll
  llvm/test/CodeGen/WebAssembly/simd-load-lane-offset.ll
  llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll

Index: llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
+++ llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
@@ -133,6 +133,34 @@
   ret <16 x i8> %v2
 }
 
+; 1 is the default alignment for v128.load8_lane so no attribute is needed.
+define <16 x i8> @load_lane_i8_a1(i8* %p, <16 x i8> %v) {
+; CHECK-LABEL: load_lane_i8_a1:
+; CHECK: .functype load_lane_i8_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i8, i8* %p, align 1
+  %v1 = insertelement <16 x i8> %v, i8 %e, i32 0
+  ret <16 x i8> %v1
+}
+
+; 2 is greater than the default alignment so it is ignored.
+define <16 x i8> @load_lane_i8_a2(i8* %p, <16 x i8> %v) {
+; CHECK-LABEL: load_lane_i8_a2:
+; CHECK: .functype load_lane_i8_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i8, i8* %p, align 2
+  %v1 = insertelement <16 x i8> %v, i8 %e, i32 0
+  ret <16 x i8> %v1
+}
+
 ; ==
 ; 8 x i16
 ; ==
@@ -393,6 +421,47 @@
   ret <8 x i16> %v2
 }
 
+define <8 x i16> @load_lane_i16_a1(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a1:
+; CHECK: .functype load_lane_i16_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 1
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
+; 2 is the default alignment for v128.load16_lane so no attribute is needed.
+define <8 x i16> @load_lane_i16_a2(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a2:
+; CHECK: .functype load_lane_i16_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 2
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
+; 4 is greater than the default alignment so it is ignored.
+define <8 x i16> @load_lane_i16_a4(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a4:
+; CHECK: .functype load_lane_i16_a4 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 4
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
 ; ==
 ; 4 x i32
 ; ==
@@ -666,6 +735,60 @@
   ret <4 x i32> %v2
 }
 
+define <4 x i32> @load_lane_i32_a1(i32* %p, <4 x i32> %v) {
+; CHECK-LABEL: load_lane_i32_a1:
+; CHECK: .functype load_lane_i32_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load32_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i32, i32* %p, align 1
+  %v1 = insertelement <4 x i32> %v, i32 %e, i32 0
+  ret <4 x i32> %v1
+}
+
+define <4 x i32> @load_lane_i32_a2(i32* %p, <4 x i32> %v) {
+; CHECK-LABEL: load_lane_i32_a2:
+; CHECK: .functype load_lane_i32_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load32_lane 0:p2align=1, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load

[PATCH] D105950: [WebAssembly] Codegen for v128.loadX_lane instructions

2021-07-14 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td:324
+  PatFrag<(ops node:$ptr, node:$vec, node:$idx),
+  (vector_insert $vec, (i32 (extloadi8 $ptr)), $idx)>;
+def load16_lane :

aheejin wrote:
> Why are i8 and i16 are extended-loaded?
For i8x16 and i16x8 vectors, loading a lane  from memory means loading just the 
i8 or i16. But after selection DAG legalization, the result of those loads are 
legalized to be i32, making these extending loads. If this were a DAG combine 
rather than an ISel pattern, I would use the pre-legalization i8 and i16 with 
non-extending loads.



Comment at: llvm/test/CodeGen/WebAssembly/simd-build-vector.ll:214
 ; CHECK:   v128.const  $push[[L0:[0-9]+]]=, 0, 0, 0, 0, 42, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 42, 0
-; CHECK:   i8x16.replace_lane
+; CHECK:   v128.load8_lane
 ; CHECK:   i8x16.replace_lane

aheejin wrote:
> Why the change?
The lane for the swizzle comes from a load from the stack, so that now gets 
selected to v128.load8_lane rather than a load followed by a replace_lane.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105950/new/

https://reviews.llvm.org/D105950

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D105950: [WebAssembly] Codegen for v128.loadX_lane instructions

2021-07-14 Thread Thomas Lively via Phabricator via cfe-commits

tlively updated this revision to Diff 358668.
tlively marked 2 inline comments as done.
tlively added a comment.

- Address comments


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105950/new/

https://reviews.llvm.org/D105950

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-build-vector.ll
  llvm/test/CodeGen/WebAssembly/simd-load-lane-offset.ll
  llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll

Index: llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
+++ llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
@@ -134,6 +134,34 @@
   ret <16 x i8> %v2
 }
 
+; 1 is the default alignment for v128.load8_lane so no attribute is needed.
+define <16 x i8> @load_lane_i8_a1(i8* %p, <16 x i8> %v) {
+; CHECK-LABEL: load_lane_i8_a1:
+; CHECK: .functype load_lane_i8_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i8, i8* %p, align 1
+  %v1 = insertelement <16 x i8> %v, i8 %e, i32 0
+  ret <16 x i8> %v1
+}
+
+; 2 is greater than the default alignment so it is ignored.
+define <16 x i8> @load_lane_i8_a2(i8* %p, <16 x i8> %v) {
+; CHECK-LABEL: load_lane_i8_a2:
+; CHECK: .functype load_lane_i8_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i8, i8* %p, align 2
+  %v1 = insertelement <16 x i8> %v, i8 %e, i32 0
+  ret <16 x i8> %v1
+}
+
 ; ==
 ; 8 x i16
 ; ==
@@ -394,6 +422,47 @@
   ret <8 x i16> %v2
 }
 
+define <8 x i16> @load_lane_i16_a1(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a1:
+; CHECK: .functype load_lane_i16_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 1
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
+; 2 is the default alignment for v128.load16_lane so no attribute is needed.
+define <8 x i16> @load_lane_i16_a2(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a2:
+; CHECK: .functype load_lane_i16_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 2
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
+; 4 is greater than the default alignment so it is ignored.
+define <8 x i16> @load_lane_i16_a4(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a4:
+; CHECK: .functype load_lane_i16_a4 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 4
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
 ; ==
 ; 4 x i32
 ; ==
@@ -667,6 +736,60 @@
   ret <4 x i32> %v2
 }
 
+define <4 x i32> @load_lane_i32_a1(i32* %p, <4 x i32> %v) {
+; CHECK-LABEL: load_lane_i32_a1:
+; CHECK: .functype load_lane_i32_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load32_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i32, i32* %p, align 1
+  %v1 = insertelement <4 x i32> %v, i32 %e, i32 0
+  ret <4 x i32> %v1
+}
+
+define <4 x i32> @load_lane_i32_a2(i32* %p, <4 x i32> %v) {
+; CHECK-LABEL: load_lane_i32_a2:
+; CHECK: .functype load_lane_i32_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load32_lane 0:p2align=1, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i32, i32* %p, align 2
+  %v1 = insertelement <4 x i32> %v, i32 %e, i32 0
+  ret <4 x i32> %v1
+}
+
+; 4 is the default alignment for v128.load32_lane so no attribute is needed.
+define <4 x

[PATCH] D105950: [WebAssembly] Codegen for v128.loadX_lane instructions

2021-07-13 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Replace the experimental clang builtin and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50433.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D105950

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-build-vector.ll
  llvm/test/CodeGen/WebAssembly/simd-load-lane-offset.ll
  llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll

Index: llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
+++ llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
@@ -134,6 +134,34 @@
   ret <16 x i8> %v2
 }
 
+; 1 is the default alignment for v128.load8_lane so no attribute is needed.
+define <16 x i8> @load_lane_i8_a1(i8* %p, <16 x i8> %v) {
+; CHECK-LABEL: load_lane_i8_a1:
+; CHECK: .functype load_lane_i8_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i8, i8* %p, align 1
+  %v1 = insertelement <16 x i8> %v, i8 %e, i32 0
+  ret <16 x i8> %v1
+}
+
+; 2 is greater than the default alignment so it is ignored.
+define <16 x i8> @load_lane_i8_a2(i8* %p, <16 x i8> %v) {
+; CHECK-LABEL: load_lane_i8_a2:
+; CHECK: .functype load_lane_i8_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load8_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i8, i8* %p, align 2
+  %v1 = insertelement <16 x i8> %v, i8 %e, i32 0
+  ret <16 x i8> %v1
+}
+
 ; ==
 ; 8 x i16
 ; ==
@@ -394,6 +422,47 @@
   ret <8 x i16> %v2
 }
 
+define <8 x i16> @load_lane_i16_a1(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a1:
+; CHECK: .functype load_lane_i16_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 1
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
+; 2 is the default alignment for v128.load16_lane so no attribute is needed.
+define <8 x i16> @load_lane_i16_a2(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a2:
+; CHECK: .functype load_lane_i16_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 2
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
+; 4 is greater than the default alignment so it is ignored.
+define <8 x i16> @load_lane_i16_a4(i16* %p, <8 x i16> %v) {
+; CHECK-LABEL: load_lane_i16_a4:
+; CHECK: .functype load_lane_i16_a4 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load16_lane 0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i16, i16* %p, align 4
+  %v1 = insertelement <8 x i16> %v, i16 %e, i32 0
+  ret <8 x i16> %v1
+}
+
 ; ==
 ; 4 x i32
 ; ==
@@ -667,6 +736,60 @@
   ret <4 x i32> %v2
 }
 
+define <4 x i32> @load_lane_i32_a1(i32* %p, <4 x i32> %v) {
+; CHECK-LABEL: load_lane_i32_a1:
+; CHECK: .functype load_lane_i32_a1 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load32_lane 0:p2align=0, 0
+; CHECK-NEXT:# fallthrough-return
+  %e = load i32, i32* %p, align 1
+  %v1 = insertelement <4 x i32> %v, i32 %e, i32 0
+  ret <4 x i32> %v1
+}
+
+define <4 x i32> @load_lane_i32_a2(i32* %p, <4 x i32> %v) {
+; CHECK-LABEL: load_lane_i32_a2:
+; CHECK: .functype load_lane_i32_a2 (i32, v128) -> (v128)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:local.get 0
+; CHECK-NEXT:local.get 1
+; CHECK-NEXT:v128.load32_lane 0:p2align=1, 0
+;

[PATCH] D105755: [WebAssembly] Custom combines for f32x4.demote_zero_f64x2

2021-07-12 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:192-193
 
 TARGET_BUILTIN(__builtin_wasm_trunc_sat_zero_s_f64x2_i32x4, "V4iV2d", "nc", 
"simd128")
 TARGET_BUILTIN(__builtin_wasm_trunc_sat_zero_u_f64x2_i32x4, "V4UiV2d", "nc", 
"simd128")
-TARGET_BUILTIN(__builtin_wasm_demote_zero_f64x2_f32x4, "V4fV2d", "nc", 
"simd128")

tlively wrote:
> aheejin wrote:
> > If these share the same pattern, do we need these builtins?
> Hmm, good point. Maybe not. I'll investigate that in a follow up.
Ah, the reason we can't remove these builtins is because there is no good 
target-independent way to represent the saturating behavior of the float-to-int 
conversion.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105755/new/

https://reviews.llvm.org/D105755

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D105755: [WebAssembly] Custom combines for f32x4.demote_zero_f64x2

2021-07-12 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGcbabfc63b1be: [WebAssembly] Custom combines for 
f32x4.demote_zero_f64x2 (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105755/new/

https://reviews.llvm.org/D105755

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISD.def
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-conversions.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s --check-prefixes=CHECK,SLOW
 ; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 -fast-isel | FileCheck %s
 
 ; Test that SIMD128 intrinsics lower as expected. These intrinsics are
@@ -542,6 +542,18 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: trunc_sat_zero_s_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_zero_s_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_s $push[[R:[0-9]+]]=, $0{{$}}
+; SLOW-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double>)
+define <4 x i32> @trunc_sat_zero_s_v4i32_2(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+   <4 x i32> 
+  %a = call <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double> %v)
+  ret <4 x i32> %a
+}
+
 ; CHECK-LABEL: trunc_sat_zero_u_v4i32:
 ; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32 (v128) -> (v128){{$}}
 ; CHECK-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
@@ -554,6 +566,18 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: trunc_sat_zero_u_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
+; SLOW-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.fptoui.sat.v4i32.v4f64(<4 x double>)
+define <4 x i32> @trunc_sat_zero_u_v4i32_2(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+   <4 x i32> 
+  %a = call <4 x i32> @llvm.fptoui.sat.v4i32.v4f64(<4 x double> %v)
+  ret <4 x i32> %a
+}
+
 ; ==
 ; 2 x i64
 ; ==
@@ -722,16 +746,6 @@
   ret <4 x float> %v
 }
 
-; CHECK-LABEL: demote_zero_v4f32:
-; CHECK-NEXT: .functype demote_zero_v4f32 (v128) -> (v128){{$}}
-; CHECK-NEXT: f32x4.demote_zero_f64x2 $push[[R:[0-9]+]]=, $0{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x float> @llvm.wasm.demote.zero(<2 x double>)
-define <4 x float> @demote_zero_v4f32(<2 x double> %a) {
-  %v = call <4 x float> @llvm.wasm.demote.zero(<2 x double> %a)
-  ret <4 x float> %v
-}
-
 ; ==
 ; 2 x f64
 ; ==
Index: llvm/test/CodeGen/WebAssembly/simd-conversions.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-conversions.ll
+++ llvm/test/CodeGen/WebAssembly/simd-conversions.ll
@@ -82,6 +82,30 @@
   ret <2 x i64> %a
 }
 
+; CHECK-LABEL: demote_zero_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype demote_zero_v4f32 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.demote_zero_f64x2 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <4 x float> @demote_zero_v4f32(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+ <4 x i32> 
+  %a = fptrunc <4 x double> %v to <4 x float>
+  ret <4 x float> %a
+}
+
+; CHECK-LABEL: demote_zero_v4f32_2:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype demote_zero_v4f32_2 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.demote_zero_f64x2 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <4 x float> @demote_zero_v4f32_2(<2 x double> %x) {
+  %v = fptrunc <2 x double> %x to <2 x float>
+  %a =

[PATCH] D105755: [WebAssembly] Custom combines for f32x4.demote_zero_f64x2

2021-07-12 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:192-193
 
 TARGET_BUILTIN(__builtin_wasm_trunc_sat_zero_s_f64x2_i32x4, "V4iV2d", "nc", 
"simd128")
 TARGET_BUILTIN(__builtin_wasm_trunc_sat_zero_u_f64x2_i32x4, "V4UiV2d", "nc", 
"simd128")
-TARGET_BUILTIN(__builtin_wasm_demote_zero_f64x2_f32x4, "V4fV2d", "nc", 
"simd128")

aheejin wrote:
> If these share the same pattern, do we need these builtins?
Hmm, good point. Maybe not. I'll investigate that in a follow up.



Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:2367
+
+if (!IsZeroSplat(N->getOperand(1)))
+  return SDValue();

aheejin wrote:
> Do we not need to check if `N->getOpernad(1)` is a specific type, such as 
> `v2f32` or `v2i32`?
I think that the concat_vectors node ensures that the two sides have the same 
type, but I'll explicitly check to be safe.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105755/new/

https://reviews.llvm.org/D105755

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D105755: [WebAssembly] Custom combines for f32x4.demote_zero_f64x2

2021-07-12 Thread Thomas Lively via Phabricator via cfe-commits

tlively updated this revision to Diff 357983.
tlively added a comment.

- Check types of splats as well


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105755/new/

https://reviews.llvm.org/D105755

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISD.def
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-conversions.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s --check-prefixes=CHECK,SLOW
 ; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 -fast-isel | FileCheck %s
 
 ; Test that SIMD128 intrinsics lower as expected. These intrinsics are
@@ -542,6 +542,18 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: trunc_sat_zero_s_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_zero_s_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_s $push[[R:[0-9]+]]=, $0{{$}}
+; SLOW-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double>)
+define <4 x i32> @trunc_sat_zero_s_v4i32_2(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+   <4 x i32> 
+  %a = call <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double> %v)
+  ret <4 x i32> %a
+}
+
 ; CHECK-LABEL: trunc_sat_zero_u_v4i32:
 ; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32 (v128) -> (v128){{$}}
 ; CHECK-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
@@ -554,6 +566,18 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: trunc_sat_zero_u_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
+; SLOW-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.fptoui.sat.v4i32.v4f64(<4 x double>)
+define <4 x i32> @trunc_sat_zero_u_v4i32_2(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+   <4 x i32> 
+  %a = call <4 x i32> @llvm.fptoui.sat.v4i32.v4f64(<4 x double> %v)
+  ret <4 x i32> %a
+}
+
 ; ==
 ; 2 x i64
 ; ==
@@ -722,16 +746,6 @@
   ret <4 x float> %v
 }
 
-; CHECK-LABEL: demote_zero_v4f32:
-; CHECK-NEXT: .functype demote_zero_v4f32 (v128) -> (v128){{$}}
-; CHECK-NEXT: f32x4.demote_zero_f64x2 $push[[R:[0-9]+]]=, $0{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x float> @llvm.wasm.demote.zero(<2 x double>)
-define <4 x float> @demote_zero_v4f32(<2 x double> %a) {
-  %v = call <4 x float> @llvm.wasm.demote.zero(<2 x double> %a)
-  ret <4 x float> %v
-}
-
 ; ==
 ; 2 x f64
 ; ==
Index: llvm/test/CodeGen/WebAssembly/simd-conversions.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-conversions.ll
+++ llvm/test/CodeGen/WebAssembly/simd-conversions.ll
@@ -82,6 +82,30 @@
   ret <2 x i64> %a
 }
 
+; CHECK-LABEL: demote_zero_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype demote_zero_v4f32 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.demote_zero_f64x2 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <4 x float> @demote_zero_v4f32(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+ <4 x i32> 
+  %a = fptrunc <4 x double> %v to <4 x float>
+  ret <4 x float> %a
+}
+
+; CHECK-LABEL: demote_zero_v4f32_2:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype demote_zero_v4f32_2 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.demote_zero_f64x2 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <4 x float> @demote_zero_v4f32_2(<2 x double> %x) {
+  %v = fptrunc <2 x double> %x to <2 x float>
+  %a = shufflevector <2 x float> %v, <2 x float> zeroinitializer,
+ <4 x i32> 
+  ret <4 x float> %a
+}
+
 ; CHECK-LABEL:

[PATCH] D105675: [WebAssembly] Custom combines for f64x2.promote_low_f32x4

2021-07-09 Thread Thomas Lively via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe5220104d070: [WebAssembly] Custom combines for 
f64x2.promote_low_f32x4 (authored by tlively).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105675/new/

https://reviews.llvm.org/D105675

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISD.def
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-conversions.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -806,13 +806,3 @@
   %v = call <2 x double> @llvm.nearbyint.v2f64(<2 x double> %a)
   ret <2 x double> %v
 }
-
-; CHECK-LABEL: promote_low_v2f64:
-; CHECK-NEXT: .functype promote_low_v2f64 (v128) -> (v128){{$}}
-; CHECK-NEXT: f64x2.promote_low_f32x4 $push[[R:[0-9]+]]=, $0{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <2 x double> @llvm.wasm.promote.low(<4 x float>)
-define <2 x double> @promote_low_v2f64(<4 x float> %a) {
-  %v = call <2 x double> @llvm.wasm.promote.low(<4 x float> %a)
-  ret <2 x double> %v
-}
Index: llvm/test/CodeGen/WebAssembly/simd-conversions.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-conversions.ll
+++ llvm/test/CodeGen/WebAssembly/simd-conversions.ll
@@ -126,3 +126,25 @@
   %a = shufflevector <4 x double> %v, <4 x double> undef, <2 x i32> 
   ret <2 x double> %a
 }
+
+; CHECK-LABEL: promote_low_v2f64:
+; NO-SIMD128-NOT: f64x2
+; SIMD128-NEXT: .functype promote_low_v2f64 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f64x2.promote_low_f32x4 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <2 x double> @promote_low_v2f64(<4 x float> %x) {
+  %v = shufflevector <4 x float> %x, <4 x float> undef, <2 x i32> 
+  %a = fpext <2 x float> %v to <2 x double>
+  ret <2 x double> %a
+}
+
+; CHECK-LABEL: promote_low_v2f64_2:
+; NO-SIMD128-NOT: f64x2
+; SIMD128-NEXT: .functype promote_low_v2f64_2 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f64x2.promote_low_f32x4 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <2 x double> @promote_low_v2f64_2(<4 x float> %x) {
+  %v = fpext <4 x float> %x to <4 x double>
+  %a = shufflevector <4 x double> %v, <4 x double> undef, <2 x i32> 
+  ret <2 x double> %a
+}
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1288,11 +1288,13 @@
 defm "" : SIMDConvert;
 
-// Prototype f64x2 conversions
+// f64x2 <-> f32x4 conversions
 defm "" : SIMDConvert;
-defm "" : SIMDConvert;
+
+def promote_t : SDTypeProfile<1, 1, [SDTCisVec<0>, SDTCisVec<1>]>;
+def promote_low : SDNode<"WebAssemblyISD::PROMOTE_LOW", promote_t>;
+defm "" : SIMDConvert;
 
 //===--===//
 // Saturating Rounding Q-Format Multiplication
Index: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
===
--- llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -149,9 +149,11 @@
 setTargetDAGCombine(ISD::SIGN_EXTEND);
 setTargetDAGCombine(ISD::ZERO_EXTEND);
 
-// Combine int_to_fp of extract_vectors and vice versa into conversions ops
+// Combine int_to_fp or fp_extend of extract_vectors and vice versa into
+// conversions ops
 setTargetDAGCombine(ISD::SINT_TO_FP);
 setTargetDAGCombine(ISD::UINT_TO_FP);
+setTargetDAGCombine(ISD::FP_EXTEND);
 setTargetDAGCombine(ISD::EXTRACT_SUBVECTOR);
 
 // Combine concat of {s,u}int_to_fp_sat to i32x4.trunc_sat_f64x2_zero_{s,u}
@@ -2186,60 +2188,109 @@
   if (ResVT != MVT::v2f64)
 return SDValue();
 
-  if (N->getOpcode() == ISD::SINT_TO_FP || N->getOpcode() == ISD::UINT_TO_FP) {
-// Combine this:
-//
-//   (v2f64 ({s,u}int_to_fp
-// (v2i32 (extract_subvector (v4i32 $x), 0
-//
-// into (f64x2.convert_low_i32x4_{s,u} $x).
-auto Extract = N->getOperand(0);
-if (Extract.getOpcode() != ISD::EXTRACT_SUBVECTOR)
-  return SDValue();
-if (Extract.getValueType() != MVT::v2i32)
-  return SDValue();
-auto Source = Extract.getOperand(0);
-if (Source.getValueType() != MVT::v4i32)
-  return SDValue();
-

[PATCH] D105755: [WebAssembly] Custom combines for f32x4.demote_zero_f64x2

2021-07-09 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added a reviewer: aheejin.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100, dschuff.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Replace the clang builtin function and LLVM intrinsic for
f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing
combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D105755

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISD.def
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-conversions.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s
+; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 | FileCheck %s --check-prefixes=CHECK,SLOW
 ; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+simd128 -fast-isel | FileCheck %s
 
 ; Test that SIMD128 intrinsics lower as expected. These intrinsics are
@@ -542,6 +542,18 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: trunc_sat_zero_s_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_zero_s_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_s $push[[R:[0-9]+]]=, $0{{$}}
+; SLOW-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double>)
+define <4 x i32> @trunc_sat_zero_s_v4i32_2(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+   <4 x i32> 
+  %a = call <4 x i32> @llvm.fptosi.sat.v4i32.v4f64(<4 x double> %v)
+  ret <4 x i32> %a
+}
+
 ; CHECK-LABEL: trunc_sat_zero_u_v4i32:
 ; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32 (v128) -> (v128){{$}}
 ; CHECK-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
@@ -554,6 +566,18 @@
   ret <4 x i32> %a
 }
 
+; CHECK-LABEL: trunc_sat_zero_u_v4i32_2:
+; CHECK-NEXT: .functype trunc_sat_zero_u_v4i32_2 (v128) -> (v128){{$}}
+; SLOW-NEXT: i32x4.trunc_sat_zero_f64x2_u $push[[R:[0-9]+]]=, $0{{$}}
+; SLOW-NEXT: return $pop[[R]]{{$}}
+declare <4 x i32> @llvm.fptoui.sat.v4i32.v4f64(<4 x double>)
+define <4 x i32> @trunc_sat_zero_u_v4i32_2(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+   <4 x i32> 
+  %a = call <4 x i32> @llvm.fptoui.sat.v4i32.v4f64(<4 x double> %v)
+  ret <4 x i32> %a
+}
+
 ; ==
 ; 2 x i64
 ; ==
@@ -722,16 +746,6 @@
   ret <4 x float> %v
 }
 
-; CHECK-LABEL: demote_zero_v4f32:
-; CHECK-NEXT: .functype demote_zero_v4f32 (v128) -> (v128){{$}}
-; CHECK-NEXT: f32x4.demote_zero_f64x2 $push[[R:[0-9]+]]=, $0{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <4 x float> @llvm.wasm.demote.zero(<2 x double>)
-define <4 x float> @demote_zero_v4f32(<2 x double> %a) {
-  %v = call <4 x float> @llvm.wasm.demote.zero(<2 x double> %a)
-  ret <4 x float> %v
-}
-
 ; ==
 ; 2 x f64
 ; ==
Index: llvm/test/CodeGen/WebAssembly/simd-conversions.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-conversions.ll
+++ llvm/test/CodeGen/WebAssembly/simd-conversions.ll
@@ -82,6 +82,30 @@
   ret <2 x i64> %a
 }
 
+; CHECK-LABEL: demote_zero_v4f32:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype demote_zero_v4f32 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f32x4.demote_zero_f64x2 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <4 x float> @demote_zero_v4f32(<2 x double> %x) {
+  %v = shufflevector <2 x double> %x, <2 x double> zeroinitializer,
+ <4 x i32> 
+  %a = fptrunc <4 x double> %v to <4 x float>
+  ret <4 x float> %a
+}
+
+; CHECK-LABEL: demote_zero_v4f32_2:
+; NO-SIMD128-NOT: f32x4
+; SIMD128-NEXT: .functype demote_zero_v4f32_2 (v128) -> (v128){{$}}
+;

[PATCH] D105675: [WebAssembly] Custom combines for f64x2.promote_low_f32x4

2021-07-08 Thread Thomas Lively via Phabricator via cfe-commits

tlively created this revision.
tlively added reviewers: aheejin, dschuff.
Herald added subscribers: wingo, ecnelises, sunfish, hiraditya, 
jgravelle-google, sbc100.
tlively requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Replace the clang builtin function and LLVM intrinsic previously used to select
the f64x2.promote_low_f32x4 instruction with custom combines from standard
SelectionDAG nodes. Implement the new combines to share code with the similar
combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D105675

Files:
  clang/include/clang/Basic/BuiltinsWebAssembly.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/wasm_simd128.h
  clang/test/CodeGen/builtins-wasm.c
  clang/test/Headers/wasm.c
  llvm/include/llvm/IR/IntrinsicsWebAssembly.td
  llvm/lib/Target/WebAssembly/WebAssemblyISD.def
  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
  llvm/test/CodeGen/WebAssembly/simd-conversions.ll
  llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll

Index: llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
+++ llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll
@@ -806,13 +806,3 @@
   %v = call <2 x double> @llvm.nearbyint.v2f64(<2 x double> %a)
   ret <2 x double> %v
 }
-
-; CHECK-LABEL: promote_low_v2f64:
-; CHECK-NEXT: .functype promote_low_v2f64 (v128) -> (v128){{$}}
-; CHECK-NEXT: f64x2.promote_low_f32x4 $push[[R:[0-9]+]]=, $0{{$}}
-; CHECK-NEXT: return $pop[[R]]{{$}}
-declare <2 x double> @llvm.wasm.promote.low(<4 x float>)
-define <2 x double> @promote_low_v2f64(<4 x float> %a) {
-  %v = call <2 x double> @llvm.wasm.promote.low(<4 x float> %a)
-  ret <2 x double> %v
-}
Index: llvm/test/CodeGen/WebAssembly/simd-conversions.ll
===
--- llvm/test/CodeGen/WebAssembly/simd-conversions.ll
+++ llvm/test/CodeGen/WebAssembly/simd-conversions.ll
@@ -126,3 +126,25 @@
   %a = shufflevector <4 x double> %v, <4 x double> undef, <2 x i32> 
   ret <2 x double> %a
 }
+
+; CHECK-LABEL: promote_low_v2f64:
+; NO-SIMD128-NOT: f64x2
+; SIMD128-NEXT: .functype promote_low_v2f64 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f64x2.promote_low_f32x4 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <2 x double> @promote_low_v2f64(<4 x float> %x) {
+  %v = shufflevector <4 x float> %x, <4 x float> undef, <2 x i32> 
+  %a = fpext <2 x float> %v to <2 x double>
+  ret <2 x double> %a
+}
+
+; CHECK-LABEL: promote_low_v2f64_2:
+; NO-SIMD128-NOT: f64x2
+; SIMD128-NEXT: .functype promote_low_v2f64_2 (v128) -> (v128){{$}}
+; SIMD128-NEXT: f64x2.promote_low_f32x4 $push[[R:[0-9]+]]=, $0
+; SIMD128-NEXT: return $pop[[R]]
+define <2 x double> @promote_low_v2f64_2(<4 x float> %x) {
+  %v = fpext <4 x float> %x to <4 x double>
+  %a = shufflevector <4 x double> %v, <4 x double> undef, <2 x i32> 
+  ret <2 x double> %a
+}
Index: llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
===
--- llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1288,11 +1288,13 @@
 defm "" : SIMDConvert;
 
-// Prototype f64x2 conversions
+// f64x2 <-> f32x4 conversions
 defm "" : SIMDConvert;
-defm "" : SIMDConvert;
+
+def promote_t : SDTypeProfile<1, 1, [SDTCisVec<0>, SDTCisVec<1>]>;
+def promote_low : SDNode<"WebAssemblyISD::PROMOTE_LOW", promote_t>;
+defm "" : SIMDConvert;
 
 //===--===//
 // Saturating Rounding Q-Format Multiplication
Index: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
===
--- llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -149,9 +149,11 @@
 setTargetDAGCombine(ISD::SIGN_EXTEND);
 setTargetDAGCombine(ISD::ZERO_EXTEND);
 
-// Combine int_to_fp of extract_vectors and vice versa into conversions ops
+// Combine int_to_fp or fp_extend of extract_vectors and vice versa into
+// conversions ops
 setTargetDAGCombine(ISD::SINT_TO_FP);
 setTargetDAGCombine(ISD::UINT_TO_FP);
+setTargetDAGCombine(ISD::FP_EXTEND);
 setTargetDAGCombine(ISD::EXTRACT_SUBVECTOR);
 
 // Combine concat of {s,u}int_to_fp_sat to i32x4.trunc_sat_f64x2_zero_{s,u}
@@ -2186,60 +2188,109 @@
   if (ResVT != MVT::v2f64)
 return SDValue();
 
-  if (N->getOpcode() == ISD::SINT_TO_FP || N->getOpcode() == ISD::UINT_TO_FP) {
-// Combine this:
-//
-//   (v2f64 ({s,u}int_to_fp
-// (v2i32 (extract_subvector (v4i32 $x), 0
-//
-// into (f64x2.convert_low_i32x4_{s,u} $x).
-auto Extract =

[PATCH] D104797: [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR

2021-07-01 Thread Thomas Lively via Phabricator via cfe-commits

tlively accepted this revision.
tlively added a comment.
This revision is now accepted and ready to land.

Thanks!




Comment at: llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp:130
 TT.isArch64Bit()
-? "e-m:e-p:64:64-i64:64-n32:64-S128-ni:1"
-: "e-m:e-p:32:32-i64:64-n32:64-S128-ni:1",
+? (hasReferenceTypes(FS)
+   ? 
"e-m:e-p:64:64-i64:64-n32:64-S128-ni:1:10:20"

pmatos wrote:
> tlively wrote:
> > pmatos wrote:
> > > tlively wrote:
> > > > pmatos wrote:
> > > > > tlively wrote:
> > > > > > `hasReferenceTypes` should also be taking the CPU into account, not 
> > > > > > just the feature string. Normally this would be done via 
> > > > > > `getSubtargetImpl`, but I guess we probably can't call that this 
> > > > > > early in the construction of the `WebAssemblyTargetMachine`. Would 
> > > > > > anything break if we just unconditionally added the reference types 
> > > > > > address spaces to the data layout string?
> > > > > Regarding this change, I don't quite understand why referencetypes 
> > > > > should take the CPU into account. Are there CPU variants for the wasm 
> > > > > backend? I haven't touched the conditional because that would mean 
> > > > > touching the several tests that don't enable reference types and use 
> > > > > the old data layout string. However, I would think that if that's the 
> > > > > path we want to follow here, we could do it and change all wasm tests 
> > > > > to use the layout string with reference types.
> > > > > 
> > > > Yes, there are CPU variants defined here: 
> > > > https://github.com/llvm/llvm-project/blob/7ac0442fe59dbe0f9127e79e8786a7dd6345c537/llvm/lib/Target/WebAssembly/WebAssembly.td#L89-L100.
> > > >  Note that the CPU may enable reference types even if the feature 
> > > > string does not. If it doesn't break anything, then unconditionally 
> > > > updating the layout string sounds like the best option.
> > > Interesting - had not come accross it. Bleeding edge does not seem to 
> > > include reference-types. What's the reason for this? 
> > We don't have a well-defined process for adding features to bleeding-edge, 
> > but I think typically they're added once they're mostly stable and usable 
> > in the tools.
> @tlively I have now unconditionally updated the layout string. No failures. 
> How does this look like now?
Looks good, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104797/new/

https://reviews.llvm.org/D104797

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D104797: [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR

2021-06-30 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp:130
 TT.isArch64Bit()
-? "e-m:e-p:64:64-i64:64-n32:64-S128-ni:1"
-: "e-m:e-p:32:32-i64:64-n32:64-S128-ni:1",
+? (hasReferenceTypes(FS)
+   ? 
"e-m:e-p:64:64-i64:64-n32:64-S128-ni:1:10:20"

pmatos wrote:
> tlively wrote:
> > pmatos wrote:
> > > tlively wrote:
> > > > `hasReferenceTypes` should also be taking the CPU into account, not 
> > > > just the feature string. Normally this would be done via 
> > > > `getSubtargetImpl`, but I guess we probably can't call that this early 
> > > > in the construction of the `WebAssemblyTargetMachine`. Would anything 
> > > > break if we just unconditionally added the reference types address 
> > > > spaces to the data layout string?
> > > Regarding this change, I don't quite understand why referencetypes should 
> > > take the CPU into account. Are there CPU variants for the wasm backend? I 
> > > haven't touched the conditional because that would mean touching the 
> > > several tests that don't enable reference types and use the old data 
> > > layout string. However, I would think that if that's the path we want to 
> > > follow here, we could do it and change all wasm tests to use the layout 
> > > string with reference types.
> > > 
> > Yes, there are CPU variants defined here: 
> > https://github.com/llvm/llvm-project/blob/7ac0442fe59dbe0f9127e79e8786a7dd6345c537/llvm/lib/Target/WebAssembly/WebAssembly.td#L89-L100.
> >  Note that the CPU may enable reference types even if the feature string 
> > does not. If it doesn't break anything, then unconditionally updating the 
> > layout string sounds like the best option.
> Interesting - had not come accross it. Bleeding edge does not seem to include 
> reference-types. What's the reason for this? 
We don't have a well-defined process for adding features to bleeding-edge, but 
I think typically they're added once they're mostly stable and usable in the 
tools.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104797/new/

https://reviews.llvm.org/D104797

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D104797: [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR

2021-06-28 Thread Thomas Lively via Phabricator via cfe-commits

tlively added inline comments.



Comment at: llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp:130
 TT.isArch64Bit()
-? "e-m:e-p:64:64-i64:64-n32:64-S128-ni:1"
-: "e-m:e-p:32:32-i64:64-n32:64-S128-ni:1",
+? (hasReferenceTypes(FS)
+   ? 
"e-m:e-p:64:64-i64:64-n32:64-S128-ni:1:10:20"

pmatos wrote:
> tlively wrote:
> > `hasReferenceTypes` should also be taking the CPU into account, not just 
> > the feature string. Normally this would be done via `getSubtargetImpl`, but 
> > I guess we probably can't call that this early in the construction of the 
> > `WebAssemblyTargetMachine`. Would anything break if we just unconditionally 
> > added the reference types address spaces to the data layout string?
> Regarding this change, I don't quite understand why referencetypes should 
> take the CPU into account. Are there CPU variants for the wasm backend? I 
> haven't touched the conditional because that would mean touching the several 
> tests that don't enable reference types and use the old data layout string. 
> However, I would think that if that's the path we want to follow here, we 
> could do it and change all wasm tests to use the layout string with reference 
> types.
> 
Yes, there are CPU variants defined here: 
https://github.com/llvm/llvm-project/blob/7ac0442fe59dbe0f9127e79e8786a7dd6345c537/llvm/lib/Target/WebAssembly/WebAssembly.td#L89-L100.
 Note that the CPU may enable reference types even if the feature string does 
not. If it doesn't break anything, then unconditionally updating the layout 
string sounds like the best option.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104797/new/

https://reviews.llvm.org/D104797

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D104797: [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR

2021-06-24 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a subscriber: aeubanks.
tlively added a comment.

The opaque pointers project is documented here: 
https://llvm.org/docs/OpaquePointers.html. It's been making very slow progress 
for the past few years but has recently been picking up steam under the 
direction of @aeubanks. Search llvm-dev for "opaque pointers" and you will see 
a few recent threads about it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104797/new/

https://reviews.llvm.org/D104797

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D104797: [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR

2021-06-23 Thread Thomas Lively via Phabricator via cfe-commits

tlively added a comment.

Unfortunately I don't think

In D104797#2836475 , @pmatos wrote:

> @tlively Do you think it would be ok to re-add the code removed in `ac81cb7e` 
> but only error if the pointer is to an **opaque** non-integral type?

Unfortunately that wouldn't be future-proof given the ongoing project to remove 
type information from pointers. I think the best we can do right now is call 
`report_fatal_error` from the backend whenever a reference type is the operand 
or result of ptrtoint or inttoptr.

Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h:69
+if (AS == WasmAddressSpace::EXTERNREF)
+  return MVT::externref;
+else if (AS == WasmAddressSpace::FUNCREF)

It might be worth a comment explaining why the memtype is also externref and 
funcref. Is this just for lack of a more meaningful type to return?

Comment at: llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp:130
 TT.isArch64Bit()
-? "e-m:e-p:64:64-i64:64-n32:64-S128-ni:1"
-: "e-m:e-p:32:32-i64:64-n32:64-S128-ni:1",
+? (hasReferenceTypes(FS)
+   ? 
"e-m:e-p:64:64-i64:64-n32:64-S128-ni:1:10:20"

`hasReferenceTypes` should also be taking the CPU into account, not just the 
feature string. Normally this would be done via `getSubtargetImpl`, but I guess 
we probably can't call that this early in the construction of the 
`WebAssemblyTargetMachine`. Would anything break if we just unconditionally 
added the reference types address spaces to the data layout string?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104797/new/

https://reviews.llvm.org/D104797

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

1 2 3 4 5 >

1 - 100 of 440 matches

Mail list logo