[clang] 05dbdb0 - Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 (2nd try)"

2022-12-08 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2022-12-08T14:16:46-05:00
New Revision: 05dbdb0088a3f5541d9e91c61a564d0aa4704f4f

URL: 
https://github.com/llvm/llvm-project/commit/05dbdb0088a3f5541d9e91c61a564d0aa4704f4f
DIFF: 
https://github.com/llvm/llvm-project/commit/05dbdb0088a3f5541d9e91c61a564d0aa4704f4f.diff

LOG: Revert "[InstCombine] canonicalize trunc + insert as bitcast + shuffle, 
part 1 (2nd try)"

This reverts commit e71b81cab09bf33e3b08ed600418b72cc4117461.

As discussed in the planned follow-on to this patch (D138874),
this and the subsequent patches in this set can cause trouble for
the backend, and there's probably no quick fix. We may even
want to canonicalize in the opposite direction (towards insertelt).

Added: 


Modified: 
clang/test/Headers/wasm.c
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
llvm/test/Transforms/InstCombine/insert-trunc.ll
llvm/test/Transforms/InstCombine/vec_phi_extract-inseltpoison.ll
llvm/test/Transforms/InstCombine/vec_phi_extract.ll
llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll
llvm/test/Transforms/PhaseOrdering/X86/vec-load-combine.ll

Removed: 




diff  --git a/clang/test/Headers/wasm.c b/clang/test/Headers/wasm.c
index 79dc67eaa4ef8..53acbf4de4c96 100644
--- a/clang/test/Headers/wasm.c
+++ b/clang/test/Headers/wasm.c
@@ -1475,8 +1475,8 @@ v128_t test_f64x2_ge(v128_t a, v128_t b) {
 
 // CHECK-LABEL: @test_v128_not(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[NOT_I:%.*]] = xor <4 x i32> [[A:%.*]], 
-// CHECK-NEXT:ret <4 x i32> [[NOT_I]]
+// CHECK-NEXT:[[NEG_I:%.*]] = xor <4 x i32> [[A:%.*]], 
+// CHECK-NEXT:ret <4 x i32> [[NEG_I]]
 //
 v128_t test_v128_not(v128_t a) {
   return wasm_v128_not(a);
@@ -1511,8 +1511,8 @@ v128_t test_v128_xor(v128_t a, v128_t b) {
 
 // CHECK-LABEL: @test_v128_andnot(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[NOT_I:%.*]] = xor <4 x i32> [[B:%.*]], 
-// CHECK-NEXT:[[AND_I:%.*]] = and <4 x i32> [[NOT_I]], [[A:%.*]]
+// CHECK-NEXT:[[NEG_I:%.*]] = xor <4 x i32> [[B:%.*]], 
+// CHECK-NEXT:[[AND_I:%.*]] = and <4 x i32> [[NEG_I]], [[A:%.*]]
 // CHECK-NEXT:ret <4 x i32> [[AND_I]]
 //
 v128_t test_v128_andnot(v128_t a, v128_t b) {
@@ -1596,11 +1596,12 @@ v128_t test_i8x16_popcnt(v128_t a) {
 // CHECK-LABEL: @test_i8x16_shl(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[VEC___B_I:%.*]] = bitcast i32 [[B:%.*]] to <4 x i8>
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <4 x i8> [[VEC___B_I]], <4 
x i8> poison, <16 x i32> zeroinitializer
+// CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i8
+// CHECK-NEXT:[[TMP2:%.*]] = insertelement <16 x i8> undef, i8 [[TMP1]], 
i64 0
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP2]], <16 x 
i8> poison, <16 x i32> zeroinitializer
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <16 x i8> [[TMP0]], [[SH_PROM_I]]
-// CHECK-NEXT:[[TMP1:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP1]]
+// CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
 v128_t test_i8x16_shl(v128_t a, uint32_t b) {
   return wasm_i8x16_shl(a, b);
@@ -1609,11 +1610,12 @@ v128_t test_i8x16_shl(v128_t a, uint32_t b) {
 // CHECK-LABEL: @test_i8x16_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[VEC___B_I:%.*]] = bitcast i32 [[B:%.*]] to <4 x i8>
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <4 x i8> [[VEC___B_I]], <4 
x i8> poison, <16 x i32> zeroinitializer
+// CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i8
+// CHECK-NEXT:[[TMP2:%.*]] = insertelement <16 x i8> undef, i8 [[TMP1]], 
i64 0
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP2]], <16 x 
i8> poison, <16 x i32> zeroinitializer
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <16 x i8> [[TMP0]], [[SH_PROM_I]]
-// CHECK-NEXT:[[TMP1:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP1]]
+// CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP3]]
 //
 v128_t test_i8x16_shr(v128_t a, uint32_t b) {
   return wasm_i8x16_shr(a, b);
@@ -1622,11 +1624,12 @@ v128_t test_i8x16_shr(v128_t a, uint32_t b) {
 // CHECK-LABEL: @test_u8x16_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[VEC___B_I:%.*]] = bitcast i32 [[B:%.*]] to <4 x i8>
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <4 x i8> [[VEC___B_I]], <4 
x i8> poison, <16 x i32> zeroinitializer
+// CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i8
+// CHECK-NEXT:[[TMP2:%.*]] = insertelement <16 x i8> undef, i8 [[TMP1]], 
i64 0
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP2]], <1

[clang] e71b81c - [InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 (2nd try)

2022-11-30 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2022-11-30T14:52:20-05:00
New Revision: e71b81cab09bf33e3b08ed600418b72cc4117461

URL: 
https://github.com/llvm/llvm-project/commit/e71b81cab09bf33e3b08ed600418b72cc4117461
DIFF: 
https://github.com/llvm/llvm-project/commit/e71b81cab09bf33e3b08ed600418b72cc4117461.diff

LOG: [InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 1 
(2nd try)

The first attempt was reverted because a clang test changed
unexpectedly - the file is already marked with a FIXME, so
I just updated it this time to pass.

Original commit message:
This is the main patch for converting a truncated scalar that is
inserted into a vector to bitcast+shuffle. We could go either way
on patterns like this, but this direction will allow collapsing a
pair of these sequences on the motivating example from issue

The patch is split into 3 parts to make it easier to see the
progression of tests diffs. We allow inserting/shuffling into a
different size vector for flexibility, so there are several test
variations. The length-changing is handled by shortening/padding
the shuffle mask with undef elements.

In part 1, handle the basic pattern:
inselt undef, (trunc T), IndexC --> shuffle (bitcast T), IdentityMask

Proof for the endian-dependency behaving as expected:
https://alive2.llvm.org/ce/z/BsA7yC
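The endian dependency can also be seen with a small model. This is a minimal Python sketch (not the InstCombine code) of a little-endian `i32 -> <4 x i8>` bitcast; the helper names are illustrative only:

```python
def trunc_i32_to_i8(x):
    # trunc i32 -> i8 keeps the low 8 bits
    return x & 0xFF

def bitcast_i32_to_4xi8_le(x):
    # On a little-endian target, bitcasting i32 to <4 x i8> puts the
    # low byte of the scalar into element 0 of the vector.
    return [(x >> (8 * i)) & 0xFF for i in range(4)]

# The truncated scalar equals element 0 of the bitcast vector, so
# "inselt undef, (trunc T), 0" can be expressed as a shuffle of the
# bitcast that places element 0 at the insert index.
assert trunc_i32_to_i8(0x12345678) == bitcast_i32_to_4xi8_le(0x12345678)[0]
```

On a big-endian target the low byte would land in the last element instead, which is why the identity mask is endian-dependent.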

The TODO items for handling shifts and insert into an arbitrary base
vector value are implemented as follow-ups.

Differential Revision: https://reviews.llvm.org/D138872

Added: 


Modified: 
clang/test/Headers/wasm.c
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
llvm/test/Transforms/InstCombine/insert-trunc.ll
llvm/test/Transforms/InstCombine/vec_phi_extract-inseltpoison.ll
llvm/test/Transforms/InstCombine/vec_phi_extract.ll
llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll
llvm/test/Transforms/PhaseOrdering/X86/vec-load-combine.ll

Removed: 




diff  --git a/clang/test/Headers/wasm.c b/clang/test/Headers/wasm.c
index 53acbf4de4c96..79dc67eaa4ef8 100644
--- a/clang/test/Headers/wasm.c
+++ b/clang/test/Headers/wasm.c
@@ -1475,8 +1475,8 @@ v128_t test_f64x2_ge(v128_t a, v128_t b) {
 
 // CHECK-LABEL: @test_v128_not(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[NEG_I:%.*]] = xor <4 x i32> [[A:%.*]], 
-// CHECK-NEXT:ret <4 x i32> [[NEG_I]]
+// CHECK-NEXT:[[NOT_I:%.*]] = xor <4 x i32> [[A:%.*]], 
+// CHECK-NEXT:ret <4 x i32> [[NOT_I]]
 //
 v128_t test_v128_not(v128_t a) {
   return wasm_v128_not(a);
@@ -1511,8 +1511,8 @@ v128_t test_v128_xor(v128_t a, v128_t b) {
 
 // CHECK-LABEL: @test_v128_andnot(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[NEG_I:%.*]] = xor <4 x i32> [[B:%.*]], 
-// CHECK-NEXT:[[AND_I:%.*]] = and <4 x i32> [[NEG_I]], [[A:%.*]]
+// CHECK-NEXT:[[NOT_I:%.*]] = xor <4 x i32> [[B:%.*]], 
+// CHECK-NEXT:[[AND_I:%.*]] = and <4 x i32> [[NOT_I]], [[A:%.*]]
 // CHECK-NEXT:ret <4 x i32> [[AND_I]]
 //
 v128_t test_v128_andnot(v128_t a, v128_t b) {
@@ -1596,12 +1596,11 @@ v128_t test_i8x16_popcnt(v128_t a) {
 // CHECK-LABEL: @test_i8x16_shl(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i8
-// CHECK-NEXT:[[TMP2:%.*]] = insertelement <16 x i8> undef, i8 [[TMP1]], 
i64 0
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP2]], <16 x 
i8> poison, <16 x i32> zeroinitializer
+// CHECK-NEXT:[[VEC___B_I:%.*]] = bitcast i32 [[B:%.*]] to <4 x i8>
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <4 x i8> [[VEC___B_I]], <4 
x i8> poison, <16 x i32> zeroinitializer
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <16 x i8> [[TMP0]], [[SH_PROM_I]]
-// CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP3]]
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
 v128_t test_i8x16_shl(v128_t a, uint32_t b) {
   return wasm_i8x16_shl(a, b);
@@ -1610,12 +1609,11 @@ v128_t test_i8x16_shl(v128_t a, uint32_t b) {
 // CHECK-LABEL: @test_i8x16_shr(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
-// CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i8
-// CHECK-NEXT:[[TMP2:%.*]] = insertelement <16 x i8> undef, i8 [[TMP1]], 
i64 0
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP2]], <16 x 
i8> poison, <16 x i32> zeroinitializer
+// CHECK-NEXT:[[VEC___B_I:%.*]] = bitcast i32 [[B:%.*]] to <4 x i8>
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <4 x i8> [[VEC___B_I]], <4 
x i8> poison, <16 x i32> zeroinitializer
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <16 x i8> [[TMP0]], [[SH_PROM_I]]
-// CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP3]]
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <

[clang] cdf3de4 - [CodeGen] fix misnamed "not" operation; NFC

2022-08-31 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2022-08-31T15:11:48-04:00
New Revision: cdf3de45d282694290011a2949bdcc61cabb47ef

URL: 
https://github.com/llvm/llvm-project/commit/cdf3de45d282694290011a2949bdcc61cabb47ef
DIFF: 
https://github.com/llvm/llvm-project/commit/cdf3de45d282694290011a2949bdcc61cabb47ef.diff

LOG: [CodeGen] fix misnamed "not" operation; NFC

Seeing the wrong instruction for this name in IR is confusing.
Most of the tests are not even checking a subsequent use of
the value, so I just deleted the over-specified CHECKs.

Added: 


Modified: 
clang/lib/CodeGen/CGExprScalar.cpp
clang/test/CodeGen/PowerPC/builtins-ppc-p10vector.c
clang/test/CodeGen/X86/avx512f-builtins.c

Removed: 




diff  --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 9def1285fbc1d..a724f8b6afd76 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -2904,7 +2904,7 @@ Value *ScalarExprEmitter::VisitMinus(const UnaryOperator 
*E,
 Value *ScalarExprEmitter::VisitUnaryNot(const UnaryOperator *E) {
   TestAndClearIgnoreResultAssign();
   Value *Op = Visit(E->getSubExpr());
-  return Builder.CreateNot(Op, "neg");
+  return Builder.CreateNot(Op, "not");
 }
 
 Value *ScalarExprEmitter::VisitUnaryLNot(const UnaryOperator *E) {

diff  --git a/clang/test/CodeGen/PowerPC/builtins-ppc-p10vector.c 
b/clang/test/CodeGen/PowerPC/builtins-ppc-p10vector.c
index 694d2795d335b..312b5fe1894ea 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-p10vector.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-p10vector.c
@@ -1863,15 +1863,15 @@ vector bool __int128 test_vec_cmpeq_bool_int128(void) {
 vector bool __int128 test_vec_cmpne_s128(void) {
   // CHECK-LABEL: @test_vec_cmpne_s128(
   // CHECK: call <1 x i128> @llvm.ppc.altivec.vcmpequq(<1 x i128>
-  // CHECK-NEXT: %neg.i = xor <1 x i128> %4, 
-  // CHECK-NEXT: ret <1 x i128> %neg.i
+  // CHECK-NEXT: %not.i = xor <1 x i128> %4, 
+  // CHECK-NEXT: ret <1 x i128> %not.i
   return vec_cmpne(vsi128a, vsi128b);
 }
 
 vector bool __int128 test_vec_cmpne_u128(void) {
   // CHECK-LABEL: @test_vec_cmpne_u128(
   // CHECK: call <1 x i128> @llvm.ppc.altivec.vcmpequq(<1 x i128>
-  // CHECK-NEXT: %neg.i = xor <1 x i128> %4, 
+  // CHECK-NEXT: xor <1 x i128> %4, 
   // CHECK-NEXT: ret <1 x i128>
   return vec_cmpne(vui128a, vui128b);
 }
@@ -1879,7 +1879,7 @@ vector bool __int128 test_vec_cmpne_u128(void) {
 vector bool __int128 test_vec_cmpne_bool_int128(void) {
   // CHECK-LABEL: @test_vec_cmpne_bool_int128(
   // CHECK: call <1 x i128> @llvm.ppc.altivec.vcmpequq(<1 x i128>
-  // CHECK-NEXT: %neg.i = xor <1 x i128> %4, 
+  // CHECK-NEXT: xor <1 x i128> %4, 
   // CHECK-NEXT: ret <1 x i128>
   return vec_cmpne(vbi128a, vbi128b);
 }
@@ -1915,7 +1915,7 @@ vector bool __int128 test_vec_cmplt_u128(void) {
 vector bool __int128 test_vec_cmpge_s128(void) {
   // CHECK-LABEL: @test_vec_cmpge_s128(
   // CHECK: call <1 x i128> @llvm.ppc.altivec.vcmpgtsq(<1 x i128>
-  // CHECK-NEXT: %neg.i = xor <1 x i128> %6, 
+  // CHECK-NEXT: xor <1 x i128> %6, 
   // CHECK-NEXT: ret <1 x i128>
   return vec_cmpge(vsi128a, vsi128b);
 }
@@ -1923,7 +1923,7 @@ vector bool __int128 test_vec_cmpge_s128(void) {
 vector bool __int128 test_vec_cmpge_u128(void) {
   // CHECK-LABEL: @test_vec_cmpge_u128(
   // CHECK: call <1 x i128> @llvm.ppc.altivec.vcmpgtuq(<1 x i128>
-  // CHECK-NEXT: %neg.i = xor <1 x i128> %6, 
+  // CHECK-NEXT: xor <1 x i128> %6, 
   // CHECK-NEXT: ret <1 x i128>
   return vec_cmpge(vui128a, vui128b);
 }
@@ -1931,7 +1931,7 @@ vector bool __int128 test_vec_cmpge_u128(void) {
 vector bool __int128 test_vec_cmple_s128(void) {
   // CHECK-LABEL: @test_vec_cmple_s128(
   // CHECK: call <1 x i128> @llvm.ppc.altivec.vcmpgtsq(<1 x i128>
-  // CHECK-NEXT: %neg.i.i = xor <1 x i128> %8, 
+  // CHECK-NEXT: xor <1 x i128> %8, 
   // CHECK-NEXT: ret <1 x i128>
   return vec_cmple(vsi128a, vsi128b);
 }
@@ -1939,7 +1939,7 @@ vector bool __int128 test_vec_cmple_s128(void) {
 vector bool __int128 test_vec_cmple_u128(void) {
   // CHECK-LABEL: @test_vec_cmple_u128(
   // CHECK: call <1 x i128> @llvm.ppc.altivec.vcmpgtuq(<1 x i128>
-  // CHECK-NEXT: %neg.i.i = xor <1 x i128> %8, 
+  // CHECK-NEXT: xor <1 x i128> %8, 
   // CHECK-NEXT: ret <1 x i128>
   return vec_cmple(vui128a, vui128b);
 }

diff  --git a/clang/test/CodeGen/X86/avx512f-builtins.c 
b/clang/test/CodeGen/X86/avx512f-builtins.c
index a803bcfff156c..8a0c273415275 100644
--- a/clang/test/CodeGen/X86/avx512f-builtins.c
+++ b/clang/test/CodeGen/X86/avx512f-builtins.c
@@ -2866,9 +2866,9 @@ __m512i test_mm512_andnot_si512(__m512i __A, __m512i __B)
 {
   //CHECK-LABEL: @test_mm512_andnot_si512
   //CHECK: load {{.*}}%__A.addr.i, align 64
-  //CHECK: %neg.i = xor{{.*}}, 
+  //CHECK: %not.i = xor{{.*}}, 
   //CHECK: load {{.*}}%__B.addr.i, align 64
-  //CHECK: and <8 x i64> %neg.i,{{.*}}
+  //CHECK: and <8 x i64> %not.i,{{

[clang] ab982ea - [Sema] add warning for tautological FP compare with literal

2022-03-17 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2022-03-17T08:22:30-04:00
New Revision: ab982eace6e4951a2986567d29f4d6be002c1ba7

URL: 
https://github.com/llvm/llvm-project/commit/ab982eace6e4951a2986567d29f4d6be002c1ba7
DIFF: 
https://github.com/llvm/llvm-project/commit/ab982eace6e4951a2986567d29f4d6be002c1ba7.diff

LOG: [Sema] add warning for tautological FP compare with literal

If we are equality comparing an FP literal with a value cast from a type
where the literal can't be represented, that's known true or false and
probably a programmer error.

Fixes issue #54222.
https://github.com/llvm/llvm-project/issues/54222
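The representability argument behind the warning can be reproduced outside the compiler. A short Python sketch (using binary32 as the "casted" type, which matches the `(float) f == 0.1` example):

```python
import struct

def as_float32(x):
    # Round a Python double (binary64) to IEEE-754 binary32 and back.
    return struct.unpack('f', struct.pack('f', x))[0]

# 0.1 has no exact binary32 representation, so the rounded float can
# never compare equal to the double literal 0.1 -- the comparison is
# tautologically false:
assert as_float32(0.1) != 0.1

# 0.5 is exactly representable in binary32, so equality is possible
# and no warning is warranted:
assert as_float32(0.5) == 0.5
```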

Note - I added the optimizer change with:
9397bdc67eb2
...and as discussed in the post-commit comments, that transform might be
too dangerous without this warning in place, so it was reverted to allow
this change first.

Differential Revision: https://reviews.llvm.org/D121306

Added: 


Modified: 
clang/docs/ReleaseNotes.rst
clang/include/clang/Basic/DiagnosticSemaKinds.td
clang/include/clang/Sema/Sema.h
clang/lib/Sema/SemaChecking.cpp
clang/lib/Sema/SemaExpr.cpp
clang/test/Sema/floating-point-compare.c

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index a0e3cabe89a9a..d457be1305cf7 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -64,6 +64,9 @@ Bug Fixes
 
 Improvements to Clang's diagnostics
 ^^^
+- ``-Wliteral-range`` will warn on floating-point equality comparisons with
+  constants that are not representable in a casted value. For example,
+  ``(float) f == 0.1`` is always false.
 
 Non-comprehensive list of changes in this release
 -

diff  --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td 
b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index eda47588680f4..e6b11c943a705 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -118,6 +118,10 @@ def warn_float_overflow : Warning<
 def warn_float_underflow : Warning<
   "magnitude of floating-point constant too small for type %0; minimum is %1">,
   InGroup;
+def warn_float_compare_literal : Warning<
+  "floating-point comparison is always %select{true|false}0; "
+  "constant cannot be represented exactly in type %1">,
+  InGroup;
 def warn_double_const_requires_fp64 : Warning<
   "double precision constant requires %select{cl_khr_fp64|cl_khr_fp64 and 
__opencl_c_fp64}0, "
   "casting to single precision">;

diff  --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index e8b9bc2d7990c..fe8a1f371fe74 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -12928,7 +12928,8 @@ class Sema final {
   const FunctionDecl *FD = nullptr);
 
 public:
-  void CheckFloatComparison(SourceLocation Loc, Expr *LHS, Expr *RHS);
+  void CheckFloatComparison(SourceLocation Loc, Expr *LHS, Expr *RHS,
+BinaryOperatorKind Opcode);
 
 private:
   void CheckImplicitConversions(Expr *E, SourceLocation CC = SourceLocation());

diff  --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 2d14019cdbf18..2d2250771eb6e 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -11436,12 +11436,40 @@ Sema::CheckReturnValExpr(Expr *RetValExp, QualType 
lhsType,
 CheckPPCMMAType(RetValExp->getType(), ReturnLoc);
 }
 
-//===--- CHECK: Floating-Point comparisons (-Wfloat-equal) ---===//
+/// Check for comparisons of floating-point values using == and !=. Issue a
+/// warning if the comparison is not likely to do what the programmer intended.
+void Sema::CheckFloatComparison(SourceLocation Loc, Expr *LHS, Expr *RHS,
+BinaryOperatorKind Opcode) {
+  // Match and capture subexpressions such as "(float) X == 0.1".
+  FloatingLiteral *FPLiteral;
+  CastExpr *FPCast;
+  auto getCastAndLiteral = [&FPLiteral, &FPCast](Expr *L, Expr *R) {
+FPLiteral = dyn_cast<FloatingLiteral>(L->IgnoreParens());
+FPCast = dyn_cast<CastExpr>(R->IgnoreParens());
+return FPLiteral && FPCast;
+  };
+
+  if (getCastAndLiteral(LHS, RHS) || getCastAndLiteral(RHS, LHS)) {
+auto *SourceTy = FPCast->getSubExpr()->getType()->getAs<BuiltinType>();
+auto *TargetTy = FPLiteral->getType()->getAs<BuiltinType>();
+if (SourceTy && TargetTy && SourceTy->isFloatingPoint() &&
+TargetTy->isFloatingPoint()) {
+  bool Lossy;
+  llvm::APFloat TargetC = FPLiteral->getValue();
+  TargetC.convert(Context.getFloatTypeSemantics(QualType(SourceTy, 0)),
+  llvm::APFloat::rmNearestTiesToEven, &Lossy);
+  if (Lossy) {
+// If the literal cannot be represented in the source type, then a
+// check for == is always false and check for != is always true.
+Diag(Loc, diag::warn_float_

[clang] 1965cc4 - [CodeGen] remove creation of FP cast function attribute

2021-12-19 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-12-19T11:55:00-05:00
New Revision: 1965cc469539979d66c6a7f9d1c73000a795f8f0

URL: 
https://github.com/llvm/llvm-project/commit/1965cc469539979d66c6a7f9d1c73000a795f8f0
DIFF: 
https://github.com/llvm/llvm-project/commit/1965cc469539979d66c6a7f9d1c73000a795f8f0.diff

LOG: [CodeGen] remove creation of FP cast function attribute

This is the last cleanup step resulting from D115804 .
Now that clang uses intrinsics when we're in the special FP mode,
we don't need a function attribute as an indicator to the backend.
The LLVM part of the change is in D115885.

Differential Revision: https://reviews.llvm.org/D115886

Added: 


Modified: 
clang/lib/CodeGen/CGCall.cpp
clang/test/CodeGen/no-junk-ftrunc.c

Removed: 




diff  --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index e4526ff30bdd8..b202326cf7575 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1831,11 +1831,6 @@ void 
CodeGenModule::getDefaultFunctionAttributes(StringRef Name,
 if (LangOpts.getFPExceptionMode() == LangOptions::FPE_Ignore)
   FuncAttrs.addAttribute("no-trapping-math", "true");
 
-// Strict (compliant) code is the default, so only add this attribute to
-// indicate that we are trying to workaround a problem case.
-if (!CodeGenOpts.StrictFloatCastOverflow)
-  FuncAttrs.addAttribute("strict-float-cast-overflow", "false");
-
 // TODO: Are these all needed?
 // unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
 if (LangOpts.NoHonorInfs)

diff  --git a/clang/test/CodeGen/no-junk-ftrunc.c 
b/clang/test/CodeGen/no-junk-ftrunc.c
index 6ae6d30fca434..62491fbfa7cd4 100644
--- a/clang/test/CodeGen/no-junk-ftrunc.c
+++ b/clang/test/CodeGen/no-junk-ftrunc.c
@@ -1,11 +1,12 @@
 // RUN: %clang_cc1 -S -fno-strict-float-cast-overflow %s -emit-llvm -o - | 
FileCheck %s --check-prefix=NOSTRICT
 
 // When compiling with non-standard semantics, use intrinsics to inhibit the 
optimizer.
+// This used to require a function attribute, so we check that it is NOT here 
anymore.
 
 // NOSTRICT-LABEL: main
 // NOSTRICT: call i32 @llvm.fptosi.sat.i32.f64
 // NOSTRICT: call i32 @llvm.fptoui.sat.i32.f64
-// NOSTRICT: attributes #0 = {{.*}}"strict-float-cast-overflow"="false"{{.*}}
+// NOSTRICT-NOT: strict-float-cast-overflow
 
 // The workaround attribute is not applied by default.
 



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 8c7f2a4 - [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-16 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-12-16T09:10:12-05:00
New Revision: 8c7f2a4f871928d8734ee3f03e67d09086850b60

URL: 
https://github.com/llvm/llvm-project/commit/8c7f2a4f871928d8734ee3f03e67d09086850b60
DIFF: 
https://github.com/llvm/llvm-project/commit/8c7f2a4f871928d8734ee3f03e67d09086850b60.diff

LOG: [CodeGen] use saturating FP casts when compiling with 
"no-strict-float-cast-overflow"

We got an unintended consequence of the optimizer getting smarter when
compiling in a non-standard mode, and there's no good way to inhibit
those optimizations at a later stage. The test is based on an example
linked from D92270.

We allow the "no-strict-float-cast-overflow" exception to normal C
cast rules to preserve legacy code that does not expect overflowing
casts from FP to int to produce UB. See D46236 for details.
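The saturating semantics the intrinsics provide can be sketched in a few lines. This is an illustrative Python model of `llvm.fptosi.sat.i32` behavior, not the clang codegen itself:

```python
import math

def fptosi_sat_i32(x):
    # Sketch of llvm.fptosi.sat.i32: truncate toward zero, but clamp
    # to the i32 range on overflow instead of producing UB; NaN -> 0.
    if math.isnan(x):
        return 0
    lo, hi = -2**31, 2**31 - 1
    if x <= lo:
        return lo
    if x >= hi:
        return hi
    return int(x)  # int() truncates toward zero, like fptosi

assert fptosi_sat_i32(1e20) == 2**31 - 1   # saturates instead of UB
assert fptosi_sat_i32(-1e20) == -(2**31)
assert fptosi_sat_i32(-42.9) == -42        # ordinary in-range truncation
```

With plain `fptosi`, the first two conversions would be undefined behavior that the optimizer may exploit; the saturating form pins down a result for legacy code.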

Differential Revision: https://reviews.llvm.org/D115804

Added: 


Modified: 
clang/lib/CodeGen/CGExprScalar.cpp
clang/test/CodeGen/no-junk-ftrunc.c

Removed: 




diff  --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 4998d343e5f5e..d34cf7f36f678 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1240,7 +1240,18 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, 
QualType SrcType,
 
  if (isa<llvm::IntegerType>(DstElementTy)) {
 assert(SrcElementTy->isFloatingPointTy() && "Unknown real conversion");
-if (DstElementType->isSignedIntegerOrEnumerationType())
+bool IsSigned = DstElementType->isSignedIntegerOrEnumerationType();
+
+// If we can't recognize overflow as undefined behavior, assume that
+// overflow saturates. This protects against normal optimizations if we are
+// compiling with non-standard FP semantics.
+if (!CGF.CGM.getCodeGenOpts().StrictFloatCastOverflow) {
+  llvm::Intrinsic::ID IID =
+  IsSigned ? llvm::Intrinsic::fptosi_sat : llvm::Intrinsic::fptoui_sat;
+  return Builder.CreateCall(CGF.CGM.getIntrinsic(IID, {DstTy, SrcTy}), 
Src);
+}
+
+if (IsSigned)
   return Builder.CreateFPToSI(Src, DstTy, "conv");
 return Builder.CreateFPToUI(Src, DstTy, "conv");
   }

diff  --git a/clang/test/CodeGen/no-junk-ftrunc.c 
b/clang/test/CodeGen/no-junk-ftrunc.c
index 2ab4d8c25a39c..6ae6d30fca434 100644
--- a/clang/test/CodeGen/no-junk-ftrunc.c
+++ b/clang/test/CodeGen/no-junk-ftrunc.c
@@ -1,14 +1,22 @@
 // RUN: %clang_cc1 -S -fno-strict-float-cast-overflow %s -emit-llvm -o - | 
FileCheck %s --check-prefix=NOSTRICT
+
+// When compiling with non-standard semantics, use intrinsics to inhibit the 
optimizer.
+
 // NOSTRICT-LABEL: main
+// NOSTRICT: call i32 @llvm.fptosi.sat.i32.f64
+// NOSTRICT: call i32 @llvm.fptoui.sat.i32.f64
 // NOSTRICT: attributes #0 = {{.*}}"strict-float-cast-overflow"="false"{{.*}}
 
 // The workaround attribute is not applied by default.
 
 // RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefix=STRICT
 // STRICT-LABEL: main
+// STRICT: = fptosi
+// STRICT: = fptoui
 // STRICT-NOT: strict-float-cast-overflow
 
+
 int main() {
-  return 0;
+  double d = 1e20;
+  return (int)d != 1e20 && (unsigned)d != 1e20;
 }
-





[clang] 1a60ae0 - [InstCombine] fold mask-with-signbit-splat to icmp+select

2021-12-14 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-12-14T16:00:42-05:00
New Revision: 1a60ae02c65d26981017f59bc5918d3c2e363bfd

URL: 
https://github.com/llvm/llvm-project/commit/1a60ae02c65d26981017f59bc5918d3c2e363bfd
DIFF: 
https://github.com/llvm/llvm-project/commit/1a60ae02c65d26981017f59bc5918d3c2e363bfd.diff

LOG: [InstCombine] fold mask-with-signbit-splat to icmp+select

~(iN X s>> (N-1)) & Y --> (X s< 0) ? 0 : Y

https://alive2.llvm.org/ce/z/JKlQ9x
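For a fixed width the fold can also be checked exhaustively. A small Python model of i8 (values encoded as unsigned 0..255; the function names are illustrative, not from the patch):

```python
def ashr_i8(x, n):
    # Arithmetic shift right on an 8-bit two's-complement value.
    signed = x - 256 if x >= 128 else x
    return (signed >> n) & 0xFF

def masked(x, y):
    # ~(x s>> 7) & y
    return (~ashr_i8(x, 7) & 0xFF) & y

def selected(x, y):
    # (x s< 0) ? 0 : y -- an i8 is negative when its sign bit is set,
    # i.e. when the unsigned encoding is >= 128.
    return 0 if x >= 128 else y

# Exhaustive check over all i8 pairs confirms the rewrite:
assert all(masked(x, y) == selected(x, y)
           for x in range(256) for y in range(256))
```

The shift-by-`N-1` splats the sign bit, so the inverted mask is either all-ones (non-negative `x`, keep `y`) or all-zeros (negative `x`, result 0), which is exactly the select.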

This is similar to D111410 / 727e642e970d028049d ,
but it includes a 'not' of the signbit and so it
saves an instruction in the basic pattern.

DAGCombiner or target-specific folds can expand
this back into bit-hacks.

The diffs in the logical-select tests are not true
regressions - running early-cse and another round
of instcombine is expected in a normal opt pipeline,
and that reduces back to a minimal form as shown
in the duplicated PhaseOrdering test.

I have no understanding of the SystemZ diffs, so
I made the minimal edits suggested by FileCheck to
make that test pass again. That whole test file is
wrong though. It is running the entire optimizer (-O2)
to check IR, and then topping that by even running
codegen and checking asm. It needs to be split up.

Fixes #52631

Added: 


Modified: 
clang/test/CodeGen/SystemZ/builtins-systemz-zvector.c
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
llvm/test/Transforms/InstCombine/and.ll
llvm/test/Transforms/InstCombine/logical-select-inseltpoison.ll
llvm/test/Transforms/InstCombine/logical-select.ll
llvm/test/Transforms/InstCombine/vec_sext.ll
llvm/test/Transforms/PhaseOrdering/vector-select.ll

Removed: 




diff  --git a/clang/test/CodeGen/SystemZ/builtins-systemz-zvector.c 
b/clang/test/CodeGen/SystemZ/builtins-systemz-zvector.c
index 7cd4a951741f0..38f0c2908825a 100644
--- a/clang/test/CodeGen/SystemZ/builtins-systemz-zvector.c
+++ b/clang/test/CodeGen/SystemZ/builtins-systemz-zvector.c
@@ -3289,13 +3289,13 @@ void test_integer(void) {
   // CHECK-ASM: vsrlb
 
   vsc = vec_abs(vsc);
-  // CHECK-ASM: vlpb
+  // CHECK-ASM: vlcb
   vss = vec_abs(vss);
-  // CHECK-ASM: vlph
+  // CHECK-ASM: vlch
   vsi = vec_abs(vsi);
-  // CHECK-ASM: vlpf
+  // CHECK-ASM: vlcf
   vsl = vec_abs(vsl);
-  // CHECK-ASM: vlpg
+  // CHECK-ASM: vlcg
 
   vsc = vec_max(vsc, vsc);
   // CHECK-ASM: vmxb

diff  --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp 
b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index 9023619b14280..08cd1a7f97e60 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -2133,6 +2133,15 @@ Instruction *InstCombinerImpl::visitAnd(BinaryOperator 
&I) {
 Value *Cmp = Builder.CreateICmpSLT(X, Zero, "isneg");
 return SelectInst::Create(Cmp, Y, Zero);
   }
+  // If there's a 'not' of the shifted value, swap the select operands:
+  // ~(iN X s>> (N-1)) & Y --> (X s< 0) ? 0 : Y
+  if (match(&I, m_c_And(m_OneUse(m_Not(
+m_AShr(m_Value(X), m_SpecificInt(FullShift,
+m_Value(Y {
+Constant *Zero = ConstantInt::getNullValue(Ty);
+Value *Cmp = Builder.CreateICmpSLT(X, Zero, "isneg");
+return SelectInst::Create(Cmp, Zero, Y);
+  }
 
   // (~x) & y  -->  ~(x | (~y))  iff that gets rid of inversions
   if (sinkNotIntoOtherHandOfAndOrOr(I))

diff  --git a/llvm/test/Transforms/InstCombine/and.ll 
b/llvm/test/Transforms/InstCombine/and.ll
index edaef78b631d8..53c7f09189ff5 100644
--- a/llvm/test/Transforms/InstCombine/and.ll
+++ b/llvm/test/Transforms/InstCombine/and.ll
@@ -1463,9 +1463,8 @@ define i8 @lshr_bitwidth_mask(i8 %x, i8 %y) {
 
 define i8 @not_ashr_bitwidth_mask(i8 %x, i8 %y) {
 ; CHECK-LABEL: @not_ashr_bitwidth_mask(
-; CHECK-NEXT:[[SIGN:%.*]] = ashr i8 [[X:%.*]], 7
-; CHECK-NEXT:[[NOT:%.*]] = xor i8 [[SIGN]], -1
-; CHECK-NEXT:[[POS_OR_ZERO:%.*]] = and i8 [[NOT]], [[Y:%.*]]
+; CHECK-NEXT:[[ISNEG:%.*]] = icmp slt i8 [[X:%.*]], 0
+; CHECK-NEXT:[[POS_OR_ZERO:%.*]] = select i1 [[ISNEG]], i8 0, i8 [[Y:%.*]]
 ; CHECK-NEXT:ret i8 [[POS_OR_ZERO]]
 ;
   %sign = ashr i8 %x, 7
@@ -1477,9 +1476,8 @@ define i8 @not_ashr_bitwidth_mask(i8 %x, i8 %y) {
 define <2 x i8> @not_ashr_bitwidth_mask_vec_commute(<2 x i8> %x, <2 x i8> %py) 
{
 ; CHECK-LABEL: @not_ashr_bitwidth_mask_vec_commute(
 ; CHECK-NEXT:[[Y:%.*]] = mul <2 x i8> [[PY:%.*]], 
-; CHECK-NEXT:[[SIGN:%.*]] = ashr <2 x i8> [[X:%.*]], 
-; CHECK-NEXT:[[NOT:%.*]] = xor <2 x i8> [[SIGN]], 
-; CHECK-NEXT:[[POS_OR_ZERO:%.*]] = and <2 x i8> [[Y]], [[NOT]]
+; CHECK-NEXT:[[ISNEG:%.*]] = icmp slt <2 x i8> [[X:%.*]], zeroinitializer
+; CHECK-NEXT:[[POS_OR_ZERO:%.*]] = select <2 x i1> [[ISNEG]], <2 x i8> 
zeroinitializer, <2 x i8> [[Y]]
 ; CHECK-NEXT:ret <2 x i8> [[POS_OR_ZERO]]
 ;
   %y = mul <2 x i8> %py,   ; thwart complexity-based ordering
@@ -1

[clang] 1ee851c - Revert "[CodeGen] regenerate test checks; NFC"

2021-09-22 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-09-22T07:45:21-04:00
New Revision: 1ee851c5859fdb36eca57a46347a1e7b8e1ff236

URL: 
https://github.com/llvm/llvm-project/commit/1ee851c5859fdb36eca57a46347a1e7b8e1ff236
DIFF: 
https://github.com/llvm/llvm-project/commit/1ee851c5859fdb36eca57a46347a1e7b8e1ff236.diff

LOG: Revert "[CodeGen] regenerate test checks; NFC"

This reverts commit 52832cd917af00e2b9c6a9d1476ba79754dcabff.
The motivating commit 2f6b07316f5 caused several bots to hit
an infinite loop at stage 2, so that needs to be reverted too
while figuring out how to fix that.

Added: 


Modified: 
clang/test/CodeGen/aapcs-bitfield.c

Removed: 




diff  --git a/clang/test/CodeGen/aapcs-bitfield.c 
b/clang/test/CodeGen/aapcs-bitfield.c
index 316986c764bc..13db68d6ae81 100644
--- a/clang/test/CodeGen/aapcs-bitfield.c
+++ b/clang/test/CodeGen/aapcs-bitfield.c
@@ -1034,7 +1034,7 @@ struct st6 {
 // LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 4
 // LE-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* 
[[M]], i32 0, i32 1
-// LE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
[[TBAA3:![0-9]+]]
+// LE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa !3
 // LE-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // LE-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // LE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* 
[[M]], i32 0, i32 2
@@ -1052,7 +1052,7 @@ struct st6 {
 // BE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 4
 // BE-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* 
[[M]], i32 0, i32 1
-// BE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
[[TBAA3:![0-9]+]]
+// BE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa !3
 // BE-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // BE-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // BE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* 
[[M]], i32 0, i32 2
@@ -1070,7 +1070,7 @@ struct st6 {
 // LENUMLOADS-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 4
 // LENUMLOADS-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LENUMLOADS-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 1
-// LENUMLOADS-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, 
!tbaa [[TBAA3:![0-9]+]]
+// LENUMLOADS-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, 
!tbaa !3
 // LENUMLOADS-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // LENUMLOADS-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // LENUMLOADS-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 2
@@ -1088,7 +1088,7 @@ struct st6 {
 // BENUMLOADS-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 4
 // BENUMLOADS-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BENUMLOADS-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 1
-// BENUMLOADS-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, 
!tbaa [[TBAA3:![0-9]+]]
+// BENUMLOADS-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, 
!tbaa !3
 // BENUMLOADS-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // BENUMLOADS-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // BENUMLOADS-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 2
@@ -1106,7 +1106,7 @@ struct st6 {
 // LEWIDTH-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 4
 // LEWIDTH-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LEWIDTH-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 1
-// LEWIDTH-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
[[TBAA3:![0-9]+]]
+// LEWIDTH-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
!3
 // LEWIDTH-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // LEWIDTH-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // LEWIDTH-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 2
@@ -1124,7 +1124,7 @@ struct st6 {
 // BEWIDTH-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 4
 // BEWIDTH-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BEWIDTH-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 1
-// BEWIDTH-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
[[TBAA3:![0-9]+]]
+// BEWIDTH-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
!3
 // BEWIDTH-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // BEWIDTH-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // BEWIDTH-NEXT:[[C:%.*]] = getele

[clang] 52832cd - [CodeGen] regenerate test checks; NFC

2021-09-21 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-09-21T16:53:41-04:00
New Revision: 52832cd917af00e2b9c6a9d1476ba79754dcabff

URL: https://github.com/llvm/llvm-project/commit/52832cd917af00e2b9c6a9d1476ba79754dcabff
DIFF: https://github.com/llvm/llvm-project/commit/52832cd917af00e2b9c6a9d1476ba79754dcabff.diff

LOG: [CodeGen] regenerate test checks; NFC

These test checks broke with 2f6b07316f56 because the test wrongly runs the entire LLVM optimizer.

Added: 


Modified: 
clang/test/CodeGen/aapcs-bitfield.c

Removed: 




diff  --git a/clang/test/CodeGen/aapcs-bitfield.c 
b/clang/test/CodeGen/aapcs-bitfield.c
index 13db68d6ae81..316986c764bc 100644
--- a/clang/test/CodeGen/aapcs-bitfield.c
+++ b/clang/test/CodeGen/aapcs-bitfield.c
@@ -1034,7 +1034,7 @@ struct st6 {
 // LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 4
 // LE-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* 
[[M]], i32 0, i32 1
-// LE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa !3
+// LE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
[[TBAA3:![0-9]+]]
 // LE-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // LE-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // LE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* 
[[M]], i32 0, i32 2
@@ -1052,7 +1052,7 @@ struct st6 {
 // BE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 4
 // BE-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BE-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* 
[[M]], i32 0, i32 1
-// BE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa !3
+// BE-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
[[TBAA3:![0-9]+]]
 // BE-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // BE-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // BE-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], %struct.st6* 
[[M]], i32 0, i32 2
@@ -1070,7 +1070,7 @@ struct st6 {
 // LENUMLOADS-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 4
 // LENUMLOADS-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LENUMLOADS-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 1
-// LENUMLOADS-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, 
!tbaa !3
+// LENUMLOADS-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, 
!tbaa [[TBAA3:![0-9]+]]
 // LENUMLOADS-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // LENUMLOADS-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // LENUMLOADS-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 2
@@ -1088,7 +1088,7 @@ struct st6 {
 // BENUMLOADS-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 4
 // BENUMLOADS-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BENUMLOADS-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 1
-// BENUMLOADS-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, 
!tbaa !3
+// BENUMLOADS-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, 
!tbaa [[TBAA3:![0-9]+]]
 // BENUMLOADS-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // BENUMLOADS-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // BENUMLOADS-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 2
@@ -1106,7 +1106,7 @@ struct st6 {
 // LEWIDTH-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 4
 // LEWIDTH-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // LEWIDTH-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 1
-// LEWIDTH-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
!3
+// LEWIDTH-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
[[TBAA3:![0-9]+]]
 // LEWIDTH-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // LEWIDTH-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // LEWIDTH-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 2
@@ -1124,7 +1124,7 @@ struct st6 {
 // BEWIDTH-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 4
 // BEWIDTH-NEXT:[[BF_CAST:%.*]] = sext i16 [[BF_ASHR]] to i32
 // BEWIDTH-NEXT:[[B:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 1
-// BEWIDTH-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
!3
+// BEWIDTH-NEXT:[[TMP1:%.*]] = load volatile i8, i8* [[B]], align 2, !tbaa 
[[TBAA3:![0-9]+]]
 // BEWIDTH-NEXT:[[CONV:%.*]] = sext i8 [[TMP1]] to i32
 // BEWIDTH-NEXT:[[ADD:%.*]] = add nsw i32 [[BF_CAST]], [[CONV]]
 // BEWIDTH-NEXT:[[C:%.*]] = getelementptr inbounds [[STRUCT_ST6]], 
%struct.st6* [[M]], i32 0, i32 2
@@ -1142,7 +1142,7 @@ struct st6 {
 // LEWIDTHNUM-NEXT:[[BF_ASHR:%.*]] = ashr e

[clang] cc86b87 - [CodeGen] limit tests to current pass manager to avoid variability; NFC

2021-06-10 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-06-10T08:50:06-04:00
New Revision: cc86b87a57000ba673edaf95f65913412928f003

URL: https://github.com/llvm/llvm-project/commit/cc86b87a57000ba673edaf95f65913412928f003
DIFF: https://github.com/llvm/llvm-project/commit/cc86b87a57000ba673edaf95f65913412928f003.diff

LOG: [CodeGen] limit tests to current pass manager to avoid variability; NFC

Post-commit feedback for d69c4372bfbe says the output
may vary between pass managers. This is hopefully a
quick fix, but we might want to investigate how to
better solve this type of problem.

Added: 


Modified: 
clang/test/CodeGen/aarch64-bf16-dotprod-intrinsics.c
clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
clang/test/CodeGen/aarch64-bf16-lane-intrinsics.c
clang/test/CodeGen/arm-bf16-convert-intrinsics.c
clang/test/CodeGen/arm-bf16-dotprod-intrinsics.c
clang/test/CodeGen/arm-bf16-getset-intrinsics.c

Removed: 




diff  --git a/clang/test/CodeGen/aarch64-bf16-dotprod-intrinsics.c 
b/clang/test/CodeGen/aarch64-bf16-dotprod-intrinsics.c
index 4fe6c6b43aa1..966c50f62c8b 100644
--- a/clang/test/CodeGen/aarch64-bf16-dotprod-intrinsics.c
+++ b/clang/test/CodeGen/aarch64-bf16-dotprod-intrinsics.c
@@ -1,6 +1,6 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-feature +neon 
-target-feature +bf16 \
-// RUN: -disable-O0-optnone -emit-llvm %s -o - | opt -S -mem2reg | FileCheck %s
+// RUN: -disable-O0-optnone -emit-llvm -fno-legacy-pass-manager %s -o - | opt 
-S -mem2reg | FileCheck %s
 
 #include 
 

diff  --git a/clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c 
b/clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
index 3cb9f0dd8db2..7f3bf7f1ec30 100644
--- a/clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
+++ b/clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c
@@ -1,6 +1,6 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-feature +neon 
-target-feature +bf16 \
-// RUN:  -disable-O0-optnone -emit-llvm %s -o - | opt -S -mem2reg | FileCheck 
%s
+// RUN:  -disable-O0-optnone -emit-llvm -fno-legacy-pass-manager %s -o - | opt 
-S -mem2reg | FileCheck %s
 
 #include 
 

diff  --git a/clang/test/CodeGen/aarch64-bf16-lane-intrinsics.c 
b/clang/test/CodeGen/aarch64-bf16-lane-intrinsics.c
index 694dc65cc7d6..b5a2c20f2e32 100644
--- a/clang/test/CodeGen/aarch64-bf16-lane-intrinsics.c
+++ b/clang/test/CodeGen/aarch64-bf16-lane-intrinsics.c
@@ -1,8 +1,8 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-feature +neon 
-target-feature +bf16 \
-// RUN:  -disable-O0-optnone -emit-llvm %s -o - | opt -S -mem2reg | FileCheck 
--check-prefix=CHECK-LE %s
+// RUN:  -disable-O0-optnone -emit-llvm -fno-legacy-pass-manager %s -o - | opt 
-S -mem2reg | FileCheck --check-prefix=CHECK-LE %s
 // RUN: %clang_cc1 -triple aarch64_be-arm-none-eabi -target-feature +neon 
-target-feature +bf16 \
-// RUN:  -disable-O0-optnone -emit-llvm %s -o - | opt -S -mem2reg | FileCheck 
--check-prefix=CHECK-BE %s
+// RUN:  -disable-O0-optnone -emit-llvm %s -fno-legacy-pass-manager -o - | opt 
-S -mem2reg | FileCheck --check-prefix=CHECK-BE %s
 
 #include 
 

diff  --git a/clang/test/CodeGen/arm-bf16-convert-intrinsics.c 
b/clang/test/CodeGen/arm-bf16-convert-intrinsics.c
index 75304a43f05c..e77e92115f6c 100644
--- a/clang/test/CodeGen/arm-bf16-convert-intrinsics.c
+++ b/clang/test/CodeGen/arm-bf16-convert-intrinsics.c
@@ -1,19 +1,19 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 \
 // RUN:   -triple aarch64-arm-none-eabi -target-feature +neon -target-feature 
+bf16 \
-// RUN:   -disable-O0-optnone -emit-llvm -o - %s \
+// RUN:   -disable-O0-optnone -emit-llvm -fno-legacy-pass-manager -o - %s \
 // RUN:   | opt -S -mem2reg \
 // RUN:   | FileCheck --check-prefixes=CHECK,CHECK-A64 %s
 // RUN: %clang_cc1 \
 // RUN:   -triple armv8.6a-arm-none-eabi -target-feature +neon \
 // RUN:   -target-feature +bf16 -mfloat-abi hard \
-// RUN:   -disable-O0-optnone -emit-llvm -o - %s \
+// RUN:   -disable-O0-optnone -emit-llvm -fno-legacy-pass-manager -o - %s \
 // RUN:   | opt -S -mem2reg \
 // RUN:   | FileCheck --check-prefixes=CHECK,CHECK-A32-HARDFP %s
 // RUN: %clang_cc1 \
 // RUN:   -triple armv8.6a-arm-none-eabi -target-feature +neon \
 // RUN:   -target-feature +bf16 -mfloat-abi softfp \
-// RUN:   -disable-O0-optnone -emit-llvm -o - %s \
+// RUN:   -disable-O0-optnone -emit-llvm -fno-legacy-pass-manager -o - %s \
 // RUN:   | opt -S -mem2reg \
 // RUN:   | FileCheck --check-prefixes=CHECK,CHECK-A32-SOFTFP %s
 

diff  --git a/clang/test/CodeGen/arm-bf16-dotprod-intrinsics.c 
b/clang/test/CodeGen/arm-bf16-dotprod-intrinsics.c
index 1211f2ffa732..2fdb9f1c2

[clang] 16e78ec - [Headers][WASM] adjust test that runs the optimizer; NFC

2021-05-25 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-05-25T09:17:10-04:00
New Revision: 16e78ec0b43c33c818525ea9b5d39731022f1cbb

URL: https://github.com/llvm/llvm-project/commit/16e78ec0b43c33c818525ea9b5d39731022f1cbb
DIFF: https://github.com/llvm/llvm-project/commit/16e78ec0b43c33c818525ea9b5d39731022f1cbb.diff

LOG: [Headers][WASM] adjust test that runs the optimizer; NFC

This broke with the LLVM change in 0bab0f616119

Added: 


Modified: 
clang/test/Headers/wasm.c

Removed: 




diff  --git a/clang/test/Headers/wasm.c b/clang/test/Headers/wasm.c
index 8b87eb2c0e2e..c170e75259d1 100644
--- a/clang/test/Headers/wasm.c
+++ b/clang/test/Headers/wasm.c
@@ -1,6 +1,8 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --force-update
 // REQUIRES: webassembly-registered-target, asserts
 
+// FIXME: This should not be using -O2 and implicitly testing the entire IR opt pipeline.
+
// RUN: %clang %s -O2 -emit-llvm -S -o - -target wasm32-unknown-unknown -msimd128 -Wcast-qual -fno-lax-vector-conversions -Werror | FileCheck %s
 
 #include 
@@ -582,7 +584,7 @@ v128_t test_i64x2_replace_lane(v128_t a, int64_t b) {
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[VECINIT_I:%.*]] = insertelement <4 x float> undef, float 
[[A:%.*]], i32 0
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x float> [[VECINIT_I]] to <4 x i32>
-// CHECK-NEXT:[[TMP1:%.*]] = shufflevector <4 x i32> [[TMP0]], <4 x i32> 
poison, <4 x i32> zeroinitializer
+// CHECK-NEXT:[[TMP1:%.*]] = shufflevector <4 x i32> [[TMP0]], <4 x i32> 
undef, <4 x i32> zeroinitializer
 // CHECK-NEXT:ret <4 x i32> [[TMP1]]
 //
 v128_t test_f32x4_splat(float a) {



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 661cc71 - [PassManager][PhaseOrdering] lower expects before running simplifyCFG

2021-04-12 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-04-12T15:07:53-04:00
New Revision: 661cc71a1c50081389d73b2baae02f51df670ba1

URL: https://github.com/llvm/llvm-project/commit/661cc71a1c50081389d73b2baae02f51df670ba1
DIFF: https://github.com/llvm/llvm-project/commit/661cc71a1c50081389d73b2baae02f51df670ba1.diff

LOG: [PassManager][PhaseOrdering] lower expects before running simplifyCFG

Retry of 330619a3a623 that includes a clang test update.

Original commit message:

If we run passes before lowering llvm.expect intrinsics to metadata,
then those passes have no way to act on the hints provided by llvm.expect.
SimplifyCFG is the known offender, and we made it smarter about profile
metadata in D98898 .

In the motivating example from https://llvm.org/PR49336 , this means we
were ignoring the recommended method for a programmer to tell the compiler
that a compare+branch is expensive. This change appears to solve that case -
the metadata survives to the backend, the compare order is as expected in IR,
and the backend does not do anything to reverse it.

We make the same change to the old pass manager to keep things synchronized.

Differential Revision: https://reviews.llvm.org/D100213

Added: 


Modified: 
clang/test/CodeGen/thinlto-distributed-newpm.ll
llvm/lib/Passes/PassBuilder.cpp
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
llvm/test/CodeGen/AMDGPU/opt-pipeline.ll
llvm/test/Other/new-pm-defaults.ll
llvm/test/Other/new-pm-pgo.ll
llvm/test/Other/new-pm-thinlto-defaults.ll
llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll
llvm/test/Other/opt-O2-pipeline.ll
llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
llvm/test/Other/opt-O3-pipeline.ll
llvm/test/Other/opt-Os-pipeline.ll
llvm/test/Transforms/PhaseOrdering/expect.ll

Removed: 




diff  --git a/clang/test/CodeGen/thinlto-distributed-newpm.ll 
b/clang/test/CodeGen/thinlto-distributed-newpm.ll
index 1e9d5d4d2629c..398b65116bbd7 100644
--- a/clang/test/CodeGen/thinlto-distributed-newpm.ll
+++ b/clang/test/CodeGen/thinlto-distributed-newpm.ll
@@ -31,6 +31,7 @@
 ; CHECK-O: Running analysis: OptimizationRemarkEmitterAnalysis on main
 ; CHECK-O: Running pass: InferFunctionAttrsPass
 ; CHECK-O: Starting {{.*}}Function pass manager run.
+; CHECK-O: Running pass: LowerExpectIntrinsicPass on main
 ; CHECK-O: Running pass: SimplifyCFGPass on main
 ; CHECK-O: Running analysis: TargetIRAnalysis on main
 ; CHECK-O: Running analysis: AssumptionAnalysis on main
@@ -38,7 +39,6 @@
 ; CHECK-O: Running analysis: DominatorTreeAnalysis on main
 ; CHECK-O: Running pass: EarlyCSEPass on main
 ; CHECK-O: Running analysis: TargetLibraryAnalysis on main
-; CHECK-O: Running pass: LowerExpectIntrinsicPass on main
 ; CHECK-O3: Running pass: CallSiteSplittingPass on main
 ; CHECK-O: Finished {{.*}}Function pass manager run.
 ; CHECK-O: Running pass: LowerTypeTestsPass

diff  --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 6307e468e7017..75ba9da4214d7 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -1066,10 +1066,12 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // Create an early function pass manager to cleanup the output of the
   // frontend.
   FunctionPassManager EarlyFPM(DebugLogging);
+  // Lower llvm.expect to metadata before attempting transforms.
+  // Compare/branch metadata may alter the behavior of passes like SimplifyCFG.
+  EarlyFPM.addPass(LowerExpectIntrinsicPass());
   EarlyFPM.addPass(SimplifyCFGPass());
   EarlyFPM.addPass(SROA());
   EarlyFPM.addPass(EarlyCSEPass());
-  EarlyFPM.addPass(LowerExpectIntrinsicPass());
   if (PTO.Coroutines)
 EarlyFPM.addPass(CoroEarlyPass());
   if (Level == OptimizationLevel::O3)

diff  --git a/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp 
b/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
index 2c80a16febeff..19e212f738ade 100644
--- a/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
+++ b/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -316,10 +316,12 @@ void PassManagerBuilder::populateFunctionPassManager(
 
   addInitialAliasAnalysisPasses(FPM);
 
+  // Lower llvm.expect to metadata before attempting transforms.
+  // Compare/branch metadata may alter the behavior of passes like SimplifyCFG.
+  FPM.add(createLowerExpectIntrinsicPass());
   FPM.add(createCFGSimplificationPass());
   FPM.add(createSROAPass());
   FPM.add(createEarlyCSEPass());
-  FPM.add(createLowerExpectIntrinsicPass());
 }
 
 // Do PGO instrumentation generation or use pass as the option specified.

diff  --git a/llvm/test/CodeGen/AMDGPU/opt-pipeline.ll 
b/llvm/test/CodeGen/AMDGPU/opt-pipeline.ll
index 5a0531017

[clang] ee8b538 - [BranchProbability] move options for 'likely' and 'unlikely'

2021-03-20 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2021-03-20T14:46:46-04:00
New Revision: ee8b53815ddf6f6f94ade0068903cd5ae843fafa

URL: https://github.com/llvm/llvm-project/commit/ee8b53815ddf6f6f94ade0068903cd5ae843fafa
DIFF: https://github.com/llvm/llvm-project/commit/ee8b53815ddf6f6f94ade0068903cd5ae843fafa.diff

LOG: [BranchProbability] move options for 'likely' and 'unlikely'

This makes the settings available for use in other passes by housing
them within the Support lib, but NFC otherwise.

See D98898 for the proposed usage in SimplifyCFG
(where this change was originally included).

Differential Revision: https://reviews.llvm.org/D98945

Added: 


Modified: 
clang/lib/CodeGen/CodeGenFunction.cpp
llvm/include/llvm/Support/BranchProbability.h
llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h
llvm/lib/Support/BranchProbability.cpp
llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CodeGenFunction.cpp 
b/clang/lib/CodeGen/CodeGenFunction.cpp
index a00ae74fa165..18927b46958c 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -42,8 +42,8 @@
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/MDBuilder.h"
 #include "llvm/IR/Operator.h"
+#include "llvm/Support/BranchProbability.h"
 #include "llvm/Support/CRC.h"
-#include "llvm/Transforms/Scalar/LowerExpectIntrinsic.h"
 #include "llvm/Transforms/Utils/PromoteMemToReg.h"
 using namespace clang;
 using namespace CodeGen;

diff  --git a/llvm/include/llvm/Support/BranchProbability.h 
b/llvm/include/llvm/Support/BranchProbability.h
index 6c7ad1fe2a52..f977c70221a5 100644
--- a/llvm/include/llvm/Support/BranchProbability.h
+++ b/llvm/include/llvm/Support/BranchProbability.h
@@ -13,6 +13,7 @@
 #ifndef LLVM_SUPPORT_BRANCHPROBABILITY_H
 #define LLVM_SUPPORT_BRANCHPROBABILITY_H
 
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/DataTypes.h"
 #include 
 #include 
@@ -21,6 +22,9 @@
 
 namespace llvm {
 
+extern cl::opt LikelyBranchWeight;
+extern cl::opt UnlikelyBranchWeight;
+
 class raw_ostream;
 
 // This class represents Branch Probability as a non-negative fraction that is

diff  --git a/llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h 
b/llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h
index 22b2e649e4d4..4e47ff70d557 100644
--- a/llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h
+++ b/llvm/include/llvm/Transforms/Scalar/LowerExpectIntrinsic.h
@@ -17,7 +17,6 @@
 
 #include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
-#include "llvm/Support/CommandLine.h"
 
 namespace llvm {
 
@@ -32,8 +31,6 @@ struct LowerExpectIntrinsicPass : 
PassInfoMixin {
   PreservedAnalyses run(Function &F, FunctionAnalysisManager &);
 };
 
-extern cl::opt LikelyBranchWeight;
-extern cl::opt UnlikelyBranchWeight;
 }
 
 #endif

diff  --git a/llvm/lib/Support/BranchProbability.cpp 
b/llvm/lib/Support/BranchProbability.cpp
index 60d5478a9052..d93d9cffb9f7 100644
--- a/llvm/lib/Support/BranchProbability.cpp
+++ b/llvm/lib/Support/BranchProbability.cpp
@@ -19,6 +19,20 @@
 
 using namespace llvm;
 
+// These default values are chosen to represent an extremely skewed outcome for
+// a condition, but they leave some room for interpretation by later passes.
+//
+// If the documentation for __builtin_expect() was made explicit that it should
+// only be used in extreme cases, we could make this ratio higher. As it stands,
+// programmers may be using __builtin_expect() / llvm.expect to annotate that a
+// branch is only mildly likely or unlikely to be taken.
+cl::opt llvm::LikelyBranchWeight(
+"likely-branch-weight", cl::Hidden, cl::init(2000),
+cl::desc("Weight of the branch likely to be taken (default = 2000)"));
+cl::opt llvm::UnlikelyBranchWeight(
+"unlikely-branch-weight", cl::Hidden, cl::init(1),
+cl::desc("Weight of the branch unlikely to be taken (default = 1)"));
+
 constexpr uint32_t BranchProbability::D;
 
 raw_ostream &BranchProbability::print(raw_ostream &OS) const {

diff  --git a/llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp 
b/llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
index da13075dfee2..d862fcfe8ce5 100644
--- a/llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
+++ b/llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp
@@ -24,6 +24,7 @@
 #include "llvm/IR/Metadata.h"
 #include "llvm/InitializePasses.h"
 #include "llvm/Pass.h"
+#include "llvm/Support/BranchProbability.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Transforms/Scalar.h"
 
@@ -34,25 +35,6 @@ using namespace llvm;
 STATISTIC(ExpectIntrinsicsHandled,
   "Number of 'expect' intrinsic instructions handled");
 
-// These default values are chosen to represent an extremely skewed outcome for
-// a condition, but they leave some room for interpretation by later passes.
-//
-// If the documentation for __builtin_expect() was made explicit 

[clang] 57cdc52 - Initial support for vectorization using Libmvec (GLIBC vector math library)

2020-10-22 Thread Sanjay Patel via cfe-commits

Author: Venkataramanan Kumar
Date: 2020-10-22T16:01:39-04:00
New Revision: 57cdc52c4df0a8a6835ddeede787b23c0ce9e358

URL: https://github.com/llvm/llvm-project/commit/57cdc52c4df0a8a6835ddeede787b23c0ce9e358
DIFF: https://github.com/llvm/llvm-project/commit/57cdc52c4df0a8a6835ddeede787b23c0ce9e358.diff

LOG: Initial support for vectorization using Libmvec (GLIBC vector math library)

Differential Revision: https://reviews.llvm.org/D88154

Added: 
llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-VF2-VF8.ll
llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-finite.ll
llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll

Modified: 
clang/include/clang/Basic/CodeGenOptions.def
clang/include/clang/Basic/CodeGenOptions.h
clang/include/clang/Driver/Options.td
clang/lib/CodeGen/BackendUtil.cpp
clang/lib/Frontend/CompilerInvocation.cpp
clang/test/Driver/autocomplete.c
clang/test/Driver/fveclib.c
llvm/include/llvm/Analysis/TargetLibraryInfo.h
llvm/include/llvm/Analysis/VecFuncs.def
llvm/lib/Analysis/TargetLibraryInfo.cpp
llvm/test/Transforms/Util/add-TLI-mappings.ll

Removed: 




diff  --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 0ab9054f0bb5..f5222b50fc7b 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -348,7 +348,7 @@ CODEGENOPT(CodeViewGHash, 1, 0)
 ENUM_CODEGENOPT(Inlining, InliningMethod, 2, NormalInlining)
 
 // Vector functions library to use.
-ENUM_CODEGENOPT(VecLib, VectorLibrary, 2, NoLibrary)
+ENUM_CODEGENOPT(VecLib, VectorLibrary, 3, NoLibrary)
 
 /// The default TLS model to use.
 ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel)

diff  --git a/clang/include/clang/Basic/CodeGenOptions.h 
b/clang/include/clang/Basic/CodeGenOptions.h
index f658c9b8a781..764d0a17cb72 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -54,11 +54,11 @@ class CodeGenOptions : public CodeGenOptionsBase {
   enum VectorLibrary {
 NoLibrary,  // Don't use any vector library.
 Accelerate, // Use the Accelerate framework.
+LIBMVEC,// GLIBC vector math library.
 MASSV,  // IBM MASS vector library.
 SVML// Intel short vector math library.
   };
 
-
   enum ObjCDispatchMethodKind {
 Legacy = 0,
 NonLegacy = 1,

diff  --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1d930dbed9c4..0cab3e8ecc16 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1582,7 +1582,7 @@ def fno_experimental_new_pass_manager : Flag<["-"], 
"fno-experimental-new-pass-m
   Group, Flags<[CC1Option]>,
   HelpText<"Disables an experimental new pass manager in LLVM.">;
 def fveclib : Joined<["-"], "fveclib=">, Group, Flags<[CC1Option]>,
-HelpText<"Use the given vector functions library">, Values<"Accelerate,MASSV,SVML,none">;
+HelpText<"Use the given vector functions library">, Values<"Accelerate,libmvec,MASSV,SVML,none">;
 def fno_lax_vector_conversions : Flag<["-"], "fno-lax-vector-conversions">, 
Group,
   Alias, AliasArgs<["none"]>;
 def fno_merge_all_constants : Flag<["-"], "fno-merge-all-constants">, 
Group,

diff  --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index b9c90eee64ed..0991582005b8 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -371,6 +371,16 @@ static TargetLibraryInfoImpl *createTLII(llvm::Triple 
&TargetTriple,
   case CodeGenOptions::Accelerate:
 
TLII->addVectorizableFunctionsFromVecLib(TargetLibraryInfoImpl::Accelerate);
 break;
+  case CodeGenOptions::LIBMVEC:
+switch(TargetTriple.getArch()) {
+  default:
+break;
+  case llvm::Triple::x86_64:
+TLII->addVectorizableFunctionsFromVecLib
+(TargetLibraryInfoImpl::LIBMVEC_X86);
+break;
+}
+break;
   case CodeGenOptions::MASSV:
 TLII->addVectorizableFunctionsFromVecLib(TargetLibraryInfoImpl::MASSV);
 break;

diff  --git a/clang/lib/Frontend/CompilerInvocation.cpp 
b/clang/lib/Frontend/CompilerInvocation.cpp
index ff0df26dc1dd..5175063adb02 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -749,6 +749,8 @@ static bool ParseCodeGenArgs(CodeGenOptions &Opts, ArgList 
&Args, InputKind IK,
 StringRef Name = A->getValue();
 if (Name == "Accelerate")
   Opts.setVecLib(CodeGenOptions::Accelerate);
+else if (Name == "libmvec")
+  Opts.setVecLib(CodeGenOptions::LIBMVEC);
 else if (Name == "MASSV")
   Opts.setVecLib(CodeGenOptions::MASSV);
 else if (Name == "SVML")

diff  --git a/clang/test/Driver/autocomplete.c 
b/clang/test/Driver/autocomplete.c
index a6e7be887c8c..c9f66d

[clang] 149f5b5 - [APFloat] convert SNaN to QNaN in convert() and raise Invalid signal

2020-10-01 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-10-01T14:37:38-04:00
New Revision: 149f5b573c79eac0c519ada4d2f7c50e17796cdf

URL: https://github.com/llvm/llvm-project/commit/149f5b573c79eac0c519ada4d2f7c50e17796cdf
DIFF: https://github.com/llvm/llvm-project/commit/149f5b573c79eac0c519ada4d2f7c50e17796cdf.diff

LOG: [APFloat] convert SNaN to QNaN in convert() and raise Invalid signal

This is an alternate fix (see D87835) for a bug where a NaN constant
gets wrongly transformed into Infinity via truncation.
In this patch, we uniformly convert any SNaN to QNaN while raising
'invalid op'.
But we don't have a way to directly specify a 32-bit SNaN value in LLVM IR,
so those are always encoded/decoded by calling convert from/to 64-bit hex.

See D88664 for a clang fix needed to allow this change.

Differential Revision: https://reviews.llvm.org/D88238

Added: 


Modified: 
clang/test/CodeGen/builtin-nan-exception.c
clang/test/CodeGen/builtin-nan-legacy.c
clang/test/CodeGen/mips-unsupported-nan.c
llvm/lib/AsmParser/LLParser.cpp
llvm/lib/IR/AsmWriter.cpp
llvm/lib/Support/APFloat.cpp
llvm/test/Transforms/InstSimplify/ConstProp/cast.ll
llvm/test/Transforms/PhaseOrdering/X86/nancvt.ll
llvm/unittests/ADT/APFloatTest.cpp

Removed: 




diff  --git a/clang/test/CodeGen/builtin-nan-exception.c 
b/clang/test/CodeGen/builtin-nan-exception.c
index a0de25e52ebe6..7445411ddf89e 100644
--- a/clang/test/CodeGen/builtin-nan-exception.c
+++ b/clang/test/CodeGen/builtin-nan-exception.c
@@ -17,8 +17,12 @@ float f[] = {
 
 
 // Doubles are created and converted to floats.
+// Converting (truncating) to float quiets the NaN (sets the MSB
+// of the significand) and raises the APFloat invalidOp exception
+// but that should not cause a compilation error in the default
+// (ignore FP exceptions) mode.
 
-// CHECK: float 0x7FF8, float 0x7FF4
+// CHECK: float 0x7FF8, float 0x7FFC
 
 float converted_to_float[] = {
   __builtin_nan(""),

diff  --git a/clang/test/CodeGen/builtin-nan-legacy.c 
b/clang/test/CodeGen/builtin-nan-legacy.c
index cd0f0fd14f14c..de6c15379a4dd 100644
--- a/clang/test/CodeGen/builtin-nan-legacy.c
+++ b/clang/test/CodeGen/builtin-nan-legacy.c
@@ -1,7 +1,15 @@
 // RUN: %clang -target mipsel-unknown-linux -mnan=legacy -emit-llvm -S %s -o - 
| FileCheck %s
-// CHECK: float 0x7FF4, float 0x7FF8
+// CHECK: float 0x7FFC, float 0x7FF8
 // CHECK: double 0x7FF4, double 0x7FF8
 
+// The first line shows an unintended consequence.
+// __builtin_nan() creates a legacy QNAN double with an empty payload
+// (the first bit of the significand is clear to indicate quiet, so
+// the second bit of the payload is set to maintain NAN-ness).
+// The value is then truncated, but llvm::APFloat does not know about
+// the inverted quiet bit, so it sets the first bit on conversion
+// to indicate 'quiet' independently of the setting in clang.
+
 float f[] = {
   __builtin_nan(""),
   __builtin_nans(""),

diff  --git a/clang/test/CodeGen/mips-unsupported-nan.c 
b/clang/test/CodeGen/mips-unsupported-nan.c
index 2fd5042e92f8e..16cea3c2e7e18 100644
--- a/clang/test/CodeGen/mips-unsupported-nan.c
+++ b/clang/test/CodeGen/mips-unsupported-nan.c
@@ -39,7 +39,21 @@
 // CHECK-MIPS64: warning: ignoring '-mnan=2008' option because the 'mips64' 
architecture does not support it
 // CHECK-MIPS64R6: warning: ignoring '-mnan=legacy' option because the 
'mips64r6' architecture does not support it
 
-// CHECK-NANLEGACY: float 0x7FF4
+// This call creates a QNAN double with an empty payload.
+// The quiet bit is inverted in legacy mode: it is clear to indicate QNAN,
+// so the next highest bit is set to maintain NAN (not infinity).
+// In regular (2008) mode, the quiet bit is set to indicate QNAN.
+
+// CHECK-NANLEGACY: double 0x7FF4
+// CHECK-NAN2008: double 0x7FF8
+
+double d =  __builtin_nan("");
+
+// This call creates a QNAN double with an empty payload and then truncates.
+// llvm::APFloat does not know about the inverted quiet bit, so it sets the
+// quiet bit on conversion independently of the setting in clang.
+
+// CHECK-NANLEGACY: float 0x7FFC
 // CHECK-NAN2008: float 0x7FF8
 
 float f =  __builtin_nan("");

diff  --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 63f8531dbdced..4e1ae4faa4e19 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -5345,6 +5345,8 @@ bool LLParser::ConvertValIDToValue(Type *Ty, ValID &ID, 
Value *&V,
 // The lexer has no type info, so builds all half, bfloat, float, and 
double
 // FP constants as double.  Fix this here.  Long double does not need this.
 if (&ID.APFloatVal.getSemantics() == &APFloat::IEEEdouble()) {
+  // Check for signaling before potentially 

[clang] 686eb0d - [AST] do not error on APFloat invalidOp in default mode

2020-10-01 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-10-01T13:46:45-04:00
New Revision: 686eb0d8ded9159b090c3ef7b33a422e1f05166e

URL: 
https://github.com/llvm/llvm-project/commit/686eb0d8ded9159b090c3ef7b33a422e1f05166e
DIFF: 
https://github.com/llvm/llvm-project/commit/686eb0d8ded9159b090c3ef7b33a422e1f05166e.diff

LOG: [AST] do not error on APFloat invalidOp in default mode

If FP exceptions are ignored, we should not error out of compilation
just because APFloat indicated an exception.
This is required as a preliminary step for D88238
which changes APFloat behavior for signaling NaN convert() to set
the opInvalidOp exception status.

Currently, there is no way to trigger this error because convert()
never sets opInvalidOp. FP binops that set opInvalidOp also create
a NaN, so the path to checkFloatingPointResult() is blocked by a
different diagnostic:

  // [expr.pre]p4:
  //   If during the evaluation of an expression, the result is not
  //   mathematically defined [...], the behavior is undefined.
  // FIXME: C++ rules require us to not conform to IEEE 754 here.
  if (LHS.isNaN()) {
Info.CCEDiag(E, diag::note_constexpr_float_arithmetic) << LHS.isNaN();
return Info.noteUndefinedBehavior();
  }
  return checkFloatingPointResult(Info, E, St);

Differential Revision: https://reviews.llvm.org/D88664

Added: 


Modified: 
clang/lib/AST/ExprConstant.cpp

Removed: 




diff  --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index b17eed2dc823..4460e3a17e6d 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -2439,7 +2439,8 @@ static bool checkFloatingPointResult(EvalInfo &Info, 
const Expr *E,
 return false;
   }
 
-  if (St & APFloat::opStatus::opInvalidOp) {
+  if ((St & APFloat::opStatus::opInvalidOp) &&
+  FPO.getFPExceptionMode() != LangOptions::FPE_Ignore) {
 // There is no usefully definable result.
 Info.FFDiag(E);
 return false;



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 81921eb - [CodeGen] improve coverage for float (32-bit) type of NAN; NFC

2020-09-30 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-09-30T15:10:25-04:00
New Revision: 81921ebc430536ae5718da70a54328c790c8ae19

URL: 
https://github.com/llvm/llvm-project/commit/81921ebc430536ae5718da70a54328c790c8ae19
DIFF: 
https://github.com/llvm/llvm-project/commit/81921ebc430536ae5718da70a54328c790c8ae19.diff

LOG: [CodeGen] improve coverage for float (32-bit) type of NAN; NFC

Goes with D88238

Added: 


Modified: 
clang/test/CodeGen/builtin-nan-exception.c

Removed: 




diff  --git a/clang/test/CodeGen/builtin-nan-exception.c 
b/clang/test/CodeGen/builtin-nan-exception.c
index 2acf0c4390ec..a0de25e52ebe 100644
--- a/clang/test/CodeGen/builtin-nan-exception.c
+++ b/clang/test/CodeGen/builtin-nan-exception.c
@@ -5,18 +5,28 @@
 
 // Run a variety of targets to ensure there's no target-based difference.
 
-// The builtin always produces a 64-bit (double).
 // An SNaN with no payload is formed by setting the bit after
 // the quiet bit (MSB of the significand).
 
 // CHECK: float 0x7FF8, float 0x7FF4
-// CHECK: double 0x7FF8, double 0x7FF4
 
 float f[] = {
+  __builtin_nanf(""),
+  __builtin_nansf(""),
+};
+
+
+// Doubles are created and converted to floats.
+
+// CHECK: float 0x7FF8, float 0x7FF4
+
+float converted_to_float[] = {
   __builtin_nan(""),
   __builtin_nans(""),
 };
 
+// CHECK: double 0x7FF8, double 0x7FF4
+
 double d[] = {
   __builtin_nan(""),
   __builtin_nans(""),





[clang] 187686b - [CodeGen] add test for NAN creation; NFC

2020-09-30 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-09-30T13:22:12-04:00
New Revision: 187686bea3878c0bf2b150d784e7eab223434e25

URL: 
https://github.com/llvm/llvm-project/commit/187686bea3878c0bf2b150d784e7eab223434e25
DIFF: 
https://github.com/llvm/llvm-project/commit/187686bea3878c0bf2b150d784e7eab223434e25.diff

LOG: [CodeGen] add test for NAN creation; NFC

This goes with the APFloat change proposed in
D88238.
This is copied from the MIPS-specific test in
builtin-nan-legacy.c to verify that the normal
behavior is correct on other targets without the
complication of an inverted quiet bit.

Added: 
clang/test/CodeGen/builtin-nan-exception.c

Modified: 


Removed: 




diff  --git a/clang/test/CodeGen/builtin-nan-exception.c 
b/clang/test/CodeGen/builtin-nan-exception.c
new file mode 100644
index ..2acf0c4390ec
--- /dev/null
+++ b/clang/test/CodeGen/builtin-nan-exception.c
@@ -0,0 +1,23 @@
+// RUN: %clang -target aarch64 -emit-llvm -S %s -o - | FileCheck %s
+// RUN: %clang -target lanai -emit-llvm -S %s -o - | FileCheck %s
+// RUN: %clang -target riscv64 -emit-llvm -S %s -o - | FileCheck %s
+// RUN: %clang -target x86_64 -emit-llvm -S %s -o - | FileCheck %s
+
+// Run a variety of targets to ensure there's no target-based difference.
+
+// The builtin always produces a 64-bit (double).
+// An SNaN with no payload is formed by setting the bit after
+// the quiet bit (MSB of the significand).
+
+// CHECK: float 0x7FF8, float 0x7FF4
+// CHECK: double 0x7FF8, double 0x7FF4
+
+float f[] = {
+  __builtin_nan(""),
+  __builtin_nans(""),
+};
+
+double d[] = {
+  __builtin_nan(""),
+  __builtin_nans(""),
+};





[clang] c0303e5 - [CodeGen] remove instnamer dependency from test file; NFC

2020-06-01 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-06-01T10:21:17-04:00
New Revision: c0303e5391f65dbad3a6f1dbfa5ac9c9a83fa6c0

URL: 
https://github.com/llvm/llvm-project/commit/c0303e5391f65dbad3a6f1dbfa5ac9c9a83fa6c0
DIFF: 
https://github.com/llvm/llvm-project/commit/c0303e5391f65dbad3a6f1dbfa5ac9c9a83fa6c0.diff

LOG: [CodeGen] remove instnamer dependency from test file; NFC

This file was originally added without instnamer at:
rL283716 / fe2b9b4fbf860e3dc7da7705f548bc8d7b6ab9c1

But that was reverted and the test file reappeared with instnamer at:
rL285688 / 62f516f5906f967179610a73e4cc1d852b908bbd

I'm not seeing any difference locally from checking nameless values,
so I'm trying to remove a layering violation and see if that can
survive the build bots.

Added: 


Modified: 
clang/test/CodeGen/x86-inline-asm-v-constraint.c

Removed: 




diff  --git a/clang/test/CodeGen/x86-inline-asm-v-constraint.c 
b/clang/test/CodeGen/x86-inline-asm-v-constraint.c
index 215cccfa443e..b75a84d7a7bc 100644
--- a/clang/test/CodeGen/x86-inline-asm-v-constraint.c
+++ b/clang/test/CodeGen/x86-inline-asm-v-constraint.c
@@ -1,19 +1,19 @@
-// RUN: %clang_cc1 %s -triple x86_64-unknown-linux-gnu -emit-llvm -target-cpu 
x86-64 -o - |opt -instnamer -S |FileCheck %s --check-prefix SSE
-// RUN: %clang_cc1 %s -triple x86_64-unknown-linux-gnu -emit-llvm -target-cpu 
skylake -D AVX -o -|opt -instnamer -S  | FileCheck %s --check-prefixes AVX,SSE
-// RUN: %clang_cc1 %s -triple x86_64-unknown-linux-gnu -emit-llvm -target-cpu 
skylake-avx512 -D AVX512 -D AVX -o -|opt -instnamer -S  | FileCheck %s 
--check-prefixes AVX512,AVX,SSE
-// RUN: %clang_cc1 %s -triple x86_64-unknown-linux-gnu -emit-llvm -target-cpu 
knl -D AVX -D AVX512 -o - |opt -instnamer -S  | FileCheck %s --check-prefixes 
AVX512,AVX,SSE
+// RUN: %clang_cc1 %s -triple x86_64-unknown-linux-gnu -emit-llvm -target-cpu 
x86-64 -o - |FileCheck %s --check-prefix SSE
+// RUN: %clang_cc1 %s -triple x86_64-unknown-linux-gnu -emit-llvm -target-cpu 
skylake -D AVX -o - | FileCheck %s --check-prefixes AVX,SSE
+// RUN: %clang_cc1 %s -triple x86_64-unknown-linux-gnu -emit-llvm -target-cpu 
skylake-avx512 -D AVX512 -D AVX -o - | FileCheck %s --check-prefixes 
AVX512,AVX,SSE
+// RUN: %clang_cc1 %s -triple x86_64-unknown-linux-gnu -emit-llvm -target-cpu 
knl -D AVX -D AVX512 -o - | FileCheck %s --check-prefixes AVX512,AVX,SSE
 
 typedef float __m128 __attribute__ ((vector_size (16)));
 typedef float __m256 __attribute__ ((vector_size (32)));
 typedef float __m512 __attribute__ ((vector_size (64)));
 
-// SSE: call <4 x float> asm "vmovhlps $1, $2, $0", 
"=v,v,v,~{dirflag},~{fpsr},~{flags}"(i64 %tmp, <4 x float> %tmp1)
+// SSE: call <4 x float> asm "vmovhlps $1, $2, $0", 
"=v,v,v,~{dirflag},~{fpsr},~{flags}"(i64 %0, <4 x float> %1)
 __m128 testXMM(__m128 _xmm0, long _l) {
   __asm__("vmovhlps %1, %2, %0" :"=v"(_xmm0) : "v"(_l), "v"(_xmm0));
   return _xmm0;
 }
 
-// AVX: call <8 x float> asm "vmovsldup $1, $0", 
"=v,v,~{dirflag},~{fpsr},~{flags}"(<8 x float> %tmp)
+// AVX: call <8 x float> asm "vmovsldup $1, $0", 
"=v,v,~{dirflag},~{fpsr},~{flags}"(<8 x float> %0)
 __m256 testYMM(__m256 _ymm0) {
 #ifdef AVX
   __asm__("vmovsldup %1, %0" :"=v"(_ymm0) : "v"(_ymm0));
@@ -21,7 +21,7 @@ __m256 testYMM(__m256 _ymm0) {
   return _ymm0;
 }
 
-// AVX512: call <16 x float> asm "vpternlogd $$0, $1, $2, $0", 
"=v,v,v,~{dirflag},~{fpsr},~{flags}"(<16 x float> %tmp, <16 x float> %tmp1)
+// AVX512: call <16 x float> asm "vpternlogd $$0, $1, $2, $0", 
"=v,v,v,~{dirflag},~{fpsr},~{flags}"(<16 x float> %0, <16 x float> %1)
 __m512 testZMM(__m512 _zmm0, __m512 _zmm1) {
 #ifdef AVX512
   __asm__("vpternlogd $0, %1, %2, %0" :"=v"(_zmm0) : "v"(_zmm1), "v"(_zmm0));





[clang] e5b8772 - [utils] change default nameless value to "TMP"

2020-06-01 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-06-01T06:54:45-04:00
New Revision: e5b8772756737e41cb1e8ee1a5a33cb3d8a25be6

URL: 
https://github.com/llvm/llvm-project/commit/e5b8772756737e41cb1e8ee1a5a33cb3d8a25be6
DIFF: 
https://github.com/llvm/llvm-project/commit/e5b8772756737e41cb1e8ee1a5a33cb3d8a25be6.diff

LOG: [utils] change default nameless value to "TMP"

This is effectively reverting rGbfdc2552664d to avoid test churn
while we figure out a better way forward.

We at least salvage the warning on name conflict from that patch
though.

If we change the default string again, we may want to mass update
tests at the same time. Alternatively, we could live with the poor
naming if we change -instnamer.

This also adds a test to LLVM as suggested in the post-commit
review. There's a clang test that is also affected. That seems
like a layering violation, but I have not looked at fixing that yet.

Differential Revision: https://reviews.llvm.org/D80584

Added: 


Modified: 
clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected

clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll
llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.expected

llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/basic.ll.funcsig.expected
llvm/utils/UpdateTestChecks/common.py

Removed: 




diff  --git 
a/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected 
b/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected
index 6ea154286c15..d6ba7ae09b62 100644
--- a/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected
+++ b/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected
@@ -8,10 +8,10 @@
 // CHECK-NEXT:[[B_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:store i64 [[A:%.*]], i64* [[A_ADDR]], align 8
 // CHECK-NEXT:store i32 [[B:%.*]], i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[NAMELESS0:%.*]] = load i64, i64* [[A_ADDR]], align 8
-// CHECK-NEXT:[[NAMELESS1:%.*]] = load i32, i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[NAMELESS1]] to i64
-// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[NAMELESS0]], [[CONV]]
+// CHECK-NEXT:[[TMP0:%.*]] = load i64, i64* [[A_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[B_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[TMP0]], [[CONV]]
 // CHECK-NEXT:ret i64 [[ADD]]
 //
 long test(long a, int b) {
@@ -27,12 +27,12 @@ long test(long a, int b) {
 // CHECK-NEXT:store i64 [[A:%.*]], i64* [[A_ADDR]], align 8
 // CHECK-NEXT:store i32 [[B:%.*]], i32* [[B_ADDR]], align 4
 // CHECK-NEXT:store i32 [[C:%.*]], i32* [[C_ADDR]], align 4
-// CHECK-NEXT:[[NAMELESS0:%.*]] = load i64, i64* [[A_ADDR]], align 8
-// CHECK-NEXT:[[NAMELESS1:%.*]] = load i32, i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[NAMELESS1]] to i64
-// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[NAMELESS0]], [[CONV]]
-// CHECK-NEXT:[[NAMELESS2:%.*]] = load i32, i32* [[C_ADDR]], align 4
-// CHECK-NEXT:[[CONV1:%.*]] = sext i32 [[NAMELESS2]] to i64
+// CHECK-NEXT:[[TMP0:%.*]] = load i64, i64* [[A_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[B_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[TMP0]], [[CONV]]
+// CHECK-NEXT:[[TMP2:%.*]] = load i32, i32* [[C_ADDR]], align 4
+// CHECK-NEXT:[[CONV1:%.*]] = sext i32 [[TMP2]] to i64
 // CHECK-NEXT:[[ADD2:%.*]] = add nsw i64 [[ADD]], [[CONV1]]
 // CHECK-NEXT:ret i64 [[ADD2]]
 //

diff  --git 
a/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
 
b/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
index dbe1296182aa..005b2f242747 100644
--- 
a/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
+++ 
b/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
@@ -9,10 +9,10 @@
 // CHECK-NEXT:[[B_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:store i64 [[A]], i64* [[A_ADDR]], align 8
 // CHECK-NEXT:store i32 [[B]], i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[NAMELESS0:%.*]] = load i64, i64* [[A_ADDR]], align 8
-// CHECK-NEXT:[[NAMELESS1:%.*]] = load i32, i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[NAMELESS1]] to i64
-// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[NAMELESS0]], [[CONV]]
+// CHECK-NEXT:[[TMP0:%.*]] = load i64, i64* [[A_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[B_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[TMP0]], [[CONV]]
 // CHECK-NEXT:ret i64 [

[clang] dfbfdc9 - [utils] update expected strings in tests; NFC

2020-05-31 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-05-31T11:07:22-04:00
New Revision: dfbfdc96f9e15be40c938cde9b159afd028bf4a2

URL: 
https://github.com/llvm/llvm-project/commit/dfbfdc96f9e15be40c938cde9b159afd028bf4a2
DIFF: 
https://github.com/llvm/llvm-project/commit/dfbfdc96f9e15be40c938cde9b159afd028bf4a2.diff

LOG: [utils] update expected strings in tests; NFC

The script was changed with:
https://github.com/llvm/llvm-project/commit/bfdc2552664d6f0bb332a9c6a115877020f3c1df

Added: 


Modified: 
clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected

clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected

Removed: 




diff  --git 
a/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected 
b/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected
index d6ba7ae09b62..6ea154286c15 100644
--- a/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected
+++ b/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.expected
@@ -8,10 +8,10 @@
 // CHECK-NEXT:[[B_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:store i64 [[A:%.*]], i64* [[A_ADDR]], align 8
 // CHECK-NEXT:store i32 [[B:%.*]], i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[TMP0:%.*]] = load i64, i64* [[A_ADDR]], align 8
-// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP1]] to i64
-// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[TMP0]], [[CONV]]
+// CHECK-NEXT:[[NAMELESS0:%.*]] = load i64, i64* [[A_ADDR]], align 8
+// CHECK-NEXT:[[NAMELESS1:%.*]] = load i32, i32* [[B_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[NAMELESS1]] to i64
+// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[NAMELESS0]], [[CONV]]
 // CHECK-NEXT:ret i64 [[ADD]]
 //
 long test(long a, int b) {
@@ -27,12 +27,12 @@ long test(long a, int b) {
 // CHECK-NEXT:store i64 [[A:%.*]], i64* [[A_ADDR]], align 8
 // CHECK-NEXT:store i32 [[B:%.*]], i32* [[B_ADDR]], align 4
 // CHECK-NEXT:store i32 [[C:%.*]], i32* [[C_ADDR]], align 4
-// CHECK-NEXT:[[TMP0:%.*]] = load i64, i64* [[A_ADDR]], align 8
-// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP1]] to i64
-// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[TMP0]], [[CONV]]
-// CHECK-NEXT:[[TMP2:%.*]] = load i32, i32* [[C_ADDR]], align 4
-// CHECK-NEXT:[[CONV1:%.*]] = sext i32 [[TMP2]] to i64
+// CHECK-NEXT:[[NAMELESS0:%.*]] = load i64, i64* [[A_ADDR]], align 8
+// CHECK-NEXT:[[NAMELESS1:%.*]] = load i32, i32* [[B_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[NAMELESS1]] to i64
+// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[NAMELESS0]], [[CONV]]
+// CHECK-NEXT:[[NAMELESS2:%.*]] = load i32, i32* [[C_ADDR]], align 4
+// CHECK-NEXT:[[CONV1:%.*]] = sext i32 [[NAMELESS2]] to i64
 // CHECK-NEXT:[[ADD2:%.*]] = add nsw i64 [[ADD]], [[CONV1]]
 // CHECK-NEXT:ret i64 [[ADD2]]
 //

diff  --git 
a/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
 
b/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
index 005b2f242747..dbe1296182aa 100644
--- 
a/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
+++ 
b/clang/test/utils/update_cc_test_checks/Inputs/mangled_names.c.funcsig.expected
@@ -9,10 +9,10 @@
 // CHECK-NEXT:[[B_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:store i64 [[A]], i64* [[A_ADDR]], align 8
 // CHECK-NEXT:store i32 [[B]], i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[TMP0:%.*]] = load i64, i64* [[A_ADDR]], align 8
-// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP1]] to i64
-// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[TMP0]], [[CONV]]
+// CHECK-NEXT:[[NAMELESS0:%.*]] = load i64, i64* [[A_ADDR]], align 8
+// CHECK-NEXT:[[NAMELESS1:%.*]] = load i32, i32* [[B_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[NAMELESS1]] to i64
+// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[NAMELESS0]], [[CONV]]
 // CHECK-NEXT:ret i64 [[ADD]]
 //
 long test(long a, int b) {
@@ -29,12 +29,12 @@ long test(long a, int b) {
 // CHECK-NEXT:store i64 [[A]], i64* [[A_ADDR]], align 8
 // CHECK-NEXT:store i32 [[B]], i32* [[B_ADDR]], align 4
 // CHECK-NEXT:store i32 [[C]], i32* [[C_ADDR]], align 4
-// CHECK-NEXT:[[TMP0:%.*]] = load i64, i64* [[A_ADDR]], align 8
-// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[B_ADDR]], align 4
-// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP1]] to i64
-// CHECK-NEXT:[[ADD:%.*]] = add nsw i64 [[TMP0]], [[CONV]]
-// CHECK-NEXT:[[TMP2:%.*]] = load i32, i32* [[C_ADDR]], align 4
-// CHECK-NEXT:[[CONV1:%.*]] = sext i32 [[TMP2]] to i64
+// CHECK-NEXT:[[NAMELESS0:%.*]] = load i64, i64* [[A_ADDR]], align 8
+// CHECK-NEXT:[[NAMELESS1:%.*]] 

[clang] d02b3ab - [CodeGen] fix test to be (mostly) independent of LLVM optimizer; NFC

2020-05-10 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-05-10T11:25:37-04:00
New Revision: d02b3aba37d9a18691669392ff26ec28b51741f5

URL: 
https://github.com/llvm/llvm-project/commit/d02b3aba37d9a18691669392ff26ec28b51741f5
DIFF: 
https://github.com/llvm/llvm-project/commit/d02b3aba37d9a18691669392ff26ec28b51741f5.diff

LOG: [CodeGen] fix test to be (mostly) independent of LLVM optimizer; NFC

This test would break with the proposed change to IR canonicalization
in D79171.

The test tried to do the right thing by only using -mem2reg with opt,
but it was using -O3 before that step, so the opt part was meaningless.

Added: 


Modified: 
clang/test/CodeGen/arm-mve-intrinsics/cplusplus.cpp

Removed: 




diff  --git a/clang/test/CodeGen/arm-mve-intrinsics/cplusplus.cpp 
b/clang/test/CodeGen/arm-mve-intrinsics/cplusplus.cpp
index f0455eb31e84..77862b9f49cf 100644
--- a/clang/test/CodeGen/arm-mve-intrinsics/cplusplus.cpp
+++ b/clang/test/CodeGen/arm-mve-intrinsics/cplusplus.cpp
@@ -1,6 +1,6 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
-// RUN: %clang_cc1 -triple thumbv8.1m.main-none-none-eabi -target-feature 
+mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O3 
-disable-O0-optnone -S -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
-// RUN: %clang_cc1 -triple thumbv8.1m.main-none-none-eabi -target-feature 
+mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O3 
-disable-O0-optnone -DPOLYMORPHIC -S -emit-llvm -o - %s | opt -S -mem2reg | 
FileCheck %s
+// RUN: %clang_cc1 -triple thumbv8.1m.main-none-none-eabi -target-feature 
+mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -disable-O0-optnone 
-S -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
+// RUN: %clang_cc1 -triple thumbv8.1m.main-none-none-eabi -target-feature 
+mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -disable-O0-optnone 
-DPOLYMORPHIC -S -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
 
 #include 
 
@@ -63,7 +63,7 @@ uint16x8_t test_vorrq_n_u16(uint16x8_t a)
 // CHECK-LABEL: @_Z16test_vcmpeqq_f1619__simd128_float16_tS_(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = fcmp oeq <8 x half> [[A:%.*]], [[B:%.*]]
-// CHECK-NEXT:[[TMP1:%.*]] = tail call i32 @llvm.arm.mve.pred.v2i.v8i1(<8 
x i1> [[TMP0]]), !range !3
+// CHECK-NEXT:[[TMP1:%.*]] = call i32 @llvm.arm.mve.pred.v2i.v8i1(<8 x i1> 
[[TMP0]])
 // CHECK-NEXT:[[TMP2:%.*]] = trunc i32 [[TMP1]] to i16
 // CHECK-NEXT:ret i16 [[TMP2]]
 //
@@ -78,13 +78,17 @@ mve_pred16_t test_vcmpeqq_f16(float16x8_t a, float16x8_t b)
 
 // CHECK-LABEL: @_Z18test_vcmpeqq_n_f1619__simd128_float16_tDh(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = bitcast float [[B_COERCE:%.*]] to i32
-// CHECK-NEXT:[[TMP_0_EXTRACT_TRUNC:%.*]] = trunc i32 [[TMP0]] to i16
-// CHECK-NEXT:[[TMP1:%.*]] = bitcast i16 [[TMP_0_EXTRACT_TRUNC]] to half
-// CHECK-NEXT:[[DOTSPLATINSERT:%.*]] = insertelement <8 x half> undef, 
half [[TMP1]], i32 0
+// CHECK-NEXT:[[B:%.*]] = alloca half, align 2
+// CHECK-NEXT:[[TMP:%.*]] = alloca float, align 4
+// CHECK-NEXT:store float [[B_COERCE:%.*]], float* [[TMP]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast half* [[B]] to i8*
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast float* [[TMP]] to i8*
+// CHECK-NEXT:call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 2 [[TMP0]], 
i8* align 4 [[TMP1]], i32 2, i1 false)
+// CHECK-NEXT:[[B1:%.*]] = load half, half* [[B]], align 2
+// CHECK-NEXT:[[DOTSPLATINSERT:%.*]] = insertelement <8 x half> undef, 
half [[B1]], i32 0
 // CHECK-NEXT:[[DOTSPLAT:%.*]] = shufflevector <8 x half> 
[[DOTSPLATINSERT]], <8 x half> undef, <8 x i32> zeroinitializer
-// CHECK-NEXT:[[TMP2:%.*]] = fcmp oeq <8 x half> [[DOTSPLAT]], [[A:%.*]]
-// CHECK-NEXT:[[TMP3:%.*]] = tail call i32 @llvm.arm.mve.pred.v2i.v8i1(<8 
x i1> [[TMP2]]), !range !3
+// CHECK-NEXT:[[TMP2:%.*]] = fcmp oeq <8 x half> [[A:%.*]], [[DOTSPLAT]]
+// CHECK-NEXT:[[TMP3:%.*]] = call i32 @llvm.arm.mve.pred.v2i.v8i1(<8 x i1> 
[[TMP2]])
 // CHECK-NEXT:[[TMP4:%.*]] = trunc i32 [[TMP3]] to i16
 // CHECK-NEXT:ret i16 [[TMP4]]
 //
@@ -116,8 +120,8 @@ uint16x8_t test_vld1q_u16(const uint16_t *base)
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast i32* [[BASE:%.*]] to <4 x i32>*
 // CHECK-NEXT:[[TMP1:%.*]] = zext i16 [[P:%.*]] to i32
-// CHECK-NEXT:[[TMP2:%.*]] = tail call <4 x i1> 
@llvm.arm.mve.pred.i2v.v4i1(i32 [[TMP1]])
-// CHECK-NEXT:tail call void @llvm.masked.store.v4i32.p0v4i32(<4 x i32> 
[[VALUE:%.*]], <4 x i32>* [[TMP0]], i32 4, <4 x i1> [[TMP2]])
+// CHECK-NEXT:[[TMP2:%.*]] = call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 
[[TMP1]])
+// CHECK-NEXT:call void @llvm.masked.store.v4i32.p0v4i32(<4 x i32> 
[[VALUE:%.*]], <4 x i32>* [[TMP0]], i32 4, <4 x i1> [[TMP2]])
 // CHECK-NEXT:ret void
 //
 void test_vst1q_p_s32(int32_t *base, int32x4_t v

[clang] bcc5ed7 - [CodeGen] fix test to be (mostly) independent of LLVM optimizer; NFC

2020-05-10 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-05-10T11:19:43-04:00
New Revision: bcc5ed7b24e921c8902d0d0db614576bd249f128

URL: 
https://github.com/llvm/llvm-project/commit/bcc5ed7b24e921c8902d0d0db614576bd249f128
DIFF: 
https://github.com/llvm/llvm-project/commit/bcc5ed7b24e921c8902d0d0db614576bd249f128.diff

LOG: [CodeGen] fix test to be (mostly) independent of LLVM optimizer; NFC

This test would break with the proposed change to IR canonicalization
in D79171. The raw unoptimized IR from clang is massive, so I've
replaced -instcombine with -mem2reg to make it more manageable,
while remaining unlikely to break with unrelated changes to optimization.

Added: 


Modified: 
clang/test/CodeGen/aarch64-neon-fp16fml.c

Removed: 




diff  --git a/clang/test/CodeGen/aarch64-neon-fp16fml.c 
b/clang/test/CodeGen/aarch64-neon-fp16fml.c
index 3436d8b212ef..3a96692edc88 100644
--- a/clang/test/CodeGen/aarch64-neon-fp16fml.c
+++ b/clang/test/CodeGen/aarch64-neon-fp16fml.c
@@ -1,5 +1,6 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple arm64-none-linux-gnu -target-feature +v8.2a 
-target-feature +neon -target-feature +fp16fml \
-// RUN: -fallow-half-arguments-and-returns -disable-O0-optnone -emit-llvm -o - 
%s | opt -S -instcombine | FileCheck %s
+// RUN: -fallow-half-arguments-and-returns -disable-O0-optnone -emit-llvm -o - 
%s | opt -S -mem2reg | FileCheck %s
 
 // REQUIRES: aarch64-registered-target
 
@@ -9,188 +10,1252 @@
 
 // Vector form
 
+// CHECK-LABEL: @test_vfmlal_low_f16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x float> [[A:%.*]] to <8 x i8>
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <4 x half> [[B:%.*]] to <8 x i8>
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast <4 x half> [[C:%.*]] to <8 x i8>
+// CHECK-NEXT:[[VFMLAL_LOW3_I:%.*]] = call <2 x float> 
@llvm.aarch64.neon.fmlal.v2f32.v4f16(<2 x float> [[A]], <4 x half> [[B]], <4 x 
half> [[C]]) #3
+// CHECK-NEXT:ret <2 x float> [[VFMLAL_LOW3_I]]
+//
 float32x2_t test_vfmlal_low_f16(float32x2_t a, float16x4_t b, float16x4_t c) {
-// CHECK-LABEL: define <2 x float> @test_vfmlal_low_f16(<2 x float> %a, <4 x 
half> %b, <4 x half> %c)
-// CHECK: [[RESULT:%.*]] = call <2 x float> 
@llvm.aarch64.neon.fmlal.v2f32.v4f16(<2 x float> %a, <4 x half> %b, <4 x half> 
%c)
-// CHECK: ret <2 x float> [[RESULT]]
   return vfmlal_low_f16(a, b, c);
 }
 
+// CHECK-LABEL: @test_vfmlsl_low_f16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x float> [[A:%.*]] to <8 x i8>
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <4 x half> [[B:%.*]] to <8 x i8>
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast <4 x half> [[C:%.*]] to <8 x i8>
+// CHECK-NEXT:[[VFMLSL_LOW3_I:%.*]] = call <2 x float> 
@llvm.aarch64.neon.fmlsl.v2f32.v4f16(<2 x float> [[A]], <4 x half> [[B]], <4 x 
half> [[C]]) #3
+// CHECK-NEXT:ret <2 x float> [[VFMLSL_LOW3_I]]
+//
 float32x2_t test_vfmlsl_low_f16(float32x2_t a, float16x4_t b, float16x4_t c) {
-// CHECK-LABEL: define <2 x float> @test_vfmlsl_low_f16(<2 x float> %a, <4 x 
half> %b, <4 x half> %c)
-// CHECK: [[RESULT:%.*]] = call <2 x float> 
@llvm.aarch64.neon.fmlsl.v2f32.v4f16(<2 x float> %a, <4 x half> %b, <4 x half> 
%c)
-// CHECK: ret <2 x float> [[RESULT]]
   return vfmlsl_low_f16(a, b, c);
 }
 
+// CHECK-LABEL: @test_vfmlal_high_f16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x float> [[A:%.*]] to <8 x i8>
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <4 x half> [[B:%.*]] to <8 x i8>
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast <4 x half> [[C:%.*]] to <8 x i8>
+// CHECK-NEXT:[[VFMLAL_HIGH3_I:%.*]] = call <2 x float> 
@llvm.aarch64.neon.fmlal2.v2f32.v4f16(<2 x float> [[A]], <4 x half> [[B]], <4 x 
half> [[C]]) #3
+// CHECK-NEXT:ret <2 x float> [[VFMLAL_HIGH3_I]]
+//
 float32x2_t test_vfmlal_high_f16(float32x2_t a, float16x4_t b, float16x4_t c) {
-// CHECK-LABEL: define <2 x float> @test_vfmlal_high_f16(<2 x float> %a, <4 x 
half> %b, <4 x half> %c)
-// CHECK: [[RESULT:%.*]] = call <2 x float> 
@llvm.aarch64.neon.fmlal2.v2f32.v4f16(<2 x float> %a, <4 x half> %b, <4 x half> 
%c)
-// CHECK: ret <2 x float> [[RESULT]]
   return vfmlal_high_f16(a, b, c);
 }
 
+// CHECK-LABEL: @test_vfmlsl_high_f16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x float> [[A:%.*]] to <8 x i8>
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <4 x half> [[B:%.*]] to <8 x i8>
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast <4 x half> [[C:%.*]] to <8 x i8>
+// CHECK-NEXT:[[VFMLSL_HIGH3_I:%.*]] = call <2 x float> 
@llvm.aarch64.neon.fmlsl2.v2f32.v4f16(<2 x float> [[A]], <4 x half> [[B]], <4 x 
half> [[C]]) #3
+// CHECK-NEXT:ret <2 x float> [[VFMLSL_HIGH3_I]]
+//
 float32x2_t test_vfmlsl_high_f16(float32x2_t a, float16x4_t b, float16x4_t c) {
-// CHECK-LABEL: define <2 x float> @test_vfmlsl_high_f16(<2 x float> %a, <4 x 
half> %b, <4 x half> %c)
-// CHECK: [[RESULT:

Re: [clang] 83f4372 - [CodeGen] fix clang test that runs the optimizer pipeline; NFC

2020-03-02 Thread Sanjay Patel via cfe-commits
https://reviews.llvm.org/rG8cdcbcaa02e7
https://reviews.llvm.org/rG1e308452bf68

On Thu, Feb 27, 2020 at 6:29 PM Eric Christopher  wrote:

> Sure. That sounds great. Thanks!
>
> On Wed, Feb 26, 2020 at 10:45 AM Sanjay Patel 
> wrote:
>
>> To be clear - the test is checking IR instructions, but it's checking -O1
>> IR for various targets.
>> So there must be different expectations per target...
>> But I just tried a test of turning everything down to -O0, and it all
>> passed except for the "fast-math" run for AArch64.
>> I can tweak that to not be so specific if that sounds like a reasonable
>> solution.
>>
>> On Wed, Feb 26, 2020 at 1:05 PM Eric Christopher 
>> wrote:
>>
>>> I mean anything that's testing assembly output out of clang is less than
>>> ideal. There are some circumstances, but this doesn't seem like one of
>>> them.
>>>
>>> On Wed, Feb 26, 2020, 9:10 AM Sanjay Patel 
>>> wrote:
>>>
>>>> The test file dates back to:
>>>> https://reviews.llvm.org/D5698
>>>> ...and I'm not familiar with _Complex enough to say how to fix this
>>>> properly (seems like the check lines are already limited such that -O0
>>>> rather than -O1 would work?).
>>>>
>>>> But this file keeps wiggling unexpectedly; it's going to move again
>>>> with https://reviews.llvm.org/D75130
>>>>
>>>> On Tue, Feb 25, 2020 at 1:15 PM Eric Christopher 
>>>> wrote:
>>>>
>>>>> Is there any way to pull this test out of clang and as an opt test?
>>>>> What's it trying to test?
>>>>>
>>>>> -eric
>>>>>
>>>>> On Tue, Feb 25, 2020 at 6:15 AM Sanjay Patel via cfe-commits <
>>>>> cfe-commits@lists.llvm.org> wrote:
>>>>>
>>>>>>
>>>>>> Author: Sanjay Patel
>>>>>> Date: 2020-02-25T09:13:49-05:00
>>>>>> New Revision: 83f4372f3a708ceaa800feff8b1bd92ae2c3be5f
>>>>>>
>>>>>> URL:
>>>>>> https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f
>>>>>> DIFF:
>>>>>> https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f.diff
>>>>>>
>>>>>> LOG: [CodeGen] fix clang test that runs the optimizer pipeline; NFC
>>>>>>
>>>>>> There's already a FIXME note on this file; it can break when the
>>>>>> underlying LLVM behavior changes independently of anything in clang.
>>>>>>
>>>>>> Added:
>>>>>>
>>>>>>
>>>>>> Modified:
>>>>>> clang/test/CodeGen/complex-math.c
>>>>>>
>>>>>> Removed:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 
>>>>>> diff  --git a/clang/test/CodeGen/complex-math.c
>>>>>> b/clang/test/CodeGen/complex-math.c
>>>>>> index e42418ad72c2..54dee473a364 100644
>>>>>> --- a/clang/test/CodeGen/complex-math.c
>>>>>> +++ b/clang/test/CodeGen/complex-math.c
>>>>>> @@ -93,14 +93,15 @@ float _Complex mul_float_rc(float a, float
>>>>>> _Complex b) {
>>>>>>// X86: ret
>>>>>>return a * b;
>>>>>>  }
>>>>>> +
>>>>>>  float _Complex mul_float_cc(float _Complex a, float _Complex b) {
>>>>>>// X86-LABEL: @mul_float_cc(
>>>>>>// X86: %[[AC:[^ ]+]] = fmul
>>>>>>// X86: %[[BD:[^ ]+]] = fmul
>>>>>>// X86: %[[AD:[^ ]+]] = fmul
>>>>>>// X86: %[[BC:[^ ]+]] = fmul
>>>>>> -  // X86: %[[RR:[^ ]+]] = fsub float %[[AC]], %[[BD]]
>>>>>> -  // X86: %[[RI:[^ ]+]] = fadd float
>>>>>> +  // X86: %[[RR:[^ ]+]] = fsub
>>>>>> +  // X86: %[[RI:[^ ]+]] = fadd
>>>>>>// X86-DAG: %[[AD]]
>>>>>>// X86-DAG: ,
>>>>>>// X86-DAG: %[[BC]]
>>>>>>
>>>>>>
>>>>>>
>>>>>> ___
>>>>>> cfe-commits mailing list
>>>>>> cfe-commits@lists.llvm.org
>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>>>>>
>>>>>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 1e30845 - [CodeGen] avoid running the entire optimizer pipeline in clang test file; NFC

2020-03-02 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-03-02T09:47:32-05:00
New Revision: 1e308452bf68b9576a76004de28307f0318ef9eb

URL: 
https://github.com/llvm/llvm-project/commit/1e308452bf68b9576a76004de28307f0318ef9eb
DIFF: 
https://github.com/llvm/llvm-project/commit/1e308452bf68b9576a76004de28307f0318ef9eb.diff

LOG: [CodeGen] avoid running the entire optimizer pipeline in clang test file; 
NFC

I'm making the CHECK lines vague enough that they pass at -O0.
If that is too vague (we really want to check the data flow
to verify that the variables are not mismatched, etc), then
we can adjust those lines again to more closely match the output
at -O0 rather than -O1.

This change is based on the post-commit comments for:
https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20200224/307888.html

Added: 


Modified: 
clang/test/CodeGen/complex-math.c

Removed: 




diff  --git a/clang/test/CodeGen/complex-math.c 
b/clang/test/CodeGen/complex-math.c
index 22a9e287f07f..4d3869b085c6 100644
--- a/clang/test/CodeGen/complex-math.c
+++ b/clang/test/CodeGen/complex-math.c
@@ -1,5 +1,3 @@
-// FIXME: This file should not be using -O1; that makes it depend on the 
entire LLVM IR optimizer.
-
 // RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple x86_64-unknown-unknown -o - | FileCheck %s --check-prefix=X86
 // RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple x86_64-pc-win64 -o - | FileCheck %s --check-prefix=X86
 // RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple i686-unknown-unknown -o - | FileCheck %s --check-prefix=X86
@@ -7,7 +5,7 @@
 // RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple armv7-none-linux-gnueabi -o - | FileCheck %s --check-prefix=ARM
 // RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple armv7-none-linux-gnueabihf -o - | FileCheck %s --check-prefix=ARMHF
 // RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple thumbv7k-apple-watchos2.0 -o - -target-abi aapcs16 | FileCheck %s 
--check-prefix=ARM7K
-// RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple aarch64-unknown-unknown -ffast-math -o - | FileCheck %s 
--check-prefix=AARCH64-FASTMATH
+// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple aarch64-unknown-unknown -ffast-math -o - | FileCheck %s 
--check-prefix=AARCH64-FASTMATH
 
 float _Complex add_float_rr(float a, float b) {
   // X86-LABEL: @add_float_rr(
@@ -137,23 +135,20 @@ float _Complex div_float_rc(float a, float _Complex b) {
   // AARCH64-FASTMATH-LABEL: @div_float_rc(float %a, [2 x float] %b.coerce)
   // A = a
   // B = 0
-  // AARCH64-FASTMATH: [[C:%.*]] = extractvalue [2 x float] %b.coerce, 0
-  // AARCH64-FASTMATH: [[D:%.*]] = extractvalue [2 x float] %b.coerce, 1
   //
-  // AARCH64-FASTMATH: [[AC:%.*]] = fmul fast float [[C]], %a
+  // AARCH64-FASTMATH: [[AC:%.*]] = fmul fast float
   // BD = 0
   // ACpBD = AC
   //
-  // AARCH64-FASTMATH: [[CC:%.*]] = fmul fast float [[C]], [[C]]
-  // AARCH64-FASTMATH: [[DD:%.*]] = fmul fast float [[D]], [[D]]
-  // AARCH64-FASTMATH: [[CCpDD:%.*]] = fadd fast float [[CC]], [[DD]]
+  // AARCH64-FASTMATH: [[CC:%.*]] = fmul fast float
+  // AARCH64-FASTMATH: [[DD:%.*]] = fmul fast float
+  // AARCH64-FASTMATH: [[CCpDD:%.*]] = fadd fast float
   //
   // BC = 0
-  // AARCH64-FASTMATH: [[NEGA:%.*]] = fneg fast float %a
-  // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast float  [[D]], [[NEGA]]
+  // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast float
   //
-  // AARCH64-FASTMATH: fdiv fast float [[AC]], [[CCpDD]]
-  // AARCH64-FASTMATH: fdiv fast float [[AD]], [[CCpDD]]
+  // AARCH64-FASTMATH: fdiv fast float
+  // AARCH64-FASTMATH: fdiv fast float
   // AARCH64-FASTMATH: ret
   return a / b;
 }
@@ -165,25 +160,21 @@ float _Complex div_float_cc(float _Complex a, float 
_Complex b) {
 
   // a / b = (A+iB) / (C+iD) = ((AC+BD)/(CC+DD)) + i((BC-AD)/(CC+DD))
   // AARCH64-FASTMATH-LABEL: @div_float_cc([2 x float] %a.coerce, [2 x float] 
%b.coerce)
-  // AARCH64-FASTMATH: [[A:%.*]] = extractvalue [2 x float] %a.coerce, 0
-  // AARCH64-FASTMATH: [[B:%.*]] = extractvalue [2 x float] %a.coerce, 1
-  // AARCH64-FASTMATH: [[C:%.*]] = extractvalue [2 x float] %b.coerce, 0
-  // AARCH64-FASTMATH: [[D:%.*]] = extractvalue [2 x float] %b.coerce, 1
   //
-  // AARCH64-FASTMATH: [[AC:%.*]] = fmul fast float [[C]], [[A]]
-  // AARCH64-FASTMATH: [[BD:%.*]] = fmul fast float [[D]], [[B]]
-  // AARCH64-FASTMATH: [[ACpBD:%.*]] = fadd fast float [[AC]], [[BD]]
+  // AARCH64-FASTMATH: [[AC:%.*]] = fmul fast float
+  // AARCH64-FASTMATH: [[BD:%.*]] = fmul fast float
+  // AARCH64-FASTMATH: [[ACpBD:%.*]] = fadd fast float
   //
-  // AARCH64-FASTMATH: [[CC:%.*]] = fmul fast float [[C]], [[C]]
-  // AARCH64-FA

[clang] 8cdcbca - [CodeGen] avoid running the entire optimizer pipeline in clang test file; NFC

2020-03-02 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-03-02T09:12:53-05:00
New Revision: 8cdcbcaa02e7055a6745f2c1bde003c47c91f79e

URL: 
https://github.com/llvm/llvm-project/commit/8cdcbcaa02e7055a6745f2c1bde003c47c91f79e
DIFF: 
https://github.com/llvm/llvm-project/commit/8cdcbcaa02e7055a6745f2c1bde003c47c91f79e.diff

LOG: [CodeGen] avoid running the entire optimizer pipeline in clang test file; 
NFC

There are no failures from the first set of RUN lines here,
so the CHECKs were already vague enough to not be affected
by optimizations. The final RUN line does induce some kind
of failure, so I'll try to fix that separately in a
follow-up.

Added: 


Modified: 
clang/test/CodeGen/complex-math.c

Removed: 




diff  --git a/clang/test/CodeGen/complex-math.c 
b/clang/test/CodeGen/complex-math.c
index 6f81ff2ff285..22a9e287f07f 100644
--- a/clang/test/CodeGen/complex-math.c
+++ b/clang/test/CodeGen/complex-math.c
@@ -1,12 +1,12 @@
 // FIXME: This file should not be using -O1; that makes it depend on the 
entire LLVM IR optimizer.
 
-// RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple x86_64-unknown-unknown -o - | FileCheck %s --check-prefix=X86
-// RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple x86_64-pc-win64 -o - | FileCheck %s --check-prefix=X86
-// RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple i686-unknown-unknown -o - | FileCheck %s --check-prefix=X86
-// RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple powerpc-unknown-unknown -o - | FileCheck %s --check-prefix=PPC
-// RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple armv7-none-linux-gnueabi -o - | FileCheck %s --check-prefix=ARM
-// RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple armv7-none-linux-gnueabihf -o - | FileCheck %s --check-prefix=ARMHF
-// RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple thumbv7k-apple-watchos2.0 -o - -target-abi aapcs16 | FileCheck %s 
--check-prefix=ARM7K
+// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple x86_64-unknown-unknown -o - | FileCheck %s --check-prefix=X86
+// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple x86_64-pc-win64 -o - | FileCheck %s --check-prefix=X86
+// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple i686-unknown-unknown -o - | FileCheck %s --check-prefix=X86
+// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple powerpc-unknown-unknown -o - | FileCheck %s --check-prefix=PPC
+// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple armv7-none-linux-gnueabi -o - | FileCheck %s --check-prefix=ARM
+// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple armv7-none-linux-gnueabihf -o - | FileCheck %s --check-prefix=ARMHF
+// RUN: %clang_cc1 %s -O0 -fno-experimental-new-pass-manager -emit-llvm 
-triple thumbv7k-apple-watchos2.0 -o - -target-abi aapcs16 | FileCheck %s 
--check-prefix=ARM7K
 // RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple aarch64-unknown-unknown -ffast-math -o - | FileCheck %s 
--check-prefix=AARCH64-FASTMATH
 
 float _Complex add_float_rr(float a, float b) {





Re: [clang] 83f4372 - [CodeGen] fix clang test that runs the optimizer pipeline; NFC

2020-02-26 Thread Sanjay Patel via cfe-commits
To be clear - the test is checking IR instructions, but it's checking -O1
IR for various targets.
So there must be different expectations per target...
But I just tried a test of turning everything down to -O0, and it all
passed except for the "fast-math" run for AArch64.
I can tweak that to not be so specific if that sounds like a reasonable
solution.

On Wed, Feb 26, 2020 at 1:05 PM Eric Christopher  wrote:

> I mean anything that's testing assembly output out of clang is less than
> ideal. There are some circumstances, but this doesn't seem like one of
> them.
>
> On Wed, Feb 26, 2020, 9:10 AM Sanjay Patel  wrote:
>
>> The test file dates back to:
>> https://reviews.llvm.org/D5698
>> ...and I'm not familiar with _Complex enough to say how to fix this
>> properly (seems like the check lines are already limited such that -O0
>> rather than -O1 would work?).
>>
>> But this file keeps wiggling unexpectedly; it's going to move again with
>> https://reviews.llvm.org/D75130
>>
>> On Tue, Feb 25, 2020 at 1:15 PM Eric Christopher 
>> wrote:
>>
>>> Is there any way to pull this test out of clang and as an opt test?
>>> What's it trying to test?
>>>
>>> -eric
>>>
>>> On Tue, Feb 25, 2020 at 6:15 AM Sanjay Patel via cfe-commits <
>>> cfe-commits@lists.llvm.org> wrote:
>>>
>>>>
>>>> Author: Sanjay Patel
>>>> Date: 2020-02-25T09:13:49-05:00
>>>> New Revision: 83f4372f3a708ceaa800feff8b1bd92ae2c3be5f
>>>>
>>>> URL:
>>>> https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f
>>>> DIFF:
>>>> https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f.diff
>>>>
>>>> LOG: [CodeGen] fix clang test that runs the optimizer pipeline; NFC
>>>>
>>>> There's already a FIXME note on this file; it can break when the
>>>> underlying LLVM behavior changes independently of anything in clang.
>>>>
>>>> Added:
>>>>
>>>>
>>>> Modified:
>>>> clang/test/CodeGen/complex-math.c
>>>>
>>>> Removed:
>>>>
>>>>
>>>>
>>>>
>>>> 
>>>> diff  --git a/clang/test/CodeGen/complex-math.c
>>>> b/clang/test/CodeGen/complex-math.c
>>>> index e42418ad72c2..54dee473a364 100644
>>>> --- a/clang/test/CodeGen/complex-math.c
>>>> +++ b/clang/test/CodeGen/complex-math.c
>>>> @@ -93,14 +93,15 @@ float _Complex mul_float_rc(float a, float _Complex
>>>> b) {
>>>>// X86: ret
>>>>return a * b;
>>>>  }
>>>> +
>>>>  float _Complex mul_float_cc(float _Complex a, float _Complex b) {
>>>>// X86-LABEL: @mul_float_cc(
>>>>// X86: %[[AC:[^ ]+]] = fmul
>>>>// X86: %[[BD:[^ ]+]] = fmul
>>>>// X86: %[[AD:[^ ]+]] = fmul
>>>>// X86: %[[BC:[^ ]+]] = fmul
>>>> -  // X86: %[[RR:[^ ]+]] = fsub float %[[AC]], %[[BD]]
>>>> -  // X86: %[[RI:[^ ]+]] = fadd float
>>>> +  // X86: %[[RR:[^ ]+]] = fsub
>>>> +  // X86: %[[RI:[^ ]+]] = fadd
>>>>// X86-DAG: %[[AD]]
>>>>// X86-DAG: ,
>>>>// X86-DAG: %[[BC]]
>>>>
>>>>
>>>>
>>>> ___
>>>> cfe-commits mailing list
>>>> cfe-commits@lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>>>
>>>


Re: [clang] 83f4372 - [CodeGen] fix clang test that runs the optimizer pipeline; NFC

2020-02-26 Thread Sanjay Patel via cfe-commits
The test file dates back to:
https://reviews.llvm.org/D5698
...and I'm not familiar with _Complex enough to say how to fix this
properly (seems like the check lines are already limited such that -O0
rather than -O1 would work?).

But this file keeps wiggling unexpectedly; it's going to move again with
https://reviews.llvm.org/D75130

On Tue, Feb 25, 2020 at 1:15 PM Eric Christopher  wrote:

> Is there any way to pull this test out of clang and as an opt test? What's
> it trying to test?
>
> -eric
>
> On Tue, Feb 25, 2020 at 6:15 AM Sanjay Patel via cfe-commits <
> cfe-commits@lists.llvm.org> wrote:
>
>>
>> Author: Sanjay Patel
>> Date: 2020-02-25T09:13:49-05:00
>> New Revision: 83f4372f3a708ceaa800feff8b1bd92ae2c3be5f
>>
>> URL:
>> https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f
>> DIFF:
>> https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f.diff
>>
>> LOG: [CodeGen] fix clang test that runs the optimizer pipeline; NFC
>>
>> There's already a FIXME note on this file; it can break when the
>> underlying LLVM behavior changes independently of anything in clang.
>>
>> Added:
>>
>>
>> Modified:
>> clang/test/CodeGen/complex-math.c
>>
>> Removed:
>>
>>
>>
>>
>> 
>> diff  --git a/clang/test/CodeGen/complex-math.c
>> b/clang/test/CodeGen/complex-math.c
>> index e42418ad72c2..54dee473a364 100644
>> --- a/clang/test/CodeGen/complex-math.c
>> +++ b/clang/test/CodeGen/complex-math.c
>> @@ -93,14 +93,15 @@ float _Complex mul_float_rc(float a, float _Complex
>> b) {
>>// X86: ret
>>return a * b;
>>  }
>> +
>>  float _Complex mul_float_cc(float _Complex a, float _Complex b) {
>>// X86-LABEL: @mul_float_cc(
>>// X86: %[[AC:[^ ]+]] = fmul
>>// X86: %[[BD:[^ ]+]] = fmul
>>// X86: %[[AD:[^ ]+]] = fmul
>>// X86: %[[BC:[^ ]+]] = fmul
>> -  // X86: %[[RR:[^ ]+]] = fsub float %[[AC]], %[[BD]]
>> -  // X86: %[[RI:[^ ]+]] = fadd float
>> +  // X86: %[[RR:[^ ]+]] = fsub
>> +  // X86: %[[RI:[^ ]+]] = fadd
>>// X86-DAG: %[[AD]]
>>// X86-DAG: ,
>>// X86-DAG: %[[BC]]
>>
>>
>>
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>
>


[clang] 83f4372 - [CodeGen] fix clang test that runs the optimizer pipeline; NFC

2020-02-25 Thread Sanjay Patel via cfe-commits

Author: Sanjay Patel
Date: 2020-02-25T09:13:49-05:00
New Revision: 83f4372f3a708ceaa800feff8b1bd92ae2c3be5f

URL: 
https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f
DIFF: 
https://github.com/llvm/llvm-project/commit/83f4372f3a708ceaa800feff8b1bd92ae2c3be5f.diff

LOG: [CodeGen] fix clang test that runs the optimizer pipeline; NFC

There's already a FIXME note on this file; it can break when the
underlying LLVM behavior changes independently of anything in clang.

Added: 


Modified: 
clang/test/CodeGen/complex-math.c

Removed: 




diff  --git a/clang/test/CodeGen/complex-math.c 
b/clang/test/CodeGen/complex-math.c
index e42418ad72c2..54dee473a364 100644
--- a/clang/test/CodeGen/complex-math.c
+++ b/clang/test/CodeGen/complex-math.c
@@ -93,14 +93,15 @@ float _Complex mul_float_rc(float a, float _Complex b) {
   // X86: ret
   return a * b;
 }
+
 float _Complex mul_float_cc(float _Complex a, float _Complex b) {
   // X86-LABEL: @mul_float_cc(
   // X86: %[[AC:[^ ]+]] = fmul
   // X86: %[[BD:[^ ]+]] = fmul
   // X86: %[[AD:[^ ]+]] = fmul
   // X86: %[[BC:[^ ]+]] = fmul
-  // X86: %[[RR:[^ ]+]] = fsub float %[[AC]], %[[BD]]
-  // X86: %[[RI:[^ ]+]] = fadd float
+  // X86: %[[RR:[^ ]+]] = fsub
+  // X86: %[[RI:[^ ]+]] = fadd
   // X86-DAG: %[[AD]]
   // X86-DAG: ,
   // X86-DAG: %[[BC]]





r373847 - [InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform (PR43501)

2019-10-06 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Oct  6 06:08:08 2019
New Revision: 373847

URL: http://llvm.org/viewvc/llvm-project?rev=373847&view=rev
Log:
[InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform 
(PR43501)

https://bugs.llvm.org/show_bug.cgi?id=43501
We can't declare a GEP 'inbounds' in general. But we may salvage that 
information if
we have known dereferenceable bytes on the source pointer.

Differential Revision: https://reviews.llvm.org/D68244
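
[Editor's note] The salvage condition described above ("we may salvage that
information if we have known dereferenceable bytes on the source pointer")
can be modeled as a small predicate. This is a simplified sketch of the idea,
not the actual InstCombine code: a constant-offset GEP may keep 'inbounds'
only when the offset stays within the region the source pointer is known to
be dereferenceable for (one-past-the-end is also allowed by the IR rules).

```python
def can_mark_gep_inbounds(offset_bytes: int, deref_bytes: int) -> bool:
    """Simplified model of the salvage check: a GEP with a known constant
    byte offset from a pointer dereferenceable for deref_bytes stays within
    the object (or one past its end) iff 0 <= offset <= deref_bytes."""
    return 0 <= offset_bytes <= deref_bytes

# Offset 0 (the bitcast-to-GEP case) is salvageable whenever the pointer
# is dereferenceable at all; offsets past the known region are not.
assert can_mark_gep_inbounds(0, 1)
assert not can_mark_gep_inbounds(8, 4)
```

Without dereferenceability information, the transform must drop 'inbounds',
which is exactly what the test diffs below show.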

Modified:
cfe/trunk/test/CodeGen/aapcs-bitfield.c
cfe/trunk/test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp
cfe/trunk/test/CodeGenCXX/microsoft-abi-typeid.cpp

Modified: cfe/trunk/test/CodeGen/aapcs-bitfield.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/aapcs-bitfield.c?rev=373847&r1=373846&r2=373847&view=diff
==
--- cfe/trunk/test/CodeGen/aapcs-bitfield.c (original)
+++ cfe/trunk/test/CodeGen/aapcs-bitfield.c Sun Oct  6 06:08:08 2019
@@ -8,7 +8,7 @@ struct st0 {
 
 // LE-LABEL: @st0_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST0:%.*]], 
%struct.st0* [[M:%.*]], i32 0, i32 0
+// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST0:%.*]], %struct.st0* 
[[M:%.*]], i32 0, i32 0
 // LE-NEXT:[[BF_LOAD:%.*]] = load i8, i8* [[TMP0]], align 2
 // LE-NEXT:[[BF_SHL:%.*]] = shl i8 [[BF_LOAD]], 1
 // LE-NEXT:[[BF_ASHR:%.*]] = ashr exact i8 [[BF_SHL]], 1
@@ -17,7 +17,7 @@ struct st0 {
 //
 // BE-LABEL: @st0_check_load(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST0:%.*]], 
%struct.st0* [[M:%.*]], i32 0, i32 0
+// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST0:%.*]], %struct.st0* 
[[M:%.*]], i32 0, i32 0
 // BE-NEXT:[[BF_LOAD:%.*]] = load i8, i8* [[TMP0]], align 2
 // BE-NEXT:[[BF_ASHR:%.*]] = ashr i8 [[BF_LOAD]], 1
 // BE-NEXT:[[CONV:%.*]] = sext i8 [[BF_ASHR]] to i32
@@ -29,7 +29,7 @@ int st0_check_load(struct st0 *m) {
 
 // LE-LABEL: @st0_check_store(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST0:%.*]], 
%struct.st0* [[M:%.*]], i32 0, i32 0
+// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST0:%.*]], %struct.st0* 
[[M:%.*]], i32 0, i32 0
 // LE-NEXT:[[BF_LOAD:%.*]] = load i8, i8* [[TMP0]], align 2
 // LE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], -128
 // LE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 1
@@ -38,7 +38,7 @@ int st0_check_load(struct st0 *m) {
 //
 // BE-LABEL: @st0_check_store(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST0:%.*]], 
%struct.st0* [[M:%.*]], i32 0, i32 0
+// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST0:%.*]], %struct.st0* 
[[M:%.*]], i32 0, i32 0
 // BE-NEXT:[[BF_LOAD:%.*]] = load i8, i8* [[TMP0]], align 2
 // BE-NEXT:[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], 1
 // BE-NEXT:[[BF_SET:%.*]] = or i8 [[BF_CLEAR]], 2
@@ -56,7 +56,7 @@ struct st1 {
 
 // LE-LABEL: @st1_check_load(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST1:%.*]], 
%struct.st1* [[M:%.*]], i32 0, i32 0
+// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST1:%.*]], %struct.st1* 
[[M:%.*]], i32 0, i32 0
 // LE-NEXT:[[BF_LOAD:%.*]] = load i16, i16* [[TMP0]], align 4
 // LE-NEXT:[[BF_ASHR:%.*]] = ashr i16 [[BF_LOAD]], 10
 // LE-NEXT:[[CONV:%.*]] = sext i16 [[BF_ASHR]] to i32
@@ -64,7 +64,7 @@ struct st1 {
 //
 // BE-LABEL: @st1_check_load(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST1:%.*]], 
%struct.st1* [[M:%.*]], i32 0, i32 0
+// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST1:%.*]], %struct.st1* 
[[M:%.*]], i32 0, i32 0
 // BE-NEXT:[[BF_LOAD:%.*]] = load i16, i16* [[TMP0]], align 4
 // BE-NEXT:[[BF_SHL:%.*]] = shl i16 [[BF_LOAD]], 10
 // BE-NEXT:[[BF_ASHR:%.*]] = ashr exact i16 [[BF_SHL]], 10
@@ -77,7 +77,7 @@ int st1_check_load(struct st1 *m) {
 
 // LE-LABEL: @st1_check_store(
 // LE-NEXT:  entry:
-// LE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST1:%.*]], 
%struct.st1* [[M:%.*]], i32 0, i32 0
+// LE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST1:%.*]], %struct.st1* 
[[M:%.*]], i32 0, i32 0
 // LE-NEXT:[[BF_LOAD:%.*]] = load i16, i16* [[TMP0]], align 4
 // LE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], 1023
 // LE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 1024
@@ -86,7 +86,7 @@ int st1_check_load(struct st1 *m) {
 //
 // BE-LABEL: @st1_check_store(
 // BE-NEXT:  entry:
-// BE-NEXT:[[TMP0:%.*]] = getelementptr inbounds [[STRUCT_ST1:%.*]], 
%struct.st1* [[M:%.*]], i32 0, i32 0
+// BE-NEXT:[[TMP0:%.*]] = getelementptr [[STRUCT_ST1:%.*]], %struct.st1* 
[[M:%.*]], i32 0, i32 0
 // BE-NEXT:[[BF_LOAD:%.*]] = load i16, i16* [[TMP0]], align 4
 // BE-NEXT:[[BF_CLEAR:%.*]] = and i16 [[BF_LOAD]], -64
 // BE-NEXT:[[BF_SET:%.*]] = or i16 [[BF_CLEAR]], 1
@@ -151,7 +151,7 @@ s

r367447 - [InstCombine] canonicalize fneg before fmul/fdiv

2019-07-31 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Wed Jul 31 09:53:22 2019
New Revision: 367447

URL: http://llvm.org/viewvc/llvm-project?rev=367447&view=rev
Log:
[InstCombine] canonicalize fneg before fmul/fdiv

Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it
easier to implement the transforms (and possibly other fneg transforms) in
1 place because we can always start the pattern match from fneg (either the
legacy binop or the new unop).

There's a secondary practical benefit seen in PR21914 and PR42681:
https://bugs.llvm.org/show_bug.cgi?id=21914
https://bugs.llvm.org/show_bug.cgi?id=42681
...hoisting fneg rather than sinking seems to play nicer with LICM in IR
(although this change may expose analysis holes in the other direction).

1. The instcombine test changes show the expected neutral IR diffs from
   reversing the order.

2. The reassociation tests show that we were missing an optimization
   opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says
   that all of these transforms are allowed (regardless of binop/unop
   fneg version) because:

   "For all other operations [besides copy/abs/negate/copysign], this
   standard does not specify the sign bit of a NaN result."
   In all of these transforms, we always have some other binop
   (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a
   potential intermediate NaN operand.
   (If that interpretation is wrong, then we must already have a bug in
   the existing transforms?)

3. The clang tests shouldn't exist as-is, but that's effectively a
   revert of rL367149 (the test broke with an extension of the
   pre-existing fneg canonicalization in rL367146).

Differential Revision: https://reviews.llvm.org/D65399
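
[Editor's note] The sign-bit argument in point 2 can be checked numerically.
This Python sketch (not part of the patch) confirms that hoisting the
negation across an fmul gives bit-identical results for ordinary values,
including signed zero; NaN is the one case where the sign bit may differ,
and that is precisely the freedom IEEE-754 grants per the quote above.

```python
import math

def sink_neg(a: float, b: float) -> float:
    return -(a * b)          # fneg (fmul a, b)

def hoist_neg(a: float, b: float) -> float:
    return (-a) * b          # fmul (fneg a), b

for a, b in [(2.0, 3.0), (0.0, 5.0), (-0.0, 5.0), (1.5, -4.0)]:
    x, y = sink_neg(a, b), hoist_neg(a, b)
    # Compare both the value and the sign bit (catches -0.0 vs +0.0).
    assert x == y and math.copysign(1.0, x) == math.copysign(1.0, y)
```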

Modified:
cfe/trunk/test/CodeGen/complex-math.c

Modified: cfe/trunk/test/CodeGen/complex-math.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/complex-math.c?rev=367447&r1=367446&r2=367447&view=diff
==
--- cfe/trunk/test/CodeGen/complex-math.c (original)
+++ cfe/trunk/test/CodeGen/complex-math.c Wed Jul 31 09:53:22 2019
@@ -148,10 +148,11 @@ float _Complex div_float_rc(float a, flo
   // AARCH64-FASTMATH: [[CCpDD:%.*]] = fadd fast float [[CC]], [[DD]]
   //
   // BC = 0
-  // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast float [[D]], %a
-  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fdiv fast float [[AC]], [[CCpDD]]
-  // AARCH64-FASTMATH: [[DIV:%.*]] = fdiv fast float [[AD]], [[CCpDD]]
-  // AARCH64-FASTMATH: fsub fast float -0.00e+00, [[DIV]]
+  // AARCH64-FASTMATH: [[NEGA:%.*]] = fsub fast float -0.00e+00, %a
+  // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast float  [[D]], [[NEGA]]
+  //
+  // AARCH64-FASTMATH: fdiv fast float [[AC]], [[CCpDD]]
+  // AARCH64-FASTMATH: fdiv fast float [[AD]], [[CCpDD]]
   // AARCH64-FASTMATH: ret
   return a / b;
 }
@@ -325,10 +326,11 @@ double _Complex div_double_rc(double a,
   // AARCH64-FASTMATH: [[CCpDD:%.*]] = fadd fast double [[CC]], [[DD]]
   //
   // BC = 0
-  // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast double [[D]], %a
-  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fdiv fast double [[AC]], [[CCpDD]]
-  // AARCH64-FASTMATH: [[DIV:%.*]] = fdiv fast double [[AD]], [[CCpDD]]
-  // AARCH64-FASTMATH: fsub fast double -0.00e+00, [[DIV]]
+  // AARCH64-FASTMATH: [[NEGA:%.*]] = fsub fast double -0.00e+00, %a
+  // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast double [[D]], [[NEGA]]
+  //
+  // AARCH64-FASTMATH: fdiv fast double [[AC]], [[CCpDD]]
+  // AARCH64-FASTMATH: fdiv fast double [[AD]], [[CCpDD]]
   // AARCH64-FASTMATH: ret
   return a / b;
 }
@@ -520,10 +522,11 @@ long double _Complex div_long_double_rc(
   // AARCH64-FASTMATH: [[CCpDD:%.*]] = fadd fast fp128 [[CC]], [[DD]]
   //
   // BC = 0
-  // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast fp128 [[D]], %a
-  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fdiv fast fp128 [[AC]], [[CCpDD]]
-  // AARCH64-FASTMATH: [[DIV:%.*]] = fdiv fast fp128 [[AD]], [[CCpDD]]
-  // AARCH64-FASTMATH: fsub fast fp128 0xL8000, 
[[DIV]]
+  // AARCH64-FASTMATH: [[NEGA:%.*]] = fsub fast fp128 
0xL8000, %a
+  // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast fp128 [[D]], [[NEGA]]
+  //
+  // AARCH64-FASTMATH: fdiv fast fp128 [[AC]], [[CCpDD]]
+  // AARCH64-FASTMATH: fdiv fast fp128 [[AD]], [[CCpDD]]
   // AARCH64-FASTMATH: ret
   return a / b;
 }




r367149 - [CodeGen] fix test that broke with rL367146

2019-07-26 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Jul 26 13:36:57 2019
New Revision: 367149

URL: http://llvm.org/viewvc/llvm-project?rev=367149&view=rev
Log:
[CodeGen] fix test that broke with rL367146

This should be fixed properly to not depend on LLVM (so much).

Modified:
cfe/trunk/test/CodeGen/complex-math.c

Modified: cfe/trunk/test/CodeGen/complex-math.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/complex-math.c?rev=367149&r1=367148&r2=367149&view=diff
==
--- cfe/trunk/test/CodeGen/complex-math.c (original)
+++ cfe/trunk/test/CodeGen/complex-math.c Fri Jul 26 13:36:57 2019
@@ -1,3 +1,5 @@
+// FIXME: This file should not be using -O1; that makes it depend on the 
entire LLVM IR optimizer.
+
 // RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple x86_64-unknown-unknown -o - | FileCheck %s --check-prefix=X86
 // RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple x86_64-pc-win64 -o - | FileCheck %s --check-prefix=X86
 // RUN: %clang_cc1 %s -O1 -fno-experimental-new-pass-manager -emit-llvm 
-triple i686-unknown-unknown -o - | FileCheck %s --check-prefix=X86
@@ -147,10 +149,9 @@ float _Complex div_float_rc(float a, flo
   //
   // BC = 0
   // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast float [[D]], %a
-  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fsub fast float -0.00e+00, [[AD]]
-  //
-  // AARCH64-FASTMATH: fdiv fast float [[AC]], [[CCpDD]]
-  // AARCH64-FASTMATH: fdiv fast float [[BCmAD]], [[CCpDD]]
+  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fdiv fast float [[AC]], [[CCpDD]]
+  // AARCH64-FASTMATH: [[DIV:%.*]] = fdiv fast float [[AD]], [[CCpDD]]
+  // AARCH64-FASTMATH: fsub fast float -0.00e+00, [[DIV]]
   // AARCH64-FASTMATH: ret
   return a / b;
 }
@@ -325,10 +326,9 @@ double _Complex div_double_rc(double a,
   //
   // BC = 0
   // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast double [[D]], %a
-  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fsub fast double -0.00e+00, [[AD]]
-  //
-  // AARCH64-FASTMATH: fdiv fast double [[AC]], [[CCpDD]]
-  // AARCH64-FASTMATH: fdiv fast double [[BCmAD]], [[CCpDD]]
+  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fdiv fast double [[AC]], [[CCpDD]]
+  // AARCH64-FASTMATH: [[DIV:%.*]] = fdiv fast double [[AD]], [[CCpDD]]
+  // AARCH64-FASTMATH: fsub fast double -0.00e+00, [[DIV]]
   // AARCH64-FASTMATH: ret
   return a / b;
 }
@@ -521,10 +521,9 @@ long double _Complex div_long_double_rc(
   //
   // BC = 0
   // AARCH64-FASTMATH: [[AD:%.*]] = fmul fast fp128 [[D]], %a
-  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fsub fast fp128 
0xL8000, [[AD]]
-  //
-  // AARCH64-FASTMATH: fdiv fast fp128 [[AC]], [[CCpDD]]
-  // AARCH64-FASTMATH: fdiv fast fp128 [[BCmAD]], [[CCpDD]]
+  // AARCH64-FASTMATH: [[BCmAD:%.*]] = fdiv fast fp128 [[AC]], [[CCpDD]]
+  // AARCH64-FASTMATH: [[DIV:%.*]] = fdiv fast fp128 [[AD]], [[CCpDD]]
+  // AARCH64-FASTMATH: fsub fast fp128 0xL8000, 
[[DIV]]
   // AARCH64-FASTMATH: ret
   return a / b;
 }




r357366 - [InstCombine] canonicalize select shuffles by commuting

2019-03-31 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Mar 31 08:01:30 2019
New Revision: 357366

URL: http://llvm.org/viewvc/llvm-project?rev=357366&view=rev
Log:
[InstCombine] canonicalize select shuffles by commuting

In PR41304:
https://bugs.llvm.org/show_bug.cgi?id=41304
...we have a case where we want to fold a binop of select-shuffle (blended) values.

Rather than try to match commuted variants of the pattern, we can canonicalize the shuffles and check for mask equality with commuted operands.

We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a special case that the backend is required to handle because we already canonicalize vector select to this shuffle form.

So there should be no codegen difference from this change. It's possible that this improves CSE in IR though.

Differential Revision: https://reviews.llvm.org/D60016

Modified:
cfe/trunk/test/CodeGen/avx-cmp-builtins.c
cfe/trunk/test/CodeGen/avx-shuffle-builtins.c

Modified: cfe/trunk/test/CodeGen/avx-cmp-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx-cmp-builtins.c?rev=357366&r1=357365&r2=357366&view=diff
==
--- cfe/trunk/test/CodeGen/avx-cmp-builtins.c (original)
+++ cfe/trunk/test/CodeGen/avx-cmp-builtins.c Sun Mar 31 08:01:30 2019
@@ -22,25 +22,25 @@ __m128d test_cmp_ss(__m128 a, __m128 b)
 
 __m128 test_cmpgt_ss(__m128 a, __m128 b) {
   // CHECK: @llvm.x86.sse.cmp.ss({{.*}}, i8 1)
-  // CHECK: shufflevector <{{.*}}, <4 x i32> 
+  // CHECK: shufflevector <{{.*}}, <4 x i32> 
   return _mm_cmpgt_ss(a, b);
 }
 
 __m128 test_cmpge_ss(__m128 a, __m128 b) {
   // CHECK: @llvm.x86.sse.cmp.ss({{.*}}, i8 2)
-  // CHECK: shufflevector <{{.*}}, <4 x i32> 
+  // CHECK: shufflevector <{{.*}}, <4 x i32> 
   return _mm_cmpge_ss(a, b);
 }
 
 __m128 test_cmpngt_ss(__m128 a, __m128 b) {
   // CHECK: @llvm.x86.sse.cmp.ss({{.*}}, i8 5)
-  // CHECK: shufflevector <{{.*}}, <4 x i32> 
+  // CHECK: shufflevector <{{.*}}, <4 x i32> 
   return _mm_cmpngt_ss(a, b);
 }
 
 __m128 test_cmpnge_ss(__m128 a, __m128 b) {
   // CHECK: @llvm.x86.sse.cmp.ss({{.*}}, i8 6)
-  // CHECK: shufflevector <{{.*}}, <4 x i32> 
+  // CHECK: shufflevector <{{.*}}, <4 x i32> 
   return _mm_cmpnge_ss(a, b);
 }
 

Modified: cfe/trunk/test/CodeGen/avx-shuffle-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx-shuffle-builtins.c?rev=357366&r1=357365&r2=357366&view=diff
==
--- cfe/trunk/test/CodeGen/avx-shuffle-builtins.c (original)
+++ cfe/trunk/test/CodeGen/avx-shuffle-builtins.c Sun Mar 31 08:01:30 2019
@@ -91,19 +91,19 @@ test_mm256_broadcast_ss(float const *__a
 
 __m256 test_mm256_insertf128_ps_0(__m256 a, __m128 b) {
   // CHECK-LABEL: @test_mm256_insertf128_ps_0
-  // CHECK: shufflevector{{.*}}
+  // CHECK: shufflevector{{.*}}
   return _mm256_insertf128_ps(a, b, 0);
 }
 
 __m256d test_mm256_insertf128_pd_0(__m256d a, __m128d b) {
   // CHECK-LABEL: @test_mm256_insertf128_pd_0
-  // CHECK: shufflevector{{.*}}
+  // CHECK: shufflevector{{.*}}
   return _mm256_insertf128_pd(a, b, 0);
 }
 
 __m256i test_mm256_insertf128_si256_0(__m256i a, __m128i b) {
   // CHECK-LABEL: @test_mm256_insertf128_si256_0
-  // CHECK: shufflevector{{.*}}
+  // CHECK: shufflevector{{.*}}
   return _mm256_insertf128_si256(a, b, 0);
 }
 


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r347527 - [CodeGen] translate MS rotate builtins to LLVM funnel-shift intrinsics

2018-11-25 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Nov 25 09:53:16 2018
New Revision: 347527

URL: http://llvm.org/viewvc/llvm-project?rev=347527&view=rev
Log:
[CodeGen] translate MS rotate builtins to LLVM funnel-shift intrinsics

This was originally part of:
D50924

and should resolve PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387

...but it was reverted because some bots using a gcc host compiler 
would crash for unknown reasons with this included in the patch. 
Trying again now to see if that's still a problem.

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=347527&r1=347526&r2=347527&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Sun Nov 25 09:53:16 2018
@@ -1820,46 +1820,6 @@ RValue CodeGenFunction::EmitBuiltinExpr(
  "cast");
 return RValue::get(Result);
   }
-  case Builtin::BI_rotr8:
-  case Builtin::BI_rotr16:
-  case Builtin::BI_rotr:
-  case Builtin::BI_lrotr:
-  case Builtin::BI_rotr64: {
-Value *Val = EmitScalarExpr(E->getArg(0));
-Value *Shift = EmitScalarExpr(E->getArg(1));
-
-llvm::Type *ArgType = Val->getType();
-Shift = Builder.CreateIntCast(Shift, ArgType, false);
-unsigned ArgWidth = ArgType->getIntegerBitWidth();
-Value *Mask = llvm::ConstantInt::get(ArgType, ArgWidth - 1);
-
-Value *RightShiftAmt = Builder.CreateAnd(Shift, Mask);
-Value *RightShifted = Builder.CreateLShr(Val, RightShiftAmt);
-Value *LeftShiftAmt = Builder.CreateAnd(Builder.CreateNeg(Shift), Mask);
-Value *LeftShifted = Builder.CreateShl(Val, LeftShiftAmt);
-Value *Result = Builder.CreateOr(LeftShifted, RightShifted);
-return RValue::get(Result);
-  }
-  case Builtin::BI_rotl8:
-  case Builtin::BI_rotl16:
-  case Builtin::BI_rotl:
-  case Builtin::BI_lrotl:
-  case Builtin::BI_rotl64: {
-Value *Val = EmitScalarExpr(E->getArg(0));
-Value *Shift = EmitScalarExpr(E->getArg(1));
-
-llvm::Type *ArgType = Val->getType();
-Shift = Builder.CreateIntCast(Shift, ArgType, false);
-unsigned ArgWidth = ArgType->getIntegerBitWidth();
-Value *Mask = llvm::ConstantInt::get(ArgType, ArgWidth - 1);
-
-Value *LeftShiftAmt = Builder.CreateAnd(Shift, Mask);
-Value *LeftShifted = Builder.CreateShl(Val, LeftShiftAmt);
-Value *RightShiftAmt = Builder.CreateAnd(Builder.CreateNeg(Shift), Mask);
-Value *RightShifted = Builder.CreateLShr(Val, RightShiftAmt);
-Value *Result = Builder.CreateOr(LeftShifted, RightShifted);
-return RValue::get(Result);
-  }
   case Builtin::BI__builtin_unpredictable: {
 // Always return the argument of __builtin_unpredictable. LLVM does not
 // handle this builtin. Metadata for this builtin should be added directly
@@ -1918,12 +1878,22 @@ RValue CodeGenFunction::EmitBuiltinExpr(
   case Builtin::BI__builtin_rotateleft16:
   case Builtin::BI__builtin_rotateleft32:
   case Builtin::BI__builtin_rotateleft64:
+  case Builtin::BI_rotl8: // Microsoft variants of rotate left
+  case Builtin::BI_rotl16:
+  case Builtin::BI_rotl:
+  case Builtin::BI_lrotl:
+  case Builtin::BI_rotl64:
 return emitRotate(E, false);
 
   case Builtin::BI__builtin_rotateright8:
   case Builtin::BI__builtin_rotateright16:
   case Builtin::BI__builtin_rotateright32:
   case Builtin::BI__builtin_rotateright64:
+  case Builtin::BI_rotr8: // Microsoft variants of rotate right
+  case Builtin::BI_rotr16:
+  case Builtin::BI_rotr:
+  case Builtin::BI_lrotr:
+  case Builtin::BI_rotr64:
 return emitRotate(E, true);
 
   case Builtin::BI__builtin_constant_p: {

Modified: cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c?rev=347527&r1=347526&r2=347527&view=diff
==
--- cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c (original)
+++ cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c Sun Nov 25 09:53:16 2018
@@ -30,66 +30,36 @@ unsigned char test_rotl8(unsigned char v
   return _rotl8(value, shift);
 }
 // CHECK: i8 @test_rotl8
-// CHECK:   [[LSHIFT:%[0-9]+]] = and i8 [[SHIFT:%[0-9]+]], 7
-// CHECK:   [[HIGH:%[0-9]+]] = shl i8 [[VALUE:%[0-9]+]], [[LSHIFT]]
-// CHECK:   [[NEGATE:%[0-9]+]] = sub i8 0, [[SHIFT]]
-// CHECK:   [[RSHIFT:%[0-9]+]] = and i8 [[NEGATE]], 7
-// CHECK:   [[LOW:%[0-9]+]] = lshr i8 [[VALUE]], [[RSHIFT]]
-// CHECK:   [[RESULT:%[0-9]+]] = or i8 [[HIGH]], [[LOW]]
-// CHECK:   ret i8 [[RESULT]]
-// CHECK  }
+// CHECK:   [[R:%.*]] = call i8 @llvm.fshl.i8(i8 [[X:%.*]], i8 [[X]], i8 [[Y:%.*]])
+// CHECK:   ret i8 [[R]]
 
 unsigned short test_rotl16(unsigned short value, unsigned char shift) {
   return _rotl16(value, shift);
 }
 // CHECK: i16 @

r340142 - [CodeGen] add test file that should have been included with r340141

2018-08-19 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Aug 19 10:32:56 2018
New Revision: 340142

URL: http://llvm.org/viewvc/llvm-project?rev=340142&view=rev
Log:
[CodeGen] add test file that should have been included with r340141


Added:
cfe/trunk/test/CodeGen/builtin-rotate.c

Added: cfe/trunk/test/CodeGen/builtin-rotate.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/builtin-rotate.c?rev=340142&view=auto
==
--- cfe/trunk/test/CodeGen/builtin-rotate.c (added)
+++ cfe/trunk/test/CodeGen/builtin-rotate.c Sun Aug 19 10:32:56 2018
@@ -0,0 +1,66 @@
+// RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s
+
+unsigned char rotl8(unsigned char x, unsigned char y) {
+// CHECK-LABEL: rotl8
+// CHECK: [[F:%.*]] = call i8 @llvm.fshl.i8(i8 [[X:%.*]], i8 [[X]], i8 [[Y:%.*]])
+// CHECK-NEXT: ret i8 [[F]]
+
+  return __builtin_rotateleft8(x, y);
+}
+
+short rotl16(short x, short y) {
+// CHECK-LABEL: rotl16
+// CHECK: [[F:%.*]] = call i16 @llvm.fshl.i16(i16 [[X:%.*]], i16 [[X]], i16 [[Y:%.*]])
+// CHECK-NEXT: ret i16 [[F]]
+
+  return __builtin_rotateleft16(x, y);
+}
+
+int rotl32(int x, unsigned int y) {
+// CHECK-LABEL: rotl32
+// CHECK: [[F:%.*]] = call i32 @llvm.fshl.i32(i32 [[X:%.*]], i32 [[X]], i32 [[Y:%.*]])
+// CHECK-NEXT: ret i32 [[F]]
+
+  return __builtin_rotateleft32(x, y);
+}
+
+unsigned long long rotl64(unsigned long long x, long long y) {
+// CHECK-LABEL: rotl64
+// CHECK: [[F:%.*]] = call i64 @llvm.fshl.i64(i64 [[X:%.*]], i64 [[X]], i64 [[Y:%.*]])
+// CHECK-NEXT: ret i64 [[F]]
+
+  return __builtin_rotateleft64(x, y);
+}
+
+char rotr8(char x, char y) {
+// CHECK-LABEL: rotr8
+// CHECK: [[F:%.*]] = call i8 @llvm.fshr.i8(i8 [[X:%.*]], i8 [[X]], i8 [[Y:%.*]])
+// CHECK-NEXT: ret i8 [[F]]
+
+  return __builtin_rotateright8(x, y);
+}
+
+unsigned short rotr16(unsigned short x, unsigned short y) {
+// CHECK-LABEL: rotr16
+// CHECK: [[F:%.*]] = call i16 @llvm.fshr.i16(i16 [[X:%.*]], i16 [[X]], i16 [[Y:%.*]])
+// CHECK-NEXT: ret i16 [[F]]
+
+  return __builtin_rotateright16(x, y);
+}
+
+unsigned int rotr32(unsigned int x, int y) {
+// CHECK-LABEL: rotr32
+// CHECK: [[F:%.*]] = call i32 @llvm.fshr.i32(i32 [[X:%.*]], i32 [[X]], i32 [[Y:%.*]])
+// CHECK-NEXT: ret i32 [[F]]
+
+  return __builtin_rotateright32(x, y);
+}
+
+long long rotr64(long long x, unsigned long long y) {
+// CHECK-LABEL: rotr64
+// CHECK: [[F:%.*]] = call i64 @llvm.fshr.i64(i64 [[X:%.*]], i64 [[X]], i64 [[Y:%.*]])
+// CHECK-NEXT: ret i64 [[F]]
+
+  return __builtin_rotateright64(x, y);
+}
+


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r340141 - [CodeGen] add rotate builtins that map to LLVM funnel shift

2018-08-19 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Aug 19 09:50:30 2018
New Revision: 340141

URL: http://llvm.org/viewvc/llvm-project?rev=340141&view=rev
Log:
[CodeGen] add rotate builtins that map to LLVM funnel shift

This is a partial retry of rL340137 (reverted at rL340138 because of gcc host compiler crashing) with 1 change:
Remove the changes to make microsoft builtins also use the LLVM intrinsics.

This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang (when both halves of a funnel shift are the same value, it's a rotate).

We're free to name these as we want because we're not copying gcc, but if there's some other existing art (eg, the microsoft ops) that we want to replicate, we can change the names.

The funnel shift intrinsics were added here:
https://reviews.llvm.org/D49242

With improved codegen in:
https://reviews.llvm.org/rL337966
https://reviews.llvm.org/rL339359

And basic IR optimization added in:
https://reviews.llvm.org/rL338218
https://reviews.llvm.org/rL340022

...so these are expected to produce asm output that's equal to or better than the multi-instruction alternatives using primitive C/IR ops.

In the motivating loop example from PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387#c7
...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source.

Differential Revision: https://reviews.llvm.org/D50924

Modified:
cfe/trunk/docs/LanguageExtensions.rst
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/lib/CodeGen/CodeGenFunction.h

Modified: cfe/trunk/docs/LanguageExtensions.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/LanguageExtensions.rst?rev=340141&r1=340140&r2=340141&view=diff
==
--- cfe/trunk/docs/LanguageExtensions.rst (original)
+++ cfe/trunk/docs/LanguageExtensions.rst Sun Aug 19 09:50:30 2018
@@ -1739,6 +1739,70 @@ The '``__builtin_bitreverse``' family of
 the bitpattern of an integer value; for example ``0b10110110`` becomes
 ``0b01101101``.
 
+``__builtin_rotateleft``
+
+
+* ``__builtin_rotateleft8``
+* ``__builtin_rotateleft16``
+* ``__builtin_rotateleft32``
+* ``__builtin_rotateleft64``
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_rotateleft32(x, y)
+
+**Examples**:
+
+.. code-block:: c++
+
+  uint8_t rot_x = __builtin_rotateleft8(x, y);
+  uint16_t rot_x = __builtin_rotateleft16(x, y);
+  uint32_t rot_x = __builtin_rotateleft32(x, y);
+  uint64_t rot_x = __builtin_rotateleft64(x, y);
+
+**Description**:
+
+The '``__builtin_rotateleft``' family of builtins is used to rotate
+the bits in the first argument by the amount in the second argument. 
+For example, ``0b1110`` rotated left by 11 becomes ``0b00110100``.
+The shift value is treated as an unsigned amount modulo the size of
+the arguments. Both arguments and the result have the bitwidth specified
+by the name of the builtin.
+
+``__builtin_rotateright``
+_
+
+* ``__builtin_rotateright8``
+* ``__builtin_rotateright16``
+* ``__builtin_rotateright32``
+* ``__builtin_rotateright64``
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_rotateright32(x, y)
+
+**Examples**:
+
+.. code-block:: c++
+
+  uint8_t rot_x = __builtin_rotateright8(x, y);
+  uint16_t rot_x = __builtin_rotateright16(x, y);
+  uint32_t rot_x = __builtin_rotateright32(x, y);
+  uint64_t rot_x = __builtin_rotateright64(x, y);
+
+**Description**:
+
+The '``__builtin_rotateright``' family of builtins is used to rotate
+the bits in the first argument by the amount in the second argument. 
+For example, ``0b1110`` rotated right by 3 becomes ``0b1101``.
+The shift value is treated as an unsigned amount modulo the size of
+the arguments. Both arguments and the result have the bitwidth specified
+by the name of the builtin.
+
 ``__builtin_unreachable``
 -
 

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=340141&r1=340140&r2=340141&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Sun Aug 19 09:50:30 2018
@@ -428,6 +428,15 @@ BUILTIN(__builtin_bitreverse16, "UsUs",
 BUILTIN(__builtin_bitreverse32, "UiUi", "nc")
 BUILTIN(__builtin_bitreverse64, "ULLiULLi", "nc")
 
+BUILTIN(__builtin_rotateleft8, "UcUcUc", "nc")
+BUILTIN(__builtin_rotateleft16, "UsUsUs", "nc")
+BUILTIN(__builtin_rotateleft32, "UiUiUi", "nc")
+BUILTIN(__builtin_rotateleft64, "ULLiULLiULLi", "nc")
+BUILTIN(__builtin_rotateright8, "UcUcUc", "nc")
+BUILTIN(__builtin_rotateright16, "UsUsUs", "nc")
+BUILTIN(__builtin_rotateright32, "UiUiUi", "nc")
+BUILTIN(__builtin_rotateright64, "ULLiULLiULLi", "nc")
+
 // Random GCC builtins
 BUILTIN(__bui

r340138 - revert r340137: [CodeGen] add rotate builtins

2018-08-19 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Aug 19 08:31:42 2018
New Revision: 340138

URL: http://llvm.org/viewvc/llvm-project?rev=340138&view=rev
Log:
revert r340137: [CodeGen] add rotate builtins

At least a couple of bots (gcc host compiler on PPC only?) are showing the 
compiler dying while trying to compile.

Removed:
cfe/trunk/test/CodeGen/builtin-rotate.c
Modified:
cfe/trunk/docs/LanguageExtensions.rst
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/lib/CodeGen/CodeGenFunction.h
cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c

Modified: cfe/trunk/docs/LanguageExtensions.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/LanguageExtensions.rst?rev=340138&r1=340137&r2=340138&view=diff
==
--- cfe/trunk/docs/LanguageExtensions.rst (original)
+++ cfe/trunk/docs/LanguageExtensions.rst Sun Aug 19 08:31:42 2018
@@ -1739,70 +1739,6 @@ The '``__builtin_bitreverse``' family of
 the bitpattern of an integer value; for example ``0b10110110`` becomes
 ``0b01101101``.
 
-``__builtin_rotateleft``
-
-
-* ``__builtin_rotateleft8``
-* ``__builtin_rotateleft16``
-* ``__builtin_rotateleft32``
-* ``__builtin_rotateleft64``
-
-**Syntax**:
-
-.. code-block:: c++
-
- __builtin_rotateleft32(x, y)
-
-**Examples**:
-
-.. code-block:: c++
-
-  uint8_t rot_x = __builtin_rotateleft8(x, y);
-  uint16_t rot_x = __builtin_rotateleft16(x, y);
-  uint32_t rot_x = __builtin_rotateleft32(x, y);
-  uint64_t rot_x = __builtin_rotateleft64(x, y);
-
-**Description**:
-
-The '``__builtin_rotateleft``' family of builtins is used to rotate
-the bits in the first argument by the amount in the second argument. 
-For example, ``0b1110`` rotated left by 11 becomes ``0b00110100``.
-The shift value is treated as an unsigned amount modulo the size of
-the arguments. Both arguments and the result have the bitwidth specified
-by the name of the builtin.
-
-``__builtin_rotateright``
-_
-
-* ``__builtin_rotateright8``
-* ``__builtin_rotateright16``
-* ``__builtin_rotateright32``
-* ``__builtin_rotateright64``
-
-**Syntax**:
-
-.. code-block:: c++
-
- __builtin_rotateright32(x, y)
-
-**Examples**:
-
-.. code-block:: c++
-
-  uint8_t rot_x = __builtin_rotateright8(x, y);
-  uint16_t rot_x = __builtin_rotateright16(x, y);
-  uint32_t rot_x = __builtin_rotateright32(x, y);
-  uint64_t rot_x = __builtin_rotateright64(x, y);
-
-**Description**:
-
-The '``__builtin_rotateright``' family of builtins is used to rotate
-the bits in the first argument by the amount in the second argument. 
-For example, ``0b1110`` rotated right by 3 becomes ``0b1101``.
-The shift value is treated as an unsigned amount modulo the size of
-the arguments. Both arguments and the result have the bitwidth specified
-by the name of the builtin.
-
 ``__builtin_unreachable``
 -
 

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=340138&r1=340137&r2=340138&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Sun Aug 19 08:31:42 2018
@@ -428,15 +428,6 @@ BUILTIN(__builtin_bitreverse16, "UsUs",
 BUILTIN(__builtin_bitreverse32, "UiUi", "nc")
 BUILTIN(__builtin_bitreverse64, "ULLiULLi", "nc")
 
-BUILTIN(__builtin_rotateleft8, "UcUcUc", "nc")
-BUILTIN(__builtin_rotateleft16, "UsUsUs", "nc")
-BUILTIN(__builtin_rotateleft32, "UiUiUi", "nc")
-BUILTIN(__builtin_rotateleft64, "ULLiULLiULLi", "nc")
-BUILTIN(__builtin_rotateright8, "UcUcUc", "nc")
-BUILTIN(__builtin_rotateright16, "UsUsUs", "nc")
-BUILTIN(__builtin_rotateright32, "UiUiUi", "nc")
-BUILTIN(__builtin_rotateright64, "ULLiULLiULLi", "nc")
-
 // Random GCC builtins
 BUILTIN(__builtin_constant_p, "i.", "nctu")
 BUILTIN(__builtin_classify_type, "i.", "nctu")

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=340138&r1=340137&r2=340138&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Sun Aug 19 08:31:42 2018
@@ -1252,21 +1252,6 @@ static llvm::Value *dumpRecord(CodeGenFu
   return Res;
 }
 
-RValue CodeGenFunction::emitRotate(const CallExpr *E, bool IsRotateRight) {
-  llvm::Value *Src = EmitScalarExpr(E->getArg(0));
-  llvm::Value *ShiftAmt = EmitScalarExpr(E->getArg(1));
-
-  // The builtin's shift arg may have a different type than the source arg and
-  // result, but the LLVM intrinsic uses the same type for all values.
-  llvm::Type *Ty = Src->getType();
-  ShiftAmt = Builder.CreateIntCast(ShiftAmt, Ty, false);
-
-  // Rotate is a special case of LLVM f

r340137 - [CodeGen] add/fix rotate builtins that map to LLVM funnel shift (retry)

2018-08-19 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Aug 19 07:44:47 2018
New Revision: 340137

URL: http://llvm.org/viewvc/llvm-project?rev=340137&view=rev
Log:
[CodeGen] add/fix rotate builtins that map to LLVM funnel shift (retry)

This is a retry of rL340135 (reverted at rL340136 because of gcc host compiler crashing) with 2 changes:
1. Move the code into a helper to reduce code duplication (and hopefully work-around the crash).
2. The original commit had a formatting bug in the docs (missing an underscore).

Original commit message:

This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang (when both halves of a funnel shift are the same value, it's a rotate).

We're free to name these as we want because we're not copying gcc, but if there's some other existing art (eg, the microsoft ops that are modified in this patch) that we want to replicate, we can change the names.

The funnel shift intrinsics were added here:
https://reviews.llvm.org/D49242

With improved codegen in:
https://reviews.llvm.org/rL337966
https://reviews.llvm.org/rL339359

And basic IR optimization added in:
https://reviews.llvm.org/rL338218
https://reviews.llvm.org/rL340022

...so these are expected to produce asm output that's equal to or better than the multi-instruction alternatives using primitive C/IR ops.

In the motivating loop example from PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387#c7
...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source.

Differential Revision: https://reviews.llvm.org/D50924

Added:
cfe/trunk/test/CodeGen/builtin-rotate.c
Modified:
cfe/trunk/docs/LanguageExtensions.rst
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/lib/CodeGen/CodeGenFunction.h
cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c

Modified: cfe/trunk/docs/LanguageExtensions.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/LanguageExtensions.rst?rev=340137&r1=340136&r2=340137&view=diff
==
--- cfe/trunk/docs/LanguageExtensions.rst (original)
+++ cfe/trunk/docs/LanguageExtensions.rst Sun Aug 19 07:44:47 2018
@@ -1739,6 +1739,70 @@ The '``__builtin_bitreverse``' family of
 the bitpattern of an integer value; for example ``0b10110110`` becomes
 ``0b01101101``.
 
+``__builtin_rotateleft``
+
+
+* ``__builtin_rotateleft8``
+* ``__builtin_rotateleft16``
+* ``__builtin_rotateleft32``
+* ``__builtin_rotateleft64``
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_rotateleft32(x, y)
+
+**Examples**:
+
+.. code-block:: c++
+
+  uint8_t rot_x = __builtin_rotateleft8(x, y);
+  uint16_t rot_x = __builtin_rotateleft16(x, y);
+  uint32_t rot_x = __builtin_rotateleft32(x, y);
+  uint64_t rot_x = __builtin_rotateleft64(x, y);
+
+**Description**:
+
+The '``__builtin_rotateleft``' family of builtins is used to rotate
+the bits in the first argument by the amount in the second argument. 
+For example, ``0b1110`` rotated left by 11 becomes ``0b00110100``.
+The shift value is treated as an unsigned amount modulo the size of
+the arguments. Both arguments and the result have the bitwidth specified
+by the name of the builtin.
+
+``__builtin_rotateright``
+_
+
+* ``__builtin_rotateright8``
+* ``__builtin_rotateright16``
+* ``__builtin_rotateright32``
+* ``__builtin_rotateright64``
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_rotateright32(x, y)
+
+**Examples**:
+
+.. code-block:: c++
+
+  uint8_t rot_x = __builtin_rotateright8(x, y);
+  uint16_t rot_x = __builtin_rotateright16(x, y);
+  uint32_t rot_x = __builtin_rotateright32(x, y);
+  uint64_t rot_x = __builtin_rotateright64(x, y);
+
+**Description**:
+
+The '``__builtin_rotateright``' family of builtins is used to rotate
+the bits in the first argument by the amount in the second argument. 
+For example, ``0b1110`` rotated right by 3 becomes ``0b1101``.
+The shift value is treated as an unsigned amount modulo the size of
+the arguments. Both arguments and the result have the bitwidth specified
+by the name of the builtin.
+
 ``__builtin_unreachable``
 -
 

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=340137&r1=340136&r2=340137&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Sun Aug 19 07:44:47 2018
@@ -428,6 +428,15 @@ BUILTIN(__builtin_bitreverse16, "UsUs",
 BUILTIN(__builtin_bitreverse32, "UiUi", "nc")
 BUILTIN(__builtin_bitreverse64, "ULLiULLi", "nc")
 
+BUILTIN(__builtin_rotateleft8, "UcUcUc", "nc")
+BUILTIN(__builtin_rotateleft16, "UsUsUs", "nc")
+BUILTIN(__builtin_rotateleft32, "UiUiUi", "nc")
+BUILTIN(__builtin_rotateleft64

r340136 - revert r340135: [CodeGen] add rotate builtins

2018-08-19 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Aug 19 06:48:06 2018
New Revision: 340136

URL: http://llvm.org/viewvc/llvm-project?rev=340136&view=rev
Log:
revert r340135: [CodeGen] add rotate builtins

At least a couple of bots (PPC only?) are showing the compiler dying while 
trying to compile:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/11065/steps/build%20stage%201/logs/stdio
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/18267/steps/build%20stage%201/logs/stdio

Removed:
cfe/trunk/test/CodeGen/builtin-rotate.c
Modified:
cfe/trunk/docs/LanguageExtensions.rst
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c

Modified: cfe/trunk/docs/LanguageExtensions.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/LanguageExtensions.rst?rev=340136&r1=340135&r2=340136&view=diff
==
--- cfe/trunk/docs/LanguageExtensions.rst (original)
+++ cfe/trunk/docs/LanguageExtensions.rst Sun Aug 19 06:48:06 2018
@@ -1739,70 +1739,6 @@ The '``__builtin_bitreverse``' family of
 the bitpattern of an integer value; for example ``0b10110110`` becomes
 ``0b01101101``.
 
-``__builtin_rotateleft``
-
-
-* ``__builtin_rotateleft8``
-* ``__builtin_rotateleft16``
-* ``__builtin_rotateleft32``
-* ``__builtin_rotateleft64``
-
-**Syntax**:
-
-.. code-block:: c++
-
- __builtin_rotateleft32(x, y)
-
-**Examples**:
-
-.. code-block:: c++
-
-  uint8_t rot_x = __builtin_rotateleft8(x, y);
-  uint16_t rot_x = __builtin_rotateleft16(x, y);
-  uint32_t rot_x = __builtin_rotateleft32(x, y);
-  uint64_t rot_x = __builtin_rotateleft64(x, y);
-
-**Description**:
-
-The '``__builtin_rotateleft``' family of builtins is used to rotate
-the bits in the first argument by the amount in the second argument. 
-For example, ``0b1110`` rotated left by 11 becomes ``0b00110100``.
-The shift value is treated as an unsigned amount modulo the size of
-the arguments. Both arguments and the result have the bitwidth specified
-by the name of the builtin.
-
-``__builtin_rotateright``
-
-
-* ``__builtin_rotateright8``
-* ``__builtin_rotateright16``
-* ``__builtin_rotateright32``
-* ``__builtin_rotateright64``
-
-**Syntax**:
-
-.. code-block:: c++
-
- __builtin_rotateright32(x, y)
-
-**Examples**:
-
-.. code-block:: c++
-
-  uint8_t rot_x = __builtin_rotateright8(x, y);
-  uint16_t rot_x = __builtin_rotateright16(x, y);
-  uint32_t rot_x = __builtin_rotateright32(x, y);
-  uint64_t rot_x = __builtin_rotateright64(x, y);
-
-**Description**:
-
-The '``__builtin_rotateright``' family of builtins is used to rotate
-the bits in the first argument by the amount in the second argument. 
-For example, ``0b1110`` rotated right by 3 becomes ``0b1101``.
-The shift value is treated as an unsigned amount modulo the size of
-the arguments. Both arguments and the result have the bitwidth specified
-by the name of the builtin.
-
 ``__builtin_unreachable``
 -
 

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=340136&r1=340135&r2=340136&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Sun Aug 19 06:48:06 2018
@@ -428,15 +428,6 @@ BUILTIN(__builtin_bitreverse16, "UsUs",
 BUILTIN(__builtin_bitreverse32, "UiUi", "nc")
 BUILTIN(__builtin_bitreverse64, "ULLiULLi", "nc")
 
-BUILTIN(__builtin_rotateleft8, "UcUcUc", "nc")
-BUILTIN(__builtin_rotateleft16, "UsUsUs", "nc")
-BUILTIN(__builtin_rotateleft32, "UiUiUi", "nc")
-BUILTIN(__builtin_rotateleft64, "ULLiULLiULLi", "nc")
-BUILTIN(__builtin_rotateright8, "UcUcUc", "nc")
-BUILTIN(__builtin_rotateright16, "UsUsUs", "nc")
-BUILTIN(__builtin_rotateright32, "UiUiUi", "nc")
-BUILTIN(__builtin_rotateright64, "ULLiULLiULLi", "nc")
-
 // Random GCC builtins
 BUILTIN(__builtin_constant_p, "i.", "nctu")
 BUILTIN(__builtin_classify_type, "i.", "nctu")

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=340136&r1=340135&r2=340136&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Sun Aug 19 06:48:06 2018
@@ -1647,6 +1647,46 @@ RValue CodeGenFunction::EmitBuiltinExpr(
  "cast");
 return RValue::get(Result);
   }
+  case Builtin::BI_rotr8:
+  case Builtin::BI_rotr16:
+  case Builtin::BI_rotr:
+  case Builtin::BI_lrotr:
+  case Builtin::BI_rotr64: {
+Value *Val = EmitScalarExpr(E->getArg(0));
+Value *Shift = EmitScalarExpr(E->getArg(1));
+
+llvm::Type *Arg

r340135 - [CodeGen] add rotate builtins

2018-08-19 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Aug 19 06:12:40 2018
New Revision: 340135

URL: http://llvm.org/viewvc/llvm-project?rev=340135&view=rev
Log:
[CodeGen] add rotate builtins

This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang (when both halves of a funnel shift are the same value, it's a rotate).

We're free to name these as we want because we're not copying gcc, but if there's some other existing art (eg, the microsoft ops that are modified in this patch) that we want to replicate, we can change the names.

The funnel shift intrinsics were added here:
D49242

With improved codegen in:
rL337966
rL339359

And basic IR optimization added in:
rL338218
rL340022

...so these are expected to produce asm output that's equal to or better than the multi-instruction alternatives using primitive C/IR ops.

In the motivating loop example from PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387#c7
...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source.

Differential Revision: https://reviews.llvm.org/D50924

Added:
cfe/trunk/test/CodeGen/builtin-rotate.c
Modified:
cfe/trunk/docs/LanguageExtensions.rst
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/ms-intrinsics-rotations.c

Modified: cfe/trunk/docs/LanguageExtensions.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/LanguageExtensions.rst?rev=340135&r1=340134&r2=340135&view=diff
==
--- cfe/trunk/docs/LanguageExtensions.rst (original)
+++ cfe/trunk/docs/LanguageExtensions.rst Sun Aug 19 06:12:40 2018
@@ -1739,6 +1739,70 @@ The '``__builtin_bitreverse``' family of
 the bitpattern of an integer value; for example ``0b10110110`` becomes
 ``0b01101101``.
 
+``__builtin_rotateleft``
+
+
+* ``__builtin_rotateleft8``
+* ``__builtin_rotateleft16``
+* ``__builtin_rotateleft32``
+* ``__builtin_rotateleft64``
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_rotateleft32(x, y)
+
+**Examples**:
+
+.. code-block:: c++
+
+  uint8_t rot_x = __builtin_rotateleft8(x, y);
+  uint16_t rot_x = __builtin_rotateleft16(x, y);
+  uint32_t rot_x = __builtin_rotateleft32(x, y);
+  uint64_t rot_x = __builtin_rotateleft64(x, y);
+
+**Description**:
+
+The '``__builtin_rotateleft``' family of builtins is used to rotate
+the bits in the first argument by the amount in the second argument. 
+For example, ``0b10000110`` rotated left by 11 becomes ``0b00110100``.
+The shift value is treated as an unsigned amount modulo the size of
+the arguments. Both arguments and the result have the bitwidth specified
+by the name of the builtin.
+
+``__builtin_rotateright``
+-------------------------
+
+* ``__builtin_rotateright8``
+* ``__builtin_rotateright16``
+* ``__builtin_rotateright32``
+* ``__builtin_rotateright64``
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_rotateright32(x, y)
+
+**Examples**:
+
+.. code-block:: c++
+
+  uint8_t rot_x = __builtin_rotateright8(x, y);
+  uint16_t rot_x = __builtin_rotateright16(x, y);
+  uint32_t rot_x = __builtin_rotateright32(x, y);
+  uint64_t rot_x = __builtin_rotateright64(x, y);
+
+**Description**:
+
+The '``__builtin_rotateright``' family of builtins is used to rotate
+the bits in the first argument by the amount in the second argument. 
+For example, ``0b10000110`` rotated right by 3 becomes ``0b11010000``.
+The shift value is treated as an unsigned amount modulo the size of
+the arguments. Both arguments and the result have the bitwidth specified
+by the name of the builtin.
+
 ``__builtin_unreachable``
 -------------------------
 

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=340135&r1=340134&r2=340135&view=diff
==============================================================================
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Sun Aug 19 06:12:40 2018
@@ -428,6 +428,15 @@ BUILTIN(__builtin_bitreverse16, "UsUs",
 BUILTIN(__builtin_bitreverse32, "UiUi", "nc")
 BUILTIN(__builtin_bitreverse64, "ULLiULLi", "nc")
 
+BUILTIN(__builtin_rotateleft8, "UcUcUc", "nc")
+BUILTIN(__builtin_rotateleft16, "UsUsUs", "nc")
+BUILTIN(__builtin_rotateleft32, "UiUiUi", "nc")
+BUILTIN(__builtin_rotateleft64, "ULLiULLiULLi", "nc")
+BUILTIN(__builtin_rotateright8, "UcUcUc", "nc")
+BUILTIN(__builtin_rotateright16, "UsUsUs", "nc")
+BUILTIN(__builtin_rotateright32, "UiUiUi", "nc")
+BUILTIN(__builtin_rotateright64, "ULLiULLiULLi", "nc")
+
 // Random GCC builtins
 BUILTIN(__builtin_constant_p, "i.", "nctu")
 BUILTIN(__builtin_classify_type, "i.", "nctu")

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=340135&r1=340134&r2=340135&view=diff
==============================================================================

r334628 - [CodeGen] make nan builtins pure rather than const (PR37778)

2018-06-13 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Wed Jun 13 10:54:52 2018
New Revision: 334628

URL: http://llvm.org/viewvc/llvm-project?rev=334628&view=rev
Log:
[CodeGen] make nan builtins pure rather than const (PR37778)

https://bugs.llvm.org/show_bug.cgi?id=37778
...shows a miscompile resulting from marking nan builtins as 'const'.

The nan libcalls/builtins take a pointer argument:
http://www.cplusplus.com/reference/cmath/nan-function/
...and the chars dereferenced by that arg are used to fill in the NaN constant 
payload bits.

"const" means that the pointer argument isn't dereferenced. That's translated 
to "readnone" in LLVM.
"pure" means that the pointer argument may be dereferenced. That's translated 
to "readonly" in LLVM.

This change prevents the IR optimizer from killing the lead-up to the nan call 
here:

double a() {
  char buf[4];
  buf[0] = buf[1] = buf[2] = '9';
  buf[3] = '\0';
  return __builtin_nan(buf);
}

...the optimizer isn't currently able to simplify this to a constant as we
might hope, but this patch should solve the miscompile.

Differential Revision: https://reviews.llvm.org/D48134

Modified:
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/test/CodeGen/math-builtins.c

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=334628&r1=334627&r2=334628&view=diff
==============================================================================
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Wed Jun 13 10:54:52 2018
@@ -137,14 +137,14 @@ BUILTIN(__builtin_ldexpl, "LdLdi", "Fne"
 BUILTIN(__builtin_modf , "ddd*"  , "Fn")
 BUILTIN(__builtin_modff, "fff*"  , "Fn")
 BUILTIN(__builtin_modfl, "LdLdLd*", "Fn")
-BUILTIN(__builtin_nan,  "dcC*" , "ncF")
-BUILTIN(__builtin_nanf, "fcC*" , "ncF")
-BUILTIN(__builtin_nanl, "LdcC*", "ncF")
-BUILTIN(__builtin_nanf128, "LLdcC*", "ncF")
-BUILTIN(__builtin_nans,  "dcC*" , "ncF")
-BUILTIN(__builtin_nansf, "fcC*" , "ncF")
-BUILTIN(__builtin_nansl, "LdcC*", "ncF")
-BUILTIN(__builtin_nansf128, "LLdcC*", "ncF")
+BUILTIN(__builtin_nan,  "dcC*" , "FnU")
+BUILTIN(__builtin_nanf, "fcC*" , "FnU")
+BUILTIN(__builtin_nanl, "LdcC*", "FnU")
+BUILTIN(__builtin_nanf128, "LLdcC*", "FnU")
+BUILTIN(__builtin_nans,  "dcC*" , "FnU")
+BUILTIN(__builtin_nansf, "fcC*" , "FnU")
+BUILTIN(__builtin_nansl, "LdcC*", "FnU")
+BUILTIN(__builtin_nansf128, "LLdcC*", "FnU")
 BUILTIN(__builtin_powi , "ddi"  , "Fnc")
 BUILTIN(__builtin_powif, "ffi"  , "Fnc")
 BUILTIN(__builtin_powil, "LdLdi", "Fnc")

Modified: cfe/trunk/test/CodeGen/math-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/math-builtins.c?rev=334628&r1=334627&r2=334628&view=diff
==============================================================================
--- cfe/trunk/test/CodeGen/math-builtins.c (original)
+++ cfe/trunk/test/CodeGen/math-builtins.c Wed Jun 13 10:54:52 2018
@@ -90,25 +90,25 @@ void foo(double *d, float f, float *fp,
 
   __builtin_nan(c);__builtin_nanf(c);   __builtin_nanl(c); __builtin_nanf128(c);
 
-// NO__ERRNO: declare double @nan(i8*) [[READNONE]]
-// NO__ERRNO: declare float @nanf(i8*) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @nanl(i8*) [[READNONE]]
-// NO__ERRNO: declare fp128 @nanf128(i8*) [[READNONE]]
-// HAS_ERRNO: declare double @nan(i8*) [[READNONE:#[0-9]+]]
-// HAS_ERRNO: declare float @nanf(i8*) [[READNONE]]
-// HAS_ERRNO: declare x86_fp80 @nanl(i8*) [[READNONE]]
-// HAS_ERRNO: declare fp128 @nanf128(i8*) [[READNONE]]
+// NO__ERRNO: declare double @nan(i8*) [[PURE:#[0-9]+]]
+// NO__ERRNO: declare float @nanf(i8*) [[PURE]]
+// NO__ERRNO: declare x86_fp80 @nanl(i8*) [[PURE]]
+// NO__ERRNO: declare fp128 @nanf128(i8*) [[PURE]]
+// HAS_ERRNO: declare double @nan(i8*) [[PURE:#[0-9]+]]
+// HAS_ERRNO: declare float @nanf(i8*) [[PURE]]
+// HAS_ERRNO: declare x86_fp80 @nanl(i8*) [[PURE]]
+// HAS_ERRNO: declare fp128 @nanf128(i8*) [[PURE]]
 
   __builtin_nans(c);__builtin_nansf(c);   __builtin_nansl(c); __builtin_nansf128(c);
 
-// NO__ERRNO: declare double @nans(i8*) [[READNONE]]
-// NO__ERRNO: declare float @nansf(i8*) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @nansl(i8*) [[READNONE]]
-// NO__ERRNO: declare fp128 @nansf128(i8*) [[READNONE]]
-// HAS_ERRNO: declare double @nans(i8*) [[READNONE]]
-// HAS_ERRNO: declare float @nansf(i8*) [[READNONE]]
-// HAS_ERRNO: declare x86_fp80 @nansl(i8*) [[READNONE]]
-// HAS_ERRNO: declare fp128 @nansf128(i8*) [[READNONE]]
+// NO__ERRNO: declare double @nans(i8*) [[PURE]]
+// NO__ERRNO: declare float @nansf(i8*) [[PURE]]
+// NO__ERRNO: declare x86_fp80 @nansl(i8*) [[PURE]]
+// NO__ERRNO: declare fp128 @nansf128(i8*) [[PURE]]
+// HAS_ERRNO: declare double @nans(i8*) [[PURE]]
+// HAS_ERRNO: declare float @nansf(i8*) [[PURE]]
+// HAS_ERRNO: declare x86_fp80 @nansl(i8*) [[PURE]]
+// HAS_ERRNO: declare fp128 @nansf128(i8*) [[PURE]]
 
   __builtin_pow(f,f);__builtin_powf(f,f);  

r333038 - [CodeGen] use nsw negation for builtin abs

2018-05-22 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Tue May 22 16:02:13 2018
New Revision: 333038

URL: http://llvm.org/viewvc/llvm-project?rev=333038&view=rev
Log:
[CodeGen] use nsw negation for builtin abs

The clang builtins have the same semantics as the stdlib functions.
The stdlib functions are defined in section 7.20.6.1 of the C standard with:
"If the result cannot be represented, the behavior is undefined."

That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would
be UB/poison.

Differential Revision: https://reviews.llvm.org/D47202

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/builtin-abs.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=333038&r1=333037&r2=333038&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Tue May 22 16:02:13 2018
@@ -1252,8 +1252,9 @@ RValue CodeGenFunction::EmitBuiltinExpr(
   case Builtin::BI__builtin_labs:
   case Builtin::BI__builtin_llabs: {
 // X < 0 ? -X : X
+// The negation has 'nsw' because abs of INT_MIN is undefined.
 Value *ArgValue = EmitScalarExpr(E->getArg(0));
-Value *NegOp = Builder.CreateNeg(ArgValue, "neg");
+Value *NegOp = Builder.CreateNSWNeg(ArgValue, "neg");
 Constant *Zero = llvm::Constant::getNullValue(ArgValue->getType());
 Value *CmpResult = Builder.CreateICmpSLT(ArgValue, Zero, "abscond");
 Value *Result = Builder.CreateSelect(CmpResult, NegOp, ArgValue, "abs");

Modified: cfe/trunk/test/CodeGen/builtin-abs.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/builtin-abs.c?rev=333038&r1=333037&r2=333038&view=diff
==============================================================================
--- cfe/trunk/test/CodeGen/builtin-abs.c (original)
+++ cfe/trunk/test/CodeGen/builtin-abs.c Tue May 22 16:02:13 2018
@@ -2,7 +2,7 @@
 
 int absi(int x) {
 // CHECK-LABEL: @absi(
-// CHECK:   [[NEG:%.*]] = sub i32 0, [[X:%.*]]
+// CHECK:   [[NEG:%.*]] = sub nsw i32 0, [[X:%.*]]
 // CHECK:   [[CMP:%.*]] = icmp slt i32 [[X]], 0
 // CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i32 [[NEG]], i32 [[X]]
 //
@@ -11,7 +11,7 @@ int absi(int x) {
 
 long absl(long x) {
 // CHECK-LABEL: @absl(
-// CHECK:   [[NEG:%.*]] = sub i64 0, [[X:%.*]]
+// CHECK:   [[NEG:%.*]] = sub nsw i64 0, [[X:%.*]]
 // CHECK:   [[CMP:%.*]] = icmp slt i64 [[X]], 0
 // CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i64 [[NEG]], i64 [[X]]
 //
@@ -20,7 +20,7 @@ long absl(long x) {
 
 long long absll(long long x) {
 // CHECK-LABEL: @absll(
-// CHECK:   [[NEG:%.*]] = sub i64 0, [[X:%.*]]
+// CHECK:   [[NEG:%.*]] = sub nsw i64 0, [[X:%.*]]
 // CHECK:   [[CMP:%.*]] = icmp slt i64 [[X]], 0
 // CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i64 [[NEG]], i64 [[X]]
 //


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r332989 - [CodeGen] produce the LLVM canonical form of abs

2018-05-22 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Tue May 22 08:36:50 2018
New Revision: 332989

URL: http://llvm.org/viewvc/llvm-project?rev=332989&view=rev
Log:
[CodeGen] produce the LLVM canonical form of abs

We chose the 'slt' form as canonical in IR with:
rL332819
...so we should generate that form directly for efficiency.

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/builtin-abs.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=332989&r1=332988&r2=332989&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Tue May 22 08:36:50 2018
@@ -1251,16 +1251,12 @@ RValue CodeGenFunction::EmitBuiltinExpr(
   case Builtin::BI__builtin_abs:
   case Builtin::BI__builtin_labs:
   case Builtin::BI__builtin_llabs: {
+// X < 0 ? -X : X
 Value *ArgValue = EmitScalarExpr(E->getArg(0));
-
 Value *NegOp = Builder.CreateNeg(ArgValue, "neg");
-Value *CmpResult =
-Builder.CreateICmpSGE(ArgValue,
-  llvm::Constant::getNullValue(ArgValue->getType()),
-"abscond");
-Value *Result =
-  Builder.CreateSelect(CmpResult, ArgValue, NegOp, "abs");
-
+Constant *Zero = llvm::Constant::getNullValue(ArgValue->getType());
+Value *CmpResult = Builder.CreateICmpSLT(ArgValue, Zero, "abscond");
+Value *Result = Builder.CreateSelect(CmpResult, NegOp, ArgValue, "abs");
 return RValue::get(Result);
   }
   case Builtin::BI__builtin_conj:

Modified: cfe/trunk/test/CodeGen/builtin-abs.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/builtin-abs.c?rev=332989&r1=332988&r2=332989&view=diff
==============================================================================
--- cfe/trunk/test/CodeGen/builtin-abs.c (original)
+++ cfe/trunk/test/CodeGen/builtin-abs.c Tue May 22 08:36:50 2018
@@ -3,8 +3,8 @@
 int absi(int x) {
 // CHECK-LABEL: @absi(
 // CHECK:   [[NEG:%.*]] = sub i32 0, [[X:%.*]]
-// CHECK:   [[CMP:%.*]] = icmp sge i32 [[X]], 0
-// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i32 [[X]], i32 [[NEG]]
+// CHECK:   [[CMP:%.*]] = icmp slt i32 [[X]], 0
+// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i32 [[NEG]], i32 [[X]]
 //
   return __builtin_abs(x);
 }
@@ -12,8 +12,8 @@ int absi(int x) {
 long absl(long x) {
 // CHECK-LABEL: @absl(
 // CHECK:   [[NEG:%.*]] = sub i64 0, [[X:%.*]]
-// CHECK:   [[CMP:%.*]] = icmp sge i64 [[X]], 0
-// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i64 [[X]], i64 [[NEG]]
+// CHECK:   [[CMP:%.*]] = icmp slt i64 [[X]], 0
+// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i64 [[NEG]], i64 [[X]]
 //
   return __builtin_labs(x);
 }
@@ -21,8 +21,8 @@ long absl(long x) {
 long long absll(long long x) {
 // CHECK-LABEL: @absll(
 // CHECK:   [[NEG:%.*]] = sub i64 0, [[X:%.*]]
-// CHECK:   [[CMP:%.*]] = icmp sge i64 [[X]], 0
-// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i64 [[X]], i64 [[NEG]]
+// CHECK:   [[CMP:%.*]] = icmp slt i64 [[X]], 0
+// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i64 [[NEG]], i64 [[X]]
 //
   return __builtin_llabs(x);
 }




r332988 - [CodeGen] add tests for abs builtins; NFC

2018-05-22 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Tue May 22 08:11:59 2018
New Revision: 332988

URL: http://llvm.org/viewvc/llvm-project?rev=332988&view=rev
Log:
[CodeGen] add tests for abs builtins; NFC

Added:
cfe/trunk/test/CodeGen/builtin-abs.c

Added: cfe/trunk/test/CodeGen/builtin-abs.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/builtin-abs.c?rev=332988&view=auto
==============================================================================
--- cfe/trunk/test/CodeGen/builtin-abs.c (added)
+++ cfe/trunk/test/CodeGen/builtin-abs.c Tue May 22 08:11:59 2018
@@ -0,0 +1,29 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+
+int absi(int x) {
+// CHECK-LABEL: @absi(
+// CHECK:   [[NEG:%.*]] = sub i32 0, [[X:%.*]]
+// CHECK:   [[CMP:%.*]] = icmp sge i32 [[X]], 0
+// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i32 [[X]], i32 [[NEG]]
+//
+  return __builtin_abs(x);
+}
+
+long absl(long x) {
+// CHECK-LABEL: @absl(
+// CHECK:   [[NEG:%.*]] = sub i64 0, [[X:%.*]]
+// CHECK:   [[CMP:%.*]] = icmp sge i64 [[X]], 0
+// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i64 [[X]], i64 [[NEG]]
+//
+  return __builtin_labs(x);
+}
+
+long long absll(long long x) {
+// CHECK-LABEL: @absll(
+// CHECK:   [[NEG:%.*]] = sub i64 0, [[X:%.*]]
+// CHECK:   [[CMP:%.*]] = icmp sge i64 [[X]], 0
+// CHECK:   [[SEL:%.*]] = select i1 [[CMP]], i64 [[X]], i64 [[NEG]]
+//
+  return __builtin_llabs(x);
+}
+




r332473 - [OpenCL] make test independent of optimizer

2018-05-16 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Wed May 16 07:38:07 2018
New Revision: 332473

URL: http://llvm.org/viewvc/llvm-project?rev=332473&view=rev
Log:
[OpenCL] make test independent of optimizer

There shouldn't be any tests that run the entire optimizer here,
but the last test in this file is definitely going to break with 
a change in LLVM IR canonicalization. Change that part to check
the unoptimized IR because that's the real intent of this file.

Modified:
cfe/trunk/test/CodeGenOpenCL/shifts.cl

Modified: cfe/trunk/test/CodeGenOpenCL/shifts.cl
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenOpenCL/shifts.cl?rev=332473&r1=332472&r2=332473&view=diff
==============================================================================
--- cfe/trunk/test/CodeGenOpenCL/shifts.cl (original)
+++ cfe/trunk/test/CodeGenOpenCL/shifts.cl Wed May 16 07:38:07 2018
@@ -58,16 +58,17 @@ int4 vectorVectorTest(int4 a,int4 b) {
   return f;
 }
 
-//OPT: @vectorScalarTest
+//NOOPT-LABEL: @vectorScalarTest
 int4 vectorScalarTest(int4 a,int b) {
-  //OPT: [[SP0:%.+]] = insertelement <4 x i32> undef, i32 %b, i32 0
-  //OPT: [[SP1:%.+]] = shufflevector <4 x i32> [[SP0]], <4 x i32> undef, <4 x i32> zeroinitializer
-  //OPT: [[VSM:%.+]] = and <4 x i32> [[SP1]], 
-  //OPT-NEXT: [[VSC:%.+]] = shl <4 x i32> %a, [[VSM]]
+  //NOOPT: [[SP0:%.+]] = insertelement <4 x i32> undef
+  //NOOPT: [[SP1:%.+]] = shufflevector <4 x i32> [[SP0]], <4 x i32> undef, <4 x i32> zeroinitializer
+  //NOOPT: [[VSM:%.+]] = and <4 x i32> [[SP1]], 
+  //NOOPT: [[VSC:%.+]] = shl <4 x i32> [[VSS:%.+]], [[VSM]]
   int4 c = a << b;
-  //OPT-NEXT: [[VSF:%.+]] = add <4 x i32> [[VSC]], 
+  //NOOPT: [[VSF:%.+]] = shl <4 x i32> [[VSC1:%.+]], 
+  //NOOPT: [[VSA:%.+]] = add <4 x i32> [[VSC2:%.+]], [[VSF]]
   int4 d = {1, 1, 1, 1};
   int4 f = c + (d << 34);
-  //OPT-NEXT: ret <4 x i32> [[VSF]]
+  //NOOPT: ret <4 x i32>
   return f;
 }




r331209 - [Driver, CodeGen] rename options to disable an FP cast optimization

2018-04-30 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Mon Apr 30 11:19:03 2018
New Revision: 331209

URL: http://llvm.org/viewvc/llvm-project?rev=331209&view=rev
Log:
[Driver, CodeGen] rename options to disable an FP cast optimization

As suggested in the post-commit thread for rL331056, we should match these 
clang options with the established vocabulary of the corresponding sanitizer
option. Also, the use of 'strict' is well-known for these kinds of knobs, 
and we can improve the descriptive text in the docs.

So this intends to match the logic of D46135 but only change the words.
Matching LLVM commit to match this spelling of the attribute to follow shortly.

Differential Revision: https://reviews.llvm.org/D46236

Modified:
cfe/trunk/docs/ReleaseNotes.rst
cfe/trunk/docs/UsersManual.rst
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/include/clang/Frontend/CodeGenOptions.def
cfe/trunk/lib/CodeGen/CGCall.cpp
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/test/CodeGen/no-junk-ftrunc.c
cfe/trunk/test/Driver/fast-math.c

Modified: cfe/trunk/docs/ReleaseNotes.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=331209&r1=331208&r2=331209&view=diff
==============================================================================
--- cfe/trunk/docs/ReleaseNotes.rst (original)
+++ cfe/trunk/docs/ReleaseNotes.rst Mon Apr 30 11:19:03 2018
@@ -89,14 +89,13 @@ Non-comprehensive list of changes in thi
 New Compiler Flags
 --
 
-- :option:`-ffp-cast-overflow-workaround` and
-  :option:`-fno-fp-cast-overflow-workaround`
-  enable (disable) a workaround for code that casts floating-point values to
-  integers and back to floating-point. If the floating-point value is not
-  representable in the intermediate integer type, the code is incorrect
-  according to the language standard. This flag will attempt to generate code
+  as if the result of an overflowing conversion matches the overflowing behavior
-  of a target's native float-to-int conversion instructions.
+- :option:`-fstrict-float-cast-overflow` and
+  :option:`-fno-strict-float-cast-overflow` -
+   When a floating-point value is not representable in a destination integer
+   type, the code has undefined behavior according to the language standard.
+   By default, Clang will not guarantee any particular result in that case.
+   With the 'no-strict' option, Clang attempts to match the overflowing behavior
+   of the target's native float-to-int conversion instructions.
 
 - ...
 

Modified: cfe/trunk/docs/UsersManual.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/UsersManual.rst?rev=331209&r1=331208&r2=331209&view=diff
==============================================================================
--- cfe/trunk/docs/UsersManual.rst (original)
+++ cfe/trunk/docs/UsersManual.rst Mon Apr 30 11:19:03 2018
@@ -1255,15 +1255,13 @@ are listed below.
flushed-to-zero number is preserved in the sign of 0, denormals are
flushed to positive zero, respectively.
 
-.. option:: -f[no-]fp-cast-overflow-workaround
+.. option:: -f[no-]strict-float-cast-overflow
 
-   Enable a workaround for code that casts floating-point values to 
-   integers and back to floating-point. If the floating-point value 
-   is not representable in the intermediate integer type, the code is
-   incorrect according to the language standard. This flag will attempt 
-   to generate code as if the result of an overflowing conversion matches
-   the overflowing behavior of a target's native float-to-int conversion
-   instructions.
+   When a floating-point value is not representable in a destination integer 
+   type, the code has undefined behavior according to the language standard.
+   By default, Clang will not guarantee any particular result in that case.
+   With the 'no-strict' option, Clang attempts to match the overflowing behavior
+   of the target's native float-to-int conversion instructions.
 
 .. option:: -fwhole-program-vtables
 

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=331209&r1=331208&r2=331209&view=diff
==============================================================================
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Mon Apr 30 11:19:03 2018
@@ -1029,10 +1029,12 @@ def ffp_contract : Joined<["-"], "ffp-co
   Flags<[CC1Option]>, HelpText<"Form fused FP ops (e.g. FMAs): fast (everywhere)"
   " | on (according to FP_CONTRACT pragma, default) | off (never fuse)">, Values<"fast,on,off">;
 
-def ffp_cast_overflow_workaround : Flag<["-"],
-  "ffp-cast-overflow-workaround">, Group, Flags<[CC1Option]>;
-def fno_fp_cast_overflow_workaround : Flag<["-"],
-  "fno-fp-cast-overflow-workaround">, Group, Flags<[CC1Option]>;
+def fstrict_float_cast_overflow : Flag<["-"],
+  "fstrict-float-cast-overflow">, Group, Flags<

Re: r331056 - [docs] add -ffp-cast-overflow-workaround to the release notes

2018-04-29 Thread Sanjay Patel via cfe-commits
Patches to improve the language posted for review:
https://reviews.llvm.org/D46236
https://reviews.llvm.org/D46237

On Fri, Apr 27, 2018 at 7:41 PM, Chandler Carruth via cfe-commits <
cfe-commits@lists.llvm.org> wrote:

> On Fri, Apr 27, 2018 at 5:13 PM Richard Smith 
> wrote:
>
>> On 27 April 2018 at 17:09, Chandler Carruth via cfe-commits <
>> cfe-commits@lists.llvm.org> wrote:
>>
>>> On Fri, Apr 27, 2018 at 4:36 PM Richard Smith via cfe-commits <
>>> cfe-commits@lists.llvm.org> wrote:
>>>
>>>> On 27 April 2018 at 16:07, Sanjay Patel via cfe-commits <
>>>> cfe-commits@lists.llvm.org> wrote:
>>>>
>>>>> Missing dash corrected at r331057. I can improve the doc wording, but
>>>>> let's settle on the flag name first, and I'll try to get it all fixed up 
>>>>> in
>>>>> one shot.
>>>>>
>>>>> So far we have these candidates:
>>>>> 1. -ffp-cast-overflow-workaround
>>>>> 2. -fstrict-fp-trunc-semantics
>>>>> 3. -fstrict-fp-cast-overflow
>>>>>
>>>>> I don't have a strong opinion here, but on 2nd reading, it does seem
>>>>> like a 'strict' flag fits better with existing options.
>>>>>
>>>>
>>>> The corresponding UBSan check is called -fsanitize=float-cast-overflow,
>>>> so maybe -fno-strict-float-cast-overflow would be the most consistent
>>>> name?
>>>>
>>>
>>> On this topic: we were hit by this on a *lot* of code. All of that code
>>> builds and passes tests with -fsanitize=float-cast-overflow. So I think
>>> we've been mistaken in assuming that this sanitizer catches all of the
>>> failure modes of the optimization. That at least impacts the sanitizer
>>> suggestion in the release notes. And probably impacts the flag name /
>>> attribuet name.
>>>
>>
>> That's interesting, and definitely sounds like a bug (either the
>> sanitizer or LLVM is presumably getting the range check wrong). Can you
>> point me at an example?
>>
>
> It appears that the code that hit this has cleverly dodged *EVERY* build
> configuration we have deployed this sanitizer in... despite our best
> efforts. Sorry for the noise.
>
>
>>
>>
>>> On Fri, Apr 27, 2018 at 4:41 PM, Richard Smith 
>>>>> wrote:
>>>>>
>>>>>> On 27 April 2018 at 09:21, Sanjay Patel via cfe-commits <
>>>>>> cfe-commits@lists.llvm.org> wrote:
>>>>>>
>>>>>>> Author: spatel
>>>>>>> Date: Fri Apr 27 09:21:22 2018
>>>>>>> New Revision: 331056
>>>>>>>
>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=331056&view=rev
>>>>>>> Log:
>>>>>>> [docs] add -ffp-cast-overflow-workaround to the release notes
>>>>>>>
>>>>>>> This option was added with:
>>>>>>> D46135
>>>>>>> rL331041
>>>>>>> ...copying the text from UsersManual.rst for more exposure.
>>>>>>>
>>>>>>>
>>>>>>> Modified:
>>>>>>> cfe/trunk/docs/ReleaseNotes.rst
>>>>>>>
>>>>>>> Modified: cfe/trunk/docs/ReleaseNotes.rst
>>>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/
>>>>>>> ReleaseNotes.rst?rev=331056&r1=331055&r2=331056&view=diff
>>>>>>> 
>>>>>>> ==
>>>>>>> --- cfe/trunk/docs/ReleaseNotes.rst (original)
>>>>>>> +++ cfe/trunk/docs/ReleaseNotes.rst Fri Apr 27 09:21:22 2018
>>>>>>> @@ -83,6 +83,15 @@ Non-comprehensive list of changes in thi
>>>>>>>  New Compiler Flags
>>>>>>>  --
>>>>>>>
>>>>>>> +- :option:`-ffp-cast-overflow-workaround` and
>>>>>>> +  :option:`-fnofp-cast-overflow-workaround`
>>>>>>>
>>>>>>
>>>>>> Shouldn't this be -fno-fp-cast-overflow-workaround?
>>>>>>
>>>>>> Also, our convention for flags that define undefined behavior is
>>>>>> `-fno-strict-*`, so perhaps this should be `-fno-strict-fp-cast-overflow`
>>&g

Re: r331056 - [docs] add -ffp-cast-overflow-workaround to the release notes

2018-04-27 Thread Sanjay Patel via cfe-commits
Missing dash corrected at r331057. I can improve the doc wording, but let's
settle on the flag name first, and I'll try to get it all fixed up in one
shot.

So far we have these candidates:
1. -ffp-cast-overflow-workaround
2. -fstrict-fp-trunc-semantics
3. -fstrict-fp-cast-overflow

I don't have a strong opinion here, but on 2nd reading, it does seem like a
'strict' flag fits better with existing options.


On Fri, Apr 27, 2018 at 4:41 PM, Richard Smith 
wrote:

> On 27 April 2018 at 09:21, Sanjay Patel via cfe-commits <
> cfe-commits@lists.llvm.org> wrote:
>
>> Author: spatel
>> Date: Fri Apr 27 09:21:22 2018
>> New Revision: 331056
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=331056&view=rev
>> Log:
>> [docs] add -ffp-cast-overflow-workaround to the release notes
>>
>> This option was added with:
>> D46135
>> rL331041
>> ...copying the text from UsersManual.rst for more exposure.
>>
>>
>> Modified:
>> cfe/trunk/docs/ReleaseNotes.rst
>>
>> Modified: cfe/trunk/docs/ReleaseNotes.rst
>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNo
>> tes.rst?rev=331056&r1=331055&r2=331056&view=diff
>> 
>> ==
>> --- cfe/trunk/docs/ReleaseNotes.rst (original)
>> +++ cfe/trunk/docs/ReleaseNotes.rst Fri Apr 27 09:21:22 2018
>> @@ -83,6 +83,15 @@ Non-comprehensive list of changes in thi
>>  New Compiler Flags
>>  --
>>
>> +- :option:`-ffp-cast-overflow-workaround` and
>> +  :option:`-fnofp-cast-overflow-workaround`
>>
>
> Shouldn't this be -fno-fp-cast-overflow-workaround?
>
> Also, our convention for flags that define undefined behavior is
> `-fno-strict-*`, so perhaps this should be `-fno-strict-fp-cast-overflow`?
>
>
>> +  enable (disable) a workaround for code that casts floating-point
>> values to
>> +  integers and back to floating-point. If the floating-point value is not
>> +  representable in the intermediate integer type, the code is incorrect
>> +  according to the language standard.
>
>
> I find this hard to read: I initially misread "the code is incorrect
> according to the language standard" as meaning "Clang will generate code
> that is incorrect according to the language standard". I think what you
> mean here is "the code has undefined behavior according to the language
> standard, and Clang will not guarantee any particular result. This flag
> causes the behavior to be defined to match the overflowing behavior of the
> target's native float-to-int conversion instructions."
>
>
>> This flag will attempt to generate code
>> +  as if the result of an overflowing conversion matches the overflowing
>> behavior
>> +  of a target's native float-to-int conversion instructions.
>> +
>>  - ...
>>
>>  Deprecated Compiler Flags
>>
>>
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>
>
>


r331057 - [docs] more dashes

2018-04-27 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Apr 27 09:24:39 2018
New Revision: 331057

URL: http://llvm.org/viewvc/llvm-project?rev=331057&view=rev
Log:
[docs] more dashes

Modified:
cfe/trunk/docs/ReleaseNotes.rst

Modified: cfe/trunk/docs/ReleaseNotes.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=331057&r1=331056&r2=331057&view=diff
==============================================================================
--- cfe/trunk/docs/ReleaseNotes.rst (original)
+++ cfe/trunk/docs/ReleaseNotes.rst Fri Apr 27 09:24:39 2018
@@ -84,7 +84,7 @@ New Compiler Flags
 --
 
 - :option:`-ffp-cast-overflow-workaround` and
-  :option:`-fnofp-cast-overflow-workaround`
+  :option:`-fno-fp-cast-overflow-workaround`
   enable (disable) a workaround for code that casts floating-point values to
   integers and back to floating-point. If the floating-point value is not
   representable in the intermediate integer type, the code is incorrect




r331056 - [docs] add -ffp-cast-overflow-workaround to the release notes

2018-04-27 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Apr 27 09:21:22 2018
New Revision: 331056

URL: http://llvm.org/viewvc/llvm-project?rev=331056&view=rev
Log:
[docs] add -ffp-cast-overflow-workaround to the release notes

This option was added with:
D46135
rL331041
...copying the text from UsersManual.rst for more exposure.


Modified:
cfe/trunk/docs/ReleaseNotes.rst

Modified: cfe/trunk/docs/ReleaseNotes.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=331056&r1=331055&r2=331056&view=diff
==============================================================================
--- cfe/trunk/docs/ReleaseNotes.rst (original)
+++ cfe/trunk/docs/ReleaseNotes.rst Fri Apr 27 09:21:22 2018
@@ -83,6 +83,15 @@ Non-comprehensive list of changes in thi
 New Compiler Flags
 --
 
+- :option:`-ffp-cast-overflow-workaround` and
+  :option:`-fnofp-cast-overflow-workaround`
+  enable (disable) a workaround for code that casts floating-point values to
+  integers and back to floating-point. If the floating-point value is not
+  representable in the intermediate integer type, the code is incorrect
+  according to the language standard. This flag will attempt to generate code
+  as if the result of an overflowing conversion matches the overflowing 
behavior
+  of a target's native float-to-int conversion instructions.
+
 - ...
 
 Deprecated Compiler Flags




r331041 - [Driver, CodeGen] add options to enable/disable an FP cast optimization

2018-04-27 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Apr 27 07:22:48 2018
New Revision: 331041

URL: http://llvm.org/viewvc/llvm-project?rev=331041&view=rev
Log:
[Driver, CodeGen] add options to enable/disable an FP cast optimization

As discussed in the post-commit thread for:
rL330437 ( 
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html )

We need a way to opt out of a float-to-int-to-float cast optimization because
too much existing code relies on the platform-specific undefined result of
those casts when the float-to-int overflows.

The LLVM changes associated with adding this function attribute are here:
rL330947
rL330950
rL330951

Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that
catches this problem:
rL330958

Differential Revision: https://reviews.llvm.org/D46135

Added:
cfe/trunk/test/CodeGen/no-junk-ftrunc.c
Modified:
cfe/trunk/docs/UsersManual.rst
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/include/clang/Frontend/CodeGenOptions.def
cfe/trunk/lib/CodeGen/CGCall.cpp
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/test/Driver/fast-math.c

Modified: cfe/trunk/docs/UsersManual.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/UsersManual.rst?rev=331041&r1=331040&r2=331041&view=diff
==
--- cfe/trunk/docs/UsersManual.rst (original)
+++ cfe/trunk/docs/UsersManual.rst Fri Apr 27 07:22:48 2018
@@ -1255,6 +1255,16 @@ are listed below.
flushed-to-zero number is preserved in the sign of 0, denormals are
flushed to positive zero, respectively.
 
+.. option:: -f[no-]fp-cast-overflow-workaround
+
+   Enable a workaround for code that casts floating-point values to 
+   integers and back to floating-point. If the floating-point value 
+   is not representable in the intermediate integer type, the code is
+   incorrect according to the language standard. This flag will attempt 
+   to generate code as if the result of an overflowing conversion matches
+   the overflowing behavior of a target's native float-to-int conversion
+   instructions.
+
 .. option:: -fwhole-program-vtables
 
Enable whole-program vtable optimizations, such as single-implementation

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=331041&r1=331040&r2=331041&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Fri Apr 27 07:22:48 2018
@@ -1029,6 +1029,11 @@ def ffp_contract : Joined<["-"], "ffp-co
  Flags<[CC1Option]>, HelpText<"Form fused FP ops (e.g. FMAs): fast (everywhere)"
  " | on (according to FP_CONTRACT pragma, default) | off (never fuse)">, Values<"fast,on,off">;
 
+def ffp_cast_overflow_workaround : Flag<["-"],
+  "ffp-cast-overflow-workaround">, Group<f_Group>, Flags<[CC1Option]>;
+def fno_fp_cast_overflow_workaround : Flag<["-"],
+  "fno-fp-cast-overflow-workaround">, Group<f_Group>, Flags<[CC1Option]>;
+
 def ffor_scope : Flag<["-"], "ffor-scope">, Group<f_Group>;
 def fno_for_scope : Flag<["-"], "fno-for-scope">, Group<f_Group>;
 

Modified: cfe/trunk/include/clang/Frontend/CodeGenOptions.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Frontend/CodeGenOptions.def?rev=331041&r1=331040&r2=331041&view=diff
==
--- cfe/trunk/include/clang/Frontend/CodeGenOptions.def (original)
+++ cfe/trunk/include/clang/Frontend/CodeGenOptions.def Fri Apr 27 07:22:48 2018
@@ -136,6 +136,12 @@ CODEGENOPT(NoTrappingMath, 1, 0) ///
 CODEGENOPT(NoNaNsFPMath  , 1, 0) ///< Assume FP arguments, results not NaN.
 CODEGENOPT(FlushDenorm   , 1, 0) ///< Allow FP denorm numbers to be flushed to zero
 CODEGENOPT(CorrectlyRoundedDivSqrt, 1, 0) ///< -cl-fp32-correctly-rounded-divide-sqrt
+
+/// Disable a float-to-int-to-float cast optimization. This attempts to generate
+/// code as if the result of an overflowing conversion matches the overflowing
+/// behavior of a target's native float-to-int conversion instructions.
+CODEGENOPT(FPCastOverflowWorkaround, 1, 0)
+
 CODEGENOPT(UniformWGSize , 1, 0) ///< -cl-uniform-work-group-size
 CODEGENOPT(NoZeroInitializedInBSS , 1, 0) ///< -fno-zero-initialized-in-bss.
 /// \brief Method of Objective-C dispatch to use.

Modified: cfe/trunk/lib/CodeGen/CGCall.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCall.cpp?rev=331041&r1=331040&r2=331041&view=diff
==
--- cfe/trunk/lib/CodeGen/CGCall.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGCall.cpp Fri Apr 27 07:22:48 2018
@@ -1727,6 +1727,9 @@ void CodeGenModule::ConstructDefaultFnAt
 FuncAttrs.addAttribute("no-trapping-math",
   

r322950 - [CodeGenCXX] annotate a GEP to a derived class with 'inbounds' (PR35909)

2018-01-19 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Jan 19 07:14:51 2018
New Revision: 322950

URL: http://llvm.org/viewvc/llvm-project?rev=322950&view=rev
Log:
[CodeGenCXX] annotate a GEP to a derived class with 'inbounds' (PR35909)

The standard says:
[expr.static.cast] p11: "If the prvalue of type “pointer to cv1 B” points to a B
that is actually a subobject of an object of type D, the resulting pointer points
to the enclosing object of type D. Otherwise, the behavior is undefined."

Therefore, the GEP must be inbounds.

This should solve the failure to optimize away a null check shown in PR35909:
https://bugs.llvm.org/show_bug.cgi?id=35909 

Differential Revision: https://reviews.llvm.org/D42249

Added:
cfe/trunk/test/CodeGenCXX/derived-cast.cpp
Modified:
cfe/trunk/lib/CodeGen/CGClass.cpp
cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp

Modified: cfe/trunk/lib/CodeGen/CGClass.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGClass.cpp?rev=322950&r1=322949&r2=322950&view=diff
==
--- cfe/trunk/lib/CodeGen/CGClass.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGClass.cpp Fri Jan 19 07:14:51 2018
@@ -406,8 +406,8 @@ CodeGenFunction::GetAddressOfDerivedClas
 
   // Apply the offset.
   llvm::Value *Value = Builder.CreateBitCast(BaseAddr.getPointer(), Int8PtrTy);
-  Value = Builder.CreateGEP(Value, Builder.CreateNeg(NonVirtualOffset),
-"sub.ptr");
+  Value = Builder.CreateInBoundsGEP(Value, Builder.CreateNeg(NonVirtualOffset),
+"sub.ptr");
 
   // Just cast.
   Value = Builder.CreateBitCast(Value, DerivedPtrTy);

Modified: cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp?rev=322950&r1=322949&r2=322950&view=diff
==
--- cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp (original)
+++ cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp Fri Jan 19 07:14:51 2018
@@ -371,7 +371,7 @@ class C : public A, public B // align=16
 void downcast_pointer(B *b) {
   (void) static_cast<C*>(b);
   // Alignment check from EmitTypeCheck(TCK_DowncastPointer, ...)
-  // CHECK: [[SUB:%[.a-z0-9]*]] = getelementptr i8, i8* {{.*}}, i64 -16
+  // CHECK: [[SUB:%[.a-z0-9]*]] = getelementptr inbounds i8, i8* {{.*}}, i64 -16
   // CHECK-NEXT: [[C:%.+]] = bitcast i8* [[SUB]] to %class.C*
   // null check goes here
   // CHECK: [[FROM_PHI:%.+]] = phi %class.C* [ [[C]], {{.*}} ], {{.*}}
@@ -388,7 +388,7 @@ void downcast_pointer(B *b) {
 void downcast_reference(B &b) {
   (void) static_cast<C&>(b);
   // Alignment check from EmitTypeCheck(TCK_DowncastReference, ...)
-  // CHECK:  [[SUB:%[.a-z0-9]*]] = getelementptr i8, i8* {{.*}}, i64 -16
+  // CHECK:  [[SUB:%[.a-z0-9]*]] = getelementptr inbounds i8, i8* {{.*}}, i64 -16
   // CHECK-NEXT: [[C:%.+]] = bitcast i8* [[SUB]] to %class.C*
   // Objectsize check goes here
   // CHECK:  [[C_INT:%.+]] = ptrtoint %class.C* [[C]] to i64

Added: cfe/trunk/test/CodeGenCXX/derived-cast.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCXX/derived-cast.cpp?rev=322950&view=auto
==
--- cfe/trunk/test/CodeGenCXX/derived-cast.cpp (added)
+++ cfe/trunk/test/CodeGenCXX/derived-cast.cpp Fri Jan 19 07:14:51 2018
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+
+class A {
+int a;
+};
+
+class B {
+int b;
+public:
+A *getAsA();
+};
+
+class X : public A, public B {
+int x;
+};
+
+// PR35909 - https://bugs.llvm.org/show_bug.cgi?id=35909
+
+A *B::getAsA() {
+  return static_cast<X*>(this);
+
+  // CHECK-LABEL: define %class.A* @_ZN1B6getAsAEv
+  // CHECK: %[[THIS:.*]] = load %class.B*, %class.B**
+  // CHECK-NEXT: %[[BC:.*]] = bitcast %class.B* %[[THIS]] to i8*
+  // CHECK-NEXT: getelementptr inbounds i8, i8* %[[BC]], i64 -4
+}
+


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r320920 - [Driver, CodeGen] pass through and apply -fassociative-math

2017-12-16 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sat Dec 16 08:11:17 2017
New Revision: 320920

URL: http://llvm.org/viewvc/llvm-project?rev=320920&view=rev
Log:
[Driver, CodeGen] pass through and apply -fassociative-math

There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:

1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the
   interactions with other flags like -ffast-math -fno-signed-zeros -fno-trapping-math.
   This was mostly already done - we just need to translate the flag as a codegen option.
   The test file is complicated because there are many potential combinations of flags here.
   Note that we are matching gcc's behavior that requires 'nsz' and no-trapping-math.

2. In codegen, we map the codegen option to FMF in the IR builder. This is simple code and
   a corresponding test.

For the motivating example from PR27372:

float foo(float a, float x) { return ((a + x) - x); }

$ ./clang -O2 27372.c -S -o - -ffast-math  -fno-associative-math -emit-llvm  | egrep 'fadd|fsub'
  %add = fadd nnan ninf nsz arcp contract float %0, %1
  %sub = fsub nnan ninf nsz arcp contract float %add, %2

So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch).
This case now works as expected end-to-end although the underlying logic is still wrong:

$ ./clang  -O2 27372.c -S -o - -ffast-math  -fno-associative-math | grep xmm
addss   %xmm1, %xmm0
subss   %xmm1, %xmm0

We're not done because the case where 'reassoc' is set is ignored by optimizer 
passes. Example:

$ ./clang  -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math -emit-llvm  | grep fadd
  %add = fadd reassoc float %0, %1

$ ./clang -O2  27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math | grep xmm
addss   %xmm1, %xmm0
subss   %xmm1, %xmm0

Differential Revision: https://reviews.llvm.org/D39812

Modified:
cfe/trunk/include/clang/Driver/CC1Options.td
cfe/trunk/include/clang/Frontend/CodeGenOptions.def
cfe/trunk/lib/CodeGen/CodeGenFunction.cpp
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/test/CodeGen/finite-math.c
cfe/trunk/test/Driver/fast-math.c

Modified: cfe/trunk/include/clang/Driver/CC1Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/CC1Options.td?rev=320920&r1=320919&r2=320920&view=diff
==
--- cfe/trunk/include/clang/Driver/CC1Options.td (original)
+++ cfe/trunk/include/clang/Driver/CC1Options.td Sat Dec 16 08:11:17 2017
@@ -263,6 +263,8 @@ def menable_no_nans : Flag<["-"], "menab
 def menable_unsafe_fp_math : Flag<["-"], "menable-unsafe-fp-math">,
   HelpText<"Allow unsafe floating-point math optimizations which may decrease "
"precision">;
+def mreassociate : Flag<["-"], "mreassociate">,
+  HelpText<"Allow reassociation transformations for floating-point instructions">;
 def mfloat_abi : Separate<["-"], "mfloat-abi">,
   HelpText<"The float ABI to use">;
 def mtp : Separate<["-"], "mtp">,

Modified: cfe/trunk/include/clang/Frontend/CodeGenOptions.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Frontend/CodeGenOptions.def?rev=320920&r1=320919&r2=320920&view=diff
==
--- cfe/trunk/include/clang/Frontend/CodeGenOptions.def (original)
+++ cfe/trunk/include/clang/Frontend/CodeGenOptions.def Sat Dec 16 08:11:17 2017
@@ -117,6 +117,7 @@ CODEGENOPT(EnableSegmentedStacks , 1, 0)
 CODEGENOPT(NoImplicitFloat   , 1, 0) ///< Set when -mno-implicit-float is enabled.
 CODEGENOPT(NoInfsFPMath  , 1, 0) ///< Assume FP arguments, results not +-Inf.
 CODEGENOPT(NoSignedZeros , 1, 0) ///< Allow ignoring the signedness of FP zero
+CODEGENOPT(Reassociate   , 1, 0) ///< Allow reassociation of FP math ops
 CODEGENOPT(ReciprocalMath, 1, 0) ///< Allow FP divisions to be reassociated.
 CODEGENOPT(NoTrappingMath, 1, 0) ///< Set when -fno-trapping-math is enabled.
 CODEGENOPT(NoNaNsFPMath  , 1, 0) ///< Assume FP arguments, results not NaN.

Modified: cfe/trunk/lib/CodeGen/CodeGenFunction.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenFunction.cpp?rev=320920&r1=320919&r2=320920&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenFunction.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenFunction.cpp Sat Dec 16 08:11:17 2017
@@ -103,6 +103,9 @@ CodeGenFunction::CodeGenFunction(CodeGen
   if (CGM.getCodeGenOpts().ReciprocalMath) {
 FMF.setAllowReciprocal();
   }
+  if (CGM.getCodeGenOpts().Reassociate) {
+FMF.setAllowReassoc();
+  }
   Builder.setFastMathFlags(FMF);
 }
 

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolC

r319619 - [CodeGen] fix mapping from fmod calls to frem instruction

2017-12-02 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sat Dec  2 09:52:00 2017
New Revision: 319619

URL: http://llvm.org/viewvc/llvm-project?rev=319619&view=rev
Log:
[CodeGen] fix mapping from fmod calls to frem instruction

Similar to D40044 and discussed in D40594.

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/math-builtins.c
cfe/trunk/test/CodeGen/math-libcalls.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=319619&r1=319618&r2=319619&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Sat Dec  2 09:52:00 2017
@@ -854,12 +854,11 @@ RValue CodeGenFunction::EmitBuiltinExpr(
Result.Val.getFloat()));
   }
 
-  // Math builtins have the same semantics as their math library twins.
-  // There are LLVM math intrinsics corresponding to math library functions
-  // except the intrinsic will never set errno while the math library might.
-  // Thus, we can transform math library and builtin calls to their
-  // semantically-equivalent LLVM intrinsic counterparts if the call is marked
-  // 'const' (it is known to never set errno).
+  // There are LLVM math intrinsics/instructions corresponding to math library
+  // functions except the LLVM op will never set errno while the math library
+  // might. Also, math builtins have the same semantics as their math library
+  // twins. Thus, we can transform math library and builtin calls to their
+  // LLVM counterparts if the call is marked 'const' (known to never set errno).
   if (FD->hasAttr<ConstAttr>()) {
 switch (BuiltinID) {
 case Builtin::BIceil:
@@ -942,6 +941,19 @@ RValue CodeGenFunction::EmitBuiltinExpr(
 case Builtin::BI__builtin_fminl:
   return RValue::get(emitBinaryBuiltin(*this, E, Intrinsic::minnum));
 
+// fmod() is a special-case. It maps to the frem instruction rather than an
+// LLVM intrinsic.
+case Builtin::BIfmod:
+case Builtin::BIfmodf:
+case Builtin::BIfmodl:
+case Builtin::BI__builtin_fmod:
+case Builtin::BI__builtin_fmodf:
+case Builtin::BI__builtin_fmodl: {
+  Value *Arg1 = EmitScalarExpr(E->getArg(0));
+  Value *Arg2 = EmitScalarExpr(E->getArg(1));
+  return RValue::get(Builder.CreateFRem(Arg1, Arg2, "fmod"));
+}
+
 case Builtin::BIlog:
 case Builtin::BIlogf:
 case Builtin::BIlogl:
@@ -1067,14 +1079,6 @@ RValue CodeGenFunction::EmitBuiltinExpr(
 
 return RValue::get(Result);
   }
-  case Builtin::BI__builtin_fmod:
-  case Builtin::BI__builtin_fmodf:
-  case Builtin::BI__builtin_fmodl: {
-Value *Arg1 = EmitScalarExpr(E->getArg(0));
-Value *Arg2 = EmitScalarExpr(E->getArg(1));
-Value *Result = Builder.CreateFRem(Arg1, Arg2, "fmod");
-return RValue::get(Result);
-  }
   case Builtin::BI__builtin_conj:
   case Builtin::BI__builtin_conjf:
   case Builtin::BI__builtin_conjl: {

Modified: cfe/trunk/test/CodeGen/math-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/math-builtins.c?rev=319619&r1=319618&r2=319619&view=diff
==
--- cfe/trunk/test/CodeGen/math-builtins.c (original)
+++ cfe/trunk/test/CodeGen/math-builtins.c Sat Dec  2 09:52:00 2017
@@ -6,12 +6,21 @@
 // Test attributes and codegen of math builtins.
 
 void foo(double *d, float f, float *fp, long double *l, int *i, const char *c) {
+  f = __builtin_fmod(f,f);f = __builtin_fmodf(f,f);   f =  __builtin_fmodl(f,f);
+
+// NO__ERRNO: frem double
+// NO__ERRNO: frem float
+// NO__ERRNO: frem x86_fp80
+// HAS_ERRNO: declare double @fmod(double, double) [[NOT_READNONE:#[0-9]+]]
+// HAS_ERRNO: declare float @fmodf(float, float) [[NOT_READNONE]]
+// HAS_ERRNO: declare x86_fp80 @fmodl(x86_fp80, x86_fp80) [[NOT_READNONE]]
+
   __builtin_atan2(f,f);__builtin_atan2f(f,f) ;  __builtin_atan2l(f, f);
 
 // NO__ERRNO: declare double @atan2(double, double) [[READNONE:#[0-9]+]]
 // NO__ERRNO: declare float @atan2f(float, float) [[READNONE]]
 // NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80, x86_fp80) [[READNONE]]
-// HAS_ERRNO: declare double @atan2(double, double) [[NOT_READNONE:#[0-9]+]]
+// HAS_ERRNO: declare double @atan2(double, double) [[NOT_READNONE]]
 // HAS_ERRNO: declare float @atan2f(float, float) [[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80, x86_fp80) [[NOT_READNONE]]
 
@@ -33,13 +42,6 @@ void foo(double *d, float f, float *fp,
 // HAS_ERRNO: declare float @llvm.fabs.f32(float) [[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare x86_fp80 @llvm.fabs.f80(x86_fp80) [[READNONE_INTRINSIC]]
 
-  __builtin_fmod(f,f); __builtin_fmodf(f,f);__builtin_fmodl(f,f);
-
-// NO__ERRNO-NOT: .fmod
-// NO__ERRNO-NOT: @fmod
-// HAS_ERRNO-NOT: .fmod
-// HAS_ERRNO-NOT: @fmod
-
   __builtin_frexp(f,i);__builtin_frexpf(f,i);   __builtin_frexp

r319618 - [CodeGen] remove stale comment; NFC

2017-12-02 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sat Dec  2 08:29:34 2017
New Revision: 319618

URL: http://llvm.org/viewvc/llvm-project?rev=319618&view=rev
Log:
[CodeGen] remove stale comment; NFC

The libm functions with LLVM intrinsic twins were moved above this blob with:
https://reviews.llvm.org/rL319593


Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=319618&r1=319617&r2=319618&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Sat Dec  2 08:29:34 2017
@@ -1028,7 +1028,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(
   }
 
   switch (BuiltinID) {
-  default: break;  // Handle intrinsics and libm functions below.
+  default: break;
   case Builtin::BI__builtin___CFStringMakeConstantString:
   case Builtin::BI__builtin___NSStringMakeConstantString:
 return RValue::get(ConstantEmitter(*this).emitAbstract(E, E->getType()));


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r319593 - [CodeGen] convert math libcalls/builtins to equivalent LLVM intrinsics

2017-12-01 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Dec  1 15:15:52 2017
New Revision: 319593

URL: http://llvm.org/viewvc/llvm-project?rev=319593&view=rev
Log:
[CodeGen] convert math libcalls/builtins to equivalent LLVM intrinsics

There are 20 LLVM math intrinsics that correspond to mathlib calls according to the LangRef:
http://llvm.org/docs/LangRef.html#standard-c-library-intrinsics

We were only converting 3 mathlib calls (sqrt, fma, pow) and 12 builtin calls (ceil,
copysign, fabs, floor, fma, fmax, fmin, nearbyint, pow, rint, round, trunc) to their
intrinsic-equivalents.

This patch pulls the transforms together and handles all 20 cases. The switch is guarded
by a check for const-ness to make sure we're not doing the transform if errno could
possibly be set by the libcall or builtin.

Differential Revision: https://reviews.llvm.org/D40044

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/builtin-sqrt.c
cfe/trunk/test/CodeGen/builtins.c
cfe/trunk/test/CodeGen/libcalls.c
cfe/trunk/test/CodeGen/math-builtins.c
cfe/trunk/test/CodeGen/math-libcalls.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=319593&r1=319592&r2=319593&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Fri Dec  1 15:15:52 2017
@@ -854,6 +854,179 @@ RValue CodeGenFunction::EmitBuiltinExpr(
Result.Val.getFloat()));
   }
 
+  // Math builtins have the same semantics as their math library twins.
+  // There are LLVM math intrinsics corresponding to math library functions
+  // except the intrinsic will never set errno while the math library might.
+  // Thus, we can transform math library and builtin calls to their
+  // semantically-equivalent LLVM intrinsic counterparts if the call is marked
+  // 'const' (it is known to never set errno).
+  if (FD->hasAttr<ConstAttr>()) {
+switch (BuiltinID) {
+case Builtin::BIceil:
+case Builtin::BIceilf:
+case Builtin::BIceill:
+case Builtin::BI__builtin_ceil:
+case Builtin::BI__builtin_ceilf:
+case Builtin::BI__builtin_ceill:
+  return RValue::get(emitUnaryBuiltin(*this, E, Intrinsic::ceil));
+
+case Builtin::BIcopysign:
+case Builtin::BIcopysignf:
+case Builtin::BIcopysignl:
+case Builtin::BI__builtin_copysign:
+case Builtin::BI__builtin_copysignf:
+case Builtin::BI__builtin_copysignl:
+  return RValue::get(emitBinaryBuiltin(*this, E, Intrinsic::copysign));
+
+case Builtin::BIcos:
+case Builtin::BIcosf:
+case Builtin::BIcosl:
+case Builtin::BI__builtin_cos:
+case Builtin::BI__builtin_cosf:
+case Builtin::BI__builtin_cosl:
+  return RValue::get(emitUnaryBuiltin(*this, E, Intrinsic::cos));
+
+case Builtin::BIexp:
+case Builtin::BIexpf:
+case Builtin::BIexpl:
+case Builtin::BI__builtin_exp:
+case Builtin::BI__builtin_expf:
+case Builtin::BI__builtin_expl:
+  return RValue::get(emitUnaryBuiltin(*this, E, Intrinsic::exp));
+
+case Builtin::BIexp2:
+case Builtin::BIexp2f:
+case Builtin::BIexp2l:
+case Builtin::BI__builtin_exp2:
+case Builtin::BI__builtin_exp2f:
+case Builtin::BI__builtin_exp2l:
+  return RValue::get(emitUnaryBuiltin(*this, E, Intrinsic::exp2));
+
+case Builtin::BIfabs:
+case Builtin::BIfabsf:
+case Builtin::BIfabsl:
+case Builtin::BI__builtin_fabs:
+case Builtin::BI__builtin_fabsf:
+case Builtin::BI__builtin_fabsl:
+  return RValue::get(emitUnaryBuiltin(*this, E, Intrinsic::fabs));
+
+case Builtin::BIfloor:
+case Builtin::BIfloorf:
+case Builtin::BIfloorl:
+case Builtin::BI__builtin_floor:
+case Builtin::BI__builtin_floorf:
+case Builtin::BI__builtin_floorl:
+  return RValue::get(emitUnaryBuiltin(*this, E, Intrinsic::floor));
+
+case Builtin::BIfma:
+case Builtin::BIfmaf:
+case Builtin::BIfmal:
+case Builtin::BI__builtin_fma:
+case Builtin::BI__builtin_fmaf:
+case Builtin::BI__builtin_fmal:
+  return RValue::get(emitTernaryBuiltin(*this, E, Intrinsic::fma));
+
+case Builtin::BIfmax:
+case Builtin::BIfmaxf:
+case Builtin::BIfmaxl:
+case Builtin::BI__builtin_fmax:
+case Builtin::BI__builtin_fmaxf:
+case Builtin::BI__builtin_fmaxl:
+  return RValue::get(emitBinaryBuiltin(*this, E, Intrinsic::maxnum));
+
+case Builtin::BIfmin:
+case Builtin::BIfminf:
+case Builtin::BIfminl:
+case Builtin::BI__builtin_fmin:
+case Builtin::BI__builtin_fminf:
+case Builtin::BI__builtin_fminl:
+  return RValue::get(emitBinaryBuiltin(*this, E, Intrinsic::minnum));
+
+case Builtin::BIlog:
+case Builtin::BIlogf:
+case Builtin::BIlogl:
+case Builtin::BI__builtin_log:
+case Builtin::BI__builtin_logf:
+case Builtin::BI__builtin_logl:
+  return R

r318598 - [CodeGen] change const-ness of complex calls

2017-11-18 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sat Nov 18 11:31:57 2017
New Revision: 318598

URL: http://llvm.org/viewvc/llvm-project?rev=318598&view=rev
Log:
[CodeGen] change const-ness of complex calls

After clarification about the C standard, POSIX, and implementations:
The C standard allows errno-setting, and it's (unfortunately for optimization) even
more clearly stated in the newer additions to the standards.

We can leave these functions as always constant ('c') because they don't 
actually do any math and therefore won't set errno:
cimag ( http://en.cppreference.com/w/c/numeric/complex/cimag )
creal ( http://en.cppreference.com/w/c/numeric/complex/creal )
cproj ( http://en.cppreference.com/w/c/numeric/complex/cproj )
conj (http://en.cppreference.com/w/c/numeric/complex/conj ) 

Differential Revision: https://reviews.llvm.org/D39611

Modified:
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/test/CodeGen/complex-builtins.c
cfe/trunk/test/CodeGen/complex-libcalls.c
cfe/trunk/test/CodeGen/libcall-declarations.c

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=318598&r1=318597&r2=318598&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Sat Nov 18 11:31:57 2017
@@ -293,72 +293,72 @@ BUILTIN(__builtin_truncf, "ff", "Fnc")
 BUILTIN(__builtin_truncl, "LdLd", "Fnc")
 
 // C99 complex builtins
-BUILTIN(__builtin_cabs, "dXd", "Fnc")
-BUILTIN(__builtin_cabsf, "fXf", "Fnc")
-BUILTIN(__builtin_cabsl, "LdXLd", "Fnc")
-BUILTIN(__builtin_cacos, "XdXd", "Fnc")
-BUILTIN(__builtin_cacosf, "XfXf", "Fnc")
-BUILTIN(__builtin_cacosh, "XdXd", "Fnc")
-BUILTIN(__builtin_cacoshf, "XfXf", "Fnc")
-BUILTIN(__builtin_cacoshl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_cacosl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_carg, "dXd", "Fnc")
-BUILTIN(__builtin_cargf, "fXf", "Fnc")
-BUILTIN(__builtin_cargl, "LdXLd", "Fnc")
-BUILTIN(__builtin_casin, "XdXd", "Fnc")
-BUILTIN(__builtin_casinf, "XfXf", "Fnc")
-BUILTIN(__builtin_casinh, "XdXd", "Fnc")
-BUILTIN(__builtin_casinhf, "XfXf", "Fnc")
-BUILTIN(__builtin_casinhl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_casinl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_catan, "XdXd", "Fnc")
-BUILTIN(__builtin_catanf, "XfXf", "Fnc")
-BUILTIN(__builtin_catanh, "XdXd", "Fnc")
-BUILTIN(__builtin_catanhf, "XfXf", "Fnc")
-BUILTIN(__builtin_catanhl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_catanl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_ccos, "XdXd", "Fnc")
-BUILTIN(__builtin_ccosf, "XfXf", "Fnc")
-BUILTIN(__builtin_ccosl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_ccosh, "XdXd", "Fnc")
-BUILTIN(__builtin_ccoshf, "XfXf", "Fnc")
-BUILTIN(__builtin_ccoshl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_cexp, "XdXd", "Fnc")
-BUILTIN(__builtin_cexpf, "XfXf", "Fnc")
-BUILTIN(__builtin_cexpl, "XLdXLd", "Fnc")
+BUILTIN(__builtin_cabs, "dXd", "Fne")
+BUILTIN(__builtin_cabsf, "fXf", "Fne")
+BUILTIN(__builtin_cabsl, "LdXLd", "Fne")
+BUILTIN(__builtin_cacos, "XdXd", "Fne")
+BUILTIN(__builtin_cacosf, "XfXf", "Fne")
+BUILTIN(__builtin_cacosh, "XdXd", "Fne")
+BUILTIN(__builtin_cacoshf, "XfXf", "Fne")
+BUILTIN(__builtin_cacoshl, "XLdXLd", "Fne")
+BUILTIN(__builtin_cacosl, "XLdXLd", "Fne")
+BUILTIN(__builtin_carg, "dXd", "Fne")
+BUILTIN(__builtin_cargf, "fXf", "Fne")
+BUILTIN(__builtin_cargl, "LdXLd", "Fne")
+BUILTIN(__builtin_casin, "XdXd", "Fne")
+BUILTIN(__builtin_casinf, "XfXf", "Fne")
+BUILTIN(__builtin_casinh, "XdXd", "Fne")
+BUILTIN(__builtin_casinhf, "XfXf", "Fne")
+BUILTIN(__builtin_casinhl, "XLdXLd", "Fne")
+BUILTIN(__builtin_casinl, "XLdXLd", "Fne")
+BUILTIN(__builtin_catan, "XdXd", "Fne")
+BUILTIN(__builtin_catanf, "XfXf", "Fne")
+BUILTIN(__builtin_catanh, "XdXd", "Fne")
+BUILTIN(__builtin_catanhf, "XfXf", "Fne")
+BUILTIN(__builtin_catanhl, "XLdXLd", "Fne")
+BUILTIN(__builtin_catanl, "XLdXLd", "Fne")
+BUILTIN(__builtin_ccos, "XdXd", "Fne")
+BUILTIN(__builtin_ccosf, "XfXf", "Fne")
+BUILTIN(__builtin_ccosl, "XLdXLd", "Fne")
+BUILTIN(__builtin_ccosh, "XdXd", "Fne")
+BUILTIN(__builtin_ccoshf, "XfXf", "Fne")
+BUILTIN(__builtin_ccoshl, "XLdXLd", "Fne")
+BUILTIN(__builtin_cexp, "XdXd", "Fne")
+BUILTIN(__builtin_cexpf, "XfXf", "Fne")
+BUILTIN(__builtin_cexpl, "XLdXLd", "Fne")
 BUILTIN(__builtin_cimag, "dXd", "Fnc")
 BUILTIN(__builtin_cimagf, "fXf", "Fnc")
 BUILTIN(__builtin_cimagl, "LdXLd", "Fnc")
 BUILTIN(__builtin_conj, "XdXd", "Fnc")
 BUILTIN(__builtin_conjf, "XfXf", "Fnc")
 BUILTIN(__builtin_conjl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_clog, "XdXd", "Fnc")
-BUILTIN(__builtin_clogf, "XfXf", "Fnc")
-BUILTIN(__builtin_clogl, "XLdXLd", "Fnc")
+BUILTIN(__builtin_clog, "XdXd", "Fne")
+BUILTIN(__builtin_clogf, "XfXf", "Fne")
+BUILTIN(__builtin_clogl, "XLdXLd", "Fne")
 BUILTIN(__builtin_cproj, "XdXd", "Fnc")
 BUILTIN(__builtin_cprojf, "XfXf", "Fnc")
 BUILTIN(__builtin_cprojl, "XLdXLd", "Fnc")
-BUILTIN(__builtin_cpow, "XdXdXd", "Fnc")
-BUILTIN(

r318093 - [CodeGen] fix const-ness of cbrt and fma

2017-11-13 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Mon Nov 13 14:11:49 2017
New Revision: 318093

URL: http://llvm.org/viewvc/llvm-project?rev=318093&view=rev
Log:
[CodeGen] fix const-ness of cbrt and fma

cbrt() is always constant because it can't overflow or underflow. Therefore, it can't set errno.

fma() is not always constant because it can overflow or underflow. Therefore, it can set errno.
But we know that it never sets errno on GNU / MSVC, so make it constant in those environments.

Differential Revision: https://reviews.llvm.org/D39641

Modified:
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/lib/Sema/SemaDecl.cpp
cfe/trunk/test/CodeGen/libcalls.c
cfe/trunk/test/CodeGen/math-builtins.c
cfe/trunk/test/CodeGen/math-libcalls.c

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=318093&r1=318092&r2=318093&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Mon Nov 13 14:11:49 2017
@@ -165,9 +165,9 @@ BUILTIN(__builtin_atanl, "LdLd", "Fne")
 BUILTIN(__builtin_atanh , "dd", "Fne")
 BUILTIN(__builtin_atanhf, "ff", "Fne")
 BUILTIN(__builtin_atanhl, "LdLd", "Fne")
-BUILTIN(__builtin_cbrt , "dd", "Fne")
-BUILTIN(__builtin_cbrtf, "ff", "Fne")
-BUILTIN(__builtin_cbrtl, "LdLd", "Fne")
+BUILTIN(__builtin_cbrt , "dd", "Fnc")
+BUILTIN(__builtin_cbrtf, "ff", "Fnc")
+BUILTIN(__builtin_cbrtl, "LdLd", "Fnc")
 BUILTIN(__builtin_ceil , "dd"  , "Fnc")
 BUILTIN(__builtin_ceilf, "ff"  , "Fnc")
 BUILTIN(__builtin_ceill, "LdLd", "Fnc")
@@ -1040,9 +1040,9 @@ LIBBUILTIN(atanh, "dd", "fne", "math.h",
 LIBBUILTIN(atanhf, "ff", "fne", "math.h", ALL_LANGUAGES)
 LIBBUILTIN(atanhl, "LdLd", "fne", "math.h", ALL_LANGUAGES)
 
-LIBBUILTIN(cbrt, "dd", "fne", "math.h", ALL_LANGUAGES)
-LIBBUILTIN(cbrtf, "ff", "fne", "math.h", ALL_LANGUAGES)
-LIBBUILTIN(cbrtl, "LdLd", "fne", "math.h", ALL_LANGUAGES)
+LIBBUILTIN(cbrt, "dd", "fnc", "math.h", ALL_LANGUAGES)
+LIBBUILTIN(cbrtf, "ff", "fnc", "math.h", ALL_LANGUAGES)
+LIBBUILTIN(cbrtl, "LdLd", "fnc", "math.h", ALL_LANGUAGES)
 
 LIBBUILTIN(ceil, "dd", "fnc", "math.h", ALL_LANGUAGES)
 LIBBUILTIN(ceilf, "ff", "fnc", "math.h", ALL_LANGUAGES)

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=318093&r1=318092&r2=318093&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Mon Nov 13 14:11:49 2017
@@ -2109,15 +2109,11 @@ RValue CodeGenFunction::EmitBuiltinExpr(
   case Builtin::BIfmal:
   case Builtin::BI__builtin_fma:
   case Builtin::BI__builtin_fmaf:
-  case Builtin::BI__builtin_fmal: {
-// Rewrite fma to intrinsic.
-Value *FirstArg = EmitScalarExpr(E->getArg(0));
-llvm::Type *ArgType = FirstArg->getType();
-Value *F = CGM.getIntrinsic(Intrinsic::fma, ArgType);
-return RValue::get(
-Builder.CreateCall(F, {FirstArg, EmitScalarExpr(E->getArg(1)),
-   EmitScalarExpr(E->getArg(2))}));
-  }
+  case Builtin::BI__builtin_fmal:
+// A constant libcall or builtin is equivalent to the LLVM intrinsic.
+if (FD->hasAttr<ConstAttr>())
+  return RValue::get(emitTernaryBuiltin(*this, E, Intrinsic::fma));
+break;
 
   case Builtin::BI__builtin_signbit:
   case Builtin::BI__builtin_signbitf:

Modified: cfe/trunk/lib/Sema/SemaDecl.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaDecl.cpp?rev=318093&r1=318092&r2=318093&view=diff
==
--- cfe/trunk/lib/Sema/SemaDecl.cpp (original)
+++ cfe/trunk/lib/Sema/SemaDecl.cpp Mon Nov 13 14:11:49 2017
@@ -12838,15 +12838,33 @@ void Sema::AddKnownFunctionAttributes(Fu
   FD->getLocation()));
 }
 
-// Mark const if we don't care about errno and that is the only
-// thing preventing the function from being const. This allows
-// IRgen to use LLVM intrinsics for such functions.
-if (!getLangOpts().MathErrno &&
-Context.BuiltinInfo.isConstWithoutErrno(BuiltinID)) {
-  if (!FD->hasAttr<ConstAttr>())
+// Mark const if we don't care about errno and that is the only thing
+// preventing the function from being const. This allows IRgen to use LLVM
+// intrinsics for such functions.
+if (!getLangOpts().MathErrno && !FD->hasAttr<ConstAttr>() &&
+Context.BuiltinInfo.isConstWithoutErrno(BuiltinID))
+  FD->addAttr(ConstAttr::CreateImplicit(Context, FD->getLocation()));
+
+// We make "fma" on GNU or Windows const because we know it does not set
+// errno in those environments even though it could set errno based on the
+// C standard.
+const llvm::Triple &Trip = Context.getTargetInf

r317489 - [CodeGen] match new fast-math-flag method: isFast()

2017-11-06 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Mon Nov  6 08:27:36 2017
New Revision: 317489

URL: http://llvm.org/viewvc/llvm-project?rev=317489&view=rev
Log:
[CodeGen] match new fast-math-flag method: isFast()

This corresponds to LLVM commit r317488.

If that commit is reverted, this commit will also need to be reverted.

Modified:
cfe/trunk/lib/CodeGen/CodeGenFunction.cpp

Modified: cfe/trunk/lib/CodeGen/CodeGenFunction.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenFunction.cpp?rev=317489&r1=317488&r2=317489&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenFunction.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenFunction.cpp Mon Nov  6 08:27:36 2017
@@ -87,7 +87,7 @@ CodeGenFunction::CodeGenFunction(CodeGen
 
   llvm::FastMathFlags FMF;
   if (CGM.getLangOpts().FastMath)
-FMF.setUnsafeAlgebra();
+FMF.setFast();
   if (CGM.getLangOpts().FiniteMathOnly) {
 FMF.setNoNaNs();
 FMF.setNoInfs();


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r317407 - [CodeGen] add remquo to list of recognized library calls

2017-11-04 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sat Nov  4 08:03:11 2017
New Revision: 317407

URL: http://llvm.org/viewvc/llvm-project?rev=317407&view=rev
Log:
[CodeGen] add remquo to list of recognized library calls

This is just an oversight because we already do recognize __builtin_remquo()
with the same signature.

http://en.cppreference.com/w/c/numeric/math/remquo
http://pubs.opengroup.org/onlinepubs/9699919799/functions/remquo.html

Differential Revision: https://reviews.llvm.org/D39615

Modified:
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/test/CodeGen/libcalls-errno.c

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=317407&r1=317406&r2=317407&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Sat Nov  4 08:03:11 2017
@@ -1162,6 +1162,10 @@ LIBBUILTIN(remainder, "ddd", "fne", "mat
 LIBBUILTIN(remainderf, "fff", "fne", "math.h", ALL_LANGUAGES)
 LIBBUILTIN(remainderl, "LdLdLd", "fne", "math.h", ALL_LANGUAGES)
 
+LIBBUILTIN(remquo, "dddi*", "fn", "math.h", ALL_LANGUAGES)
+LIBBUILTIN(remquof, "fffi*", "fn", "math.h", ALL_LANGUAGES)
+LIBBUILTIN(remquol, "LdLdLdi*", "fn", "math.h", ALL_LANGUAGES)
+
 LIBBUILTIN(rint, "dd", "fnc", "math.h", ALL_LANGUAGES)
 LIBBUILTIN(rintf, "ff", "fnc", "math.h", ALL_LANGUAGES)
 LIBBUILTIN(rintl, "LdLd", "fnc", "math.h", ALL_LANGUAGES)

Modified: cfe/trunk/test/CodeGen/libcalls-errno.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/libcalls-errno.c?rev=317407&r1=317406&r2=317407&view=diff
==
--- cfe/trunk/test/CodeGen/libcalls-errno.c (original)
+++ cfe/trunk/test/CodeGen/libcalls-errno.c Sat Nov  4 08:03:11 2017
@@ -418,10 +418,14 @@ void foo() {
 // HAS_ERRNO: declare float @remainderf(float, float) [[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @remainderl(x86_fp80, x86_fp80) [[NOT_READNONE]]
 
-//
-// FIXME: remquo is not recognized as a mathlib call.
-//
-  // remquo(f,f,i);  remquof(f,f,i); remquol(f,f,i);
+  remquo(f,f,i);  remquof(f,f,i); remquol(f,f,i);
+
+// NO__ERRNO: declare double @remquo(double, double, i32*) [[NOT_READNONE]]
+// NO__ERRNO: declare float @remquof(float, float, i32*) [[NOT_READNONE]]
+// NO__ERRNO: declare x86_fp80 @remquol(x86_fp80, x86_fp80, i32*) 
[[NOT_READNONE]]
+// HAS_ERRNO: declare double @remquo(double, double, i32*) [[NOT_READNONE]]
+// HAS_ERRNO: declare float @remquof(float, float, i32*) [[NOT_READNONE]]
+// HAS_ERRNO: declare x86_fp80 @remquol(x86_fp80, x86_fp80, i32*) 
[[NOT_READNONE]]
 
   rint(f);   rintf(f);  rintl(f);
 




r317336 - [CodeGen] add libcall attr tests to show errno-related diffs; NFC

2017-11-03 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Nov  3 09:27:27 2017
New Revision: 317336

URL: http://llvm.org/viewvc/llvm-project?rev=317336&view=rev
Log:
[CodeGen] add libcall attr tests to show errno-related diffs; NFC

See rL317220 for the builtin siblings.

Added:
cfe/trunk/test/CodeGen/libcalls-errno.c

Added: cfe/trunk/test/CodeGen/libcalls-errno.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/libcalls-errno.c?rev=317336&view=auto
==
--- cfe/trunk/test/CodeGen/libcalls-errno.c (added)
+++ cfe/trunk/test/CodeGen/libcalls-errno.c Fri Nov  3 09:27:27 2017
@@ -0,0 +1,732 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -w -S -o - -emit-llvm
  %s | FileCheck %s -check-prefix=NO__ERRNO
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -w -S -o - -emit-llvm 
-fmath-errno %s | FileCheck %s -check-prefix=HAS_ERRNO
+
+// Test attributes of library calls to see how errno affects the resulting 
codegen. 
+
+
+double *d;
+float f;
+float *fp;
+long double *l;
+int *i;
+const char *c;
+
+void foo() {
+  atan2(f,f);atan2f(f,f) ;  atan2l(f, f);
+
+// NO__ERRNO: declare double @atan2(double, double) [[READNONE:#[0-9]+]]
+// NO__ERRNO: declare float @atan2f(float, float) [[READNONE]]
+// NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80, x86_fp80) [[READNONE]]
+// HAS_ERRNO: declare double @atan2(double, double) [[NOT_READNONE:#[0-9]+]]
+// HAS_ERRNO: declare float @atan2f(float, float) [[NOT_READNONE]]
+// HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80, x86_fp80) [[NOT_READNONE]]
+
+  copysign(f,f); copysignf(f,f);copysignl(f,f);
+
+// NO__ERRNO: declare double @copysign(double, double) [[READNONE]]
+// NO__ERRNO: declare float @copysignf(float, float) [[READNONE]]
+// NO__ERRNO: declare x86_fp80 @copysignl(x86_fp80, x86_fp80) [[READNONE]]
+// HAS_ERRNO: declare double @copysign(double, double) [[READNONE:#[0-9]+]]
+// HAS_ERRNO: declare float @copysignf(float, float) [[READNONE]]
+// HAS_ERRNO: declare x86_fp80 @copysignl(x86_fp80, x86_fp80) [[READNONE]]
+
+  fabs(f);   fabsf(f);  fabsl(f);
+
+// NO__ERRNO: declare double @fabs(double) [[READNONE]]
+// NO__ERRNO: declare float @fabsf(float) [[READNONE]]
+// NO__ERRNO: declare x86_fp80 @fabsl(x86_fp80) [[READNONE]]
+// HAS_ERRNO: declare double @fabs(double) [[READNONE]]
+// HAS_ERRNO: declare float @fabsf(float) [[READNONE]]
+// HAS_ERRNO: declare x86_fp80 @fabsl(x86_fp80) [[READNONE]]
+
+  fmod(f,f); fmodf(f,f);fmodl(f,f);
+
+// NO__ERRNO: declare double @fmod(double, double) [[READNONE]]
+// NO__ERRNO: declare float @fmodf(float, float) [[READNONE]]
+// NO__ERRNO: declare x86_fp80 @fmodl(x86_fp80, x86_fp80) [[READNONE]]
+// HAS_ERRNO: declare double @fmod(double, double) [[NOT_READNONE]]
+// HAS_ERRNO: declare float @fmodf(float, float) [[NOT_READNONE]]
+// HAS_ERRNO: declare x86_fp80 @fmodl(x86_fp80, x86_fp80) [[NOT_READNONE]]
+
+  frexp(f,i);frexpf(f,i);   frexpl(f,i);
+
+// NO__ERRNO: declare double @frexp(double, i32*) [[NOT_READNONE:#[0-9]+]]
+// NO__ERRNO: declare float @frexpf(float, i32*) [[NOT_READNONE]]
+// NO__ERRNO: declare x86_fp80 @frexpl(x86_fp80, i32*) [[NOT_READNONE]]
+// HAS_ERRNO: declare double @frexp(double, i32*) [[NOT_READNONE]]
+// HAS_ERRNO: declare float @frexpf(float, i32*) [[NOT_READNONE]]
+// HAS_ERRNO: declare x86_fp80 @frexpl(x86_fp80, i32*) [[NOT_READNONE]]
+
+  ldexp(f,f);ldexpf(f,f);   ldexpl(f,f);  
+
+// NO__ERRNO: declare double @ldexp(double, i32) [[READNONE]]
+// NO__ERRNO: declare float @ldexpf(float, i32) [[READNONE]]
+// NO__ERRNO: declare x86_fp80 @ldexpl(x86_fp80, i32) [[READNONE]]
+// HAS_ERRNO: declare double @ldexp(double, i32) [[NOT_READNONE]]
+// HAS_ERRNO: declare float @ldexpf(float, i32) [[NOT_READNONE]]
+// HAS_ERRNO: declare x86_fp80 @ldexpl(x86_fp80, i32) [[NOT_READNONE]]
+
+  modf(f,d);   modff(f,fp);  modfl(f,l); 
+
+// NO__ERRNO: declare double @modf(double, double*) [[NOT_READNONE]]
+// NO__ERRNO: declare float @modff(float, float*) [[NOT_READNONE]]
+// NO__ERRNO: declare x86_fp80 @modfl(x86_fp80, x86_fp80*) [[NOT_READNONE]]
+// HAS_ERRNO: declare double @modf(double, double*) [[NOT_READNONE]]
+// HAS_ERRNO: declare float @modff(float, float*) [[NOT_READNONE]]
+// HAS_ERRNO: declare x86_fp80 @modfl(x86_fp80, x86_fp80*) [[NOT_READNONE]]
+
+  nan(c);nanf(c);   nanl(c);  
+
+// NO__ERRNO: declare double @nan(i8*) [[READONLY:#[0-9]+]]
+// NO__ERRNO: declare float @nanf(i8*) [[READONLY]]
+// NO__ERRNO: declare x86_fp80 @nanl(i8*) [[READONLY]]
+// HAS_ERRNO: declare double @nan(i8*) [[READONLY:#[0-9]+]]
+// HAS_ERRNO: declare float @nanf(i8*) [[READONLY]]
+// HAS_ERRNO: declare x86_fp80 @nanl(i8*) [[READONLY]]
+
+  pow(f,f);powf(f,f);   powl(f,f);
+
+// NO__ERRNO: declare double @llvm.pow.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare float @llvm.pow.f32(float, float) [[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.pow.f80(x8

r317265 - [CodeGen] fix const-ness of builtin equivalents of <math.h> and <complex.h> functions that might set errno

2017-11-02 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Thu Nov  2 13:39:26 2017
New Revision: 317265

URL: http://llvm.org/viewvc/llvm-project?rev=317265&view=rev
Log:
[CodeGen] fix const-ness of builtin equivalents of <math.h> and <complex.h> 
functions that might set errno

This just makes const-ness of the builtins match const-ness of their lib 
function siblings. 
We're deferring fixing some of these that are obviously wrong to follow-up 
patches. 
Hopefully, the bugs are visible in the new test file (added at rL317220).

As the description in Builtins.def says: "e = const, but only when 
-fmath-errno=0".

This is step 2 of N to fix builtins and math calls as discussed in D39204.

Differential Revision: https://reviews.llvm.org/D39481

Modified:
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/test/CodeGen/builtin-errno.c
cfe/trunk/test/CodeGen/builtin-sqrt.c

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=317265&r1=317264&r2=317265&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Thu Nov  2 13:39:26 2017
@@ -103,9 +103,9 @@
 #endif
 
 // Standard libc/libm functions:
-BUILTIN(__builtin_atan2 , "ddd"  , "Fnc")
-BUILTIN(__builtin_atan2f, "fff"  , "Fnc")
-BUILTIN(__builtin_atan2l, "LdLdLd", "Fnc")
+BUILTIN(__builtin_atan2 , "ddd"  , "Fne")
+BUILTIN(__builtin_atan2f, "fff"  , "Fne")
+BUILTIN(__builtin_atan2l, "LdLdLd", "Fne")
 BUILTIN(__builtin_abs  , "ii"  , "ncF")
 BUILTIN(__builtin_copysign, "ddd", "ncF")
 BUILTIN(__builtin_copysignf, "fff", "ncF")
@@ -113,9 +113,9 @@ BUILTIN(__builtin_copysignl, "LdLdLd", "
 BUILTIN(__builtin_fabs , "dd"  , "ncF")
 BUILTIN(__builtin_fabsf, "ff"  , "ncF")
 BUILTIN(__builtin_fabsl, "LdLd", "ncF")
-BUILTIN(__builtin_fmod , "ddd"  , "Fnc")
-BUILTIN(__builtin_fmodf, "fff"  , "Fnc")
-BUILTIN(__builtin_fmodl, "LdLdLd", "Fnc")
+BUILTIN(__builtin_fmod , "ddd"  , "Fne")
+BUILTIN(__builtin_fmodf, "fff"  , "Fne")
+BUILTIN(__builtin_fmodl, "LdLdLd", "Fne")
 BUILTIN(__builtin_frexp , "ddi*"  , "Fn")
 BUILTIN(__builtin_frexpf, "ffi*"  , "Fn")
 BUILTIN(__builtin_frexpl, "LdLdi*", "Fn")
@@ -127,9 +127,9 @@ BUILTIN(__builtin_inff , "f"   , "nc")
 BUILTIN(__builtin_infl , "Ld"  , "nc")
 BUILTIN(__builtin_labs , "LiLi"  , "Fnc")
 BUILTIN(__builtin_llabs, "LLiLLi", "Fnc")
-BUILTIN(__builtin_ldexp , "ddi"  , "Fnc")
-BUILTIN(__builtin_ldexpf, "ffi"  , "Fnc")
-BUILTIN(__builtin_ldexpl, "LdLdi", "Fnc")
+BUILTIN(__builtin_ldexp , "ddi"  , "Fne")
+BUILTIN(__builtin_ldexpf, "ffi"  , "Fne")
+BUILTIN(__builtin_ldexpl, "LdLdi", "Fne")
 BUILTIN(__builtin_modf , "ddd*"  , "Fn")
 BUILTIN(__builtin_modff, "fff*"  , "Fn")
 BUILTIN(__builtin_modfl, "LdLdLd*", "Fn")
@@ -142,119 +142,119 @@ BUILTIN(__builtin_nansl, "LdcC*", "ncF")
 BUILTIN(__builtin_powi , "ddi"  , "Fnc")
 BUILTIN(__builtin_powif, "ffi"  , "Fnc")
 BUILTIN(__builtin_powil, "LdLdi", "Fnc")
-BUILTIN(__builtin_pow , "ddd"  , "Fnc")
-BUILTIN(__builtin_powf, "fff"  , "Fnc")
-BUILTIN(__builtin_powl, "LdLdLd", "Fnc")
+BUILTIN(__builtin_pow , "ddd"  , "Fne")
+BUILTIN(__builtin_powf, "fff"  , "Fne")
+BUILTIN(__builtin_powl, "LdLdLd", "Fne")
 
 // Standard unary libc/libm functions with double/float/long double variants:
-BUILTIN(__builtin_acos , "dd"  , "Fnc")
-BUILTIN(__builtin_acosf, "ff"  , "Fnc")
-BUILTIN(__builtin_acosl, "LdLd", "Fnc")
-BUILTIN(__builtin_acosh , "dd"  , "Fnc")
-BUILTIN(__builtin_acoshf, "ff"  , "Fnc")
-BUILTIN(__builtin_acoshl, "LdLd", "Fnc")
-BUILTIN(__builtin_asin , "dd"  , "Fnc")
-BUILTIN(__builtin_asinf, "ff"  , "Fnc")
-BUILTIN(__builtin_asinl, "LdLd", "Fnc")
-BUILTIN(__builtin_asinh , "dd"  , "Fnc")
-BUILTIN(__builtin_asinhf, "ff"  , "Fnc")
-BUILTIN(__builtin_asinhl, "LdLd", "Fnc")
-BUILTIN(__builtin_atan , "dd"  , "Fnc")
-BUILTIN(__builtin_atanf, "ff"  , "Fnc")
-BUILTIN(__builtin_atanl, "LdLd", "Fnc")
-BUILTIN(__builtin_atanh , "dd", "Fnc")
-BUILTIN(__builtin_atanhf, "ff", "Fnc")
-BUILTIN(__builtin_atanhl, "LdLd", "Fnc")
-BUILTIN(__builtin_cbrt , "dd", "Fnc")
-BUILTIN(__builtin_cbrtf, "ff", "Fnc")
-BUILTIN(__builtin_cbrtl, "LdLd", "Fnc")
+BUILTIN(__builtin_acos , "dd"  , "Fne")
+BUILTIN(__builtin_acosf, "ff"  , "Fne")
+BUILTIN(__builtin_acosl, "LdLd", "Fne")
+BUILTIN(__builtin_acosh , "dd"  , "Fne")
+BUILTIN(__builtin_acoshf, "ff"  , "Fne")
+BUILTIN(__builtin_acoshl, "LdLd", "Fne")
+BUILTIN(__builtin_asin , "dd"  , "Fne")
+BUILTIN(__builtin_asinf, "ff"  , "Fne")
+BUILTIN(__builtin_asinl, "LdLd", "Fne")
+BUILTIN(__builtin_asinh , "dd"  , "Fne")
+BUILTIN(__builtin_asinhf, "ff"  , "Fne")
+BUILTIN(__builtin_asinhl, "LdLd", "Fne")
+BUILTIN(__builtin_atan , "dd"  , "Fne")
+BUILTIN(__builtin_atanf, "ff"  , "Fne")
+BUILTIN(__builtin_atanl, "LdLd", "Fne")
+BUILTIN(__builtin_atanh , "dd", "Fne")
+BUILTIN(__builtin_atanhf, "ff", "Fne")
+BUILTIN(__builtin_atanhl, "LdLd", "Fne")
+BUILTIN(__builtin_cbrt , "dd", "Fne")
+BUILTIN(__builtin

r317220 - [CodeGen] add builtin attr tests to show errno-related diffs; NFC

2017-11-02 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Thu Nov  2 10:06:05 2017
New Revision: 317220

URL: http://llvm.org/viewvc/llvm-project?rev=317220&view=rev
Log:
[CodeGen] add builtin attr tests to show errno-related diffs; NFC

Added:
cfe/trunk/test/CodeGen/builtin-errno.c

Added: cfe/trunk/test/CodeGen/builtin-errno.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/builtin-errno.c?rev=317220&view=auto
==
--- cfe/trunk/test/CodeGen/builtin-errno.c (added)
+++ cfe/trunk/test/CodeGen/builtin-errno.c Thu Nov  2 10:06:05 2017
@@ -0,0 +1,777 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -w -S -o - -emit-llvm
  %s | FileCheck %s -check-prefix=NO__ERRNO
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -w -S -o - -emit-llvm 
-fmath-errno %s | FileCheck %s -check-prefix=HAS_ERRNO
+
+// Test math, complex, and other builtin attributes to see how errno affects 
the resulting codegen. 
+
+
+double *d;
+float f;
+float *fp;
+long double *l;
+int *i;
+const char *c;
+
+void foo() {
+  __builtin_abs(f);   __builtin_labs(f);  __builtin_llabs(f);
+
+// NO__ERRNO-NOT: .abs
+// NO__ERRNO-NOT: .labs
+// NO__ERRNO-NOT: .llabs
+// NO__ERRNO-NOT: @abs
+// NO__ERRNO-NOT: @labs
+// NO__ERRNO-NOT: @llabs
+// HAS_ERRNO-NOT: .abs
+// HAS_ERRNO-NOT: .labs
+// HAS_ERRNO-NOT: .llabs
+// HAS_ERRNO-NOT: @abs
+// HAS_ERRNO-NOT: @labs
+// HAS_ERRNO-NOT: @llabs
+
+  __builtin_atan2(f,f);__builtin_atan2f(f,f) ;  __builtin_atan2l(f, f);
+
+// NO__ERRNO: declare double @atan2(double, double) [[READNONE:#[0-9]+]]
+// NO__ERRNO: declare float @atan2f(float, float) [[READNONE]]
+// NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80, x86_fp80) [[READNONE]]
+// HAS_ERRNO: declare double @atan2(double, double) [[READNONE:#[0-9]+]]
+// HAS_ERRNO: declare float @atan2f(float, float) [[READNONE]]
+// HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80, x86_fp80) [[READNONE]]
+
+  __builtin_copysign(f,f); __builtin_copysignf(f,f);__builtin_copysignl(f,f);
+
+// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare float @llvm.copysign.f32(float, float) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.copysign.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
+// HAS_ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// HAS_ERRNO: declare float @llvm.copysign.f32(float, float) 
[[READNONE_INTRINSIC]]
+// HAS_ERRNO: declare x86_fp80 @llvm.copysign.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
+
+  __builtin_fabs(f);   __builtin_fabsf(f);  __builtin_fabsl(f);
+
+// NO__ERRNO: declare double @llvm.fabs.f64(double) [[READNONE_INTRINSIC]]
+// NO__ERRNO: declare float @llvm.fabs.f32(float) [[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.fabs.f80(x86_fp80) [[READNONE_INTRINSIC]]
+// HAS_ERRNO: declare double @llvm.fabs.f64(double) [[READNONE_INTRINSIC]]
+// HAS_ERRNO: declare float @llvm.fabs.f32(float) [[READNONE_INTRINSIC]]
+// HAS_ERRNO: declare x86_fp80 @llvm.fabs.f80(x86_fp80) [[READNONE_INTRINSIC]]
+
+  __builtin_fmod(f,f); __builtin_fmodf(f,f);__builtin_fmodl(f,f);
+
+// NO__ERRNO-NOT: .fmod
+// NO__ERRNO-NOT: @fmod
+// HAS_ERRNO-NOT: .fmod
+// HAS_ERRNO-NOT: @fmod
+
+  __builtin_frexp(f,i);__builtin_frexpf(f,i);   __builtin_frexpl(f,i);
+
+// NO__ERRNO: declare double @frexp(double, i32*) [[NOT_READNONE:#[0-9]+]]
+// NO__ERRNO: declare float @frexpf(float, i32*) [[NOT_READNONE]]
+// NO__ERRNO: declare x86_fp80 @frexpl(x86_fp80, i32*) [[NOT_READNONE]]
+// HAS_ERRNO: declare double @frexp(double, i32*) [[NOT_READNONE:#[0-9]+]]
+// HAS_ERRNO: declare float @frexpf(float, i32*) [[NOT_READNONE]]
+// HAS_ERRNO: declare x86_fp80 @frexpl(x86_fp80, i32*) [[NOT_READNONE]]
+
+  __builtin_huge_val();__builtin_huge_valf();   __builtin_huge_vall();
+
+// NO__ERRNO-NOT: .huge
+// NO__ERRNO-NOT: @huge
+// HAS_ERRNO-NOT: .huge
+// HAS_ERRNO-NOT: @huge
+
+  __builtin_inf();__builtin_inff();   __builtin_infl();
+
+// NO__ERRNO-NOT: .inf
+// NO__ERRNO-NOT: @inf
+// HAS_ERRNO-NOT: .inf
+// HAS_ERRNO-NOT: @inf
+
+  __builtin_ldexp(f,f);__builtin_ldexpf(f,f);   __builtin_ldexpl(f,f);  
+
+// NO__ERRNO: declare double @ldexp(double, i32) [[READNONE]]
+// NO__ERRNO: declare float @ldexpf(float, i32) [[READNONE]]
+// NO__ERRNO: declare x86_fp80 @ldexpl(x86_fp80, i32) [[READNONE]]
+// HAS_ERRNO: declare double @ldexp(double, i32) [[READNONE]]
+// HAS_ERRNO: declare float @ldexpf(float, i32) [[READNONE]]
+// HAS_ERRNO: declare x86_fp80 @ldexpl(x86_fp80, i32) [[READNONE]]
+
+  __builtin_modf(f,d);   __builtin_modff(f,fp);  __builtin_modfl(f,l); 
+
+// NO__ERRNO: declare double @modf(double, double*) [[NOT_READNONE]]
+// NO__ERRNO: declare float @modff(float, float*) [[NOT_READNONE]]
+// NO__ERRNO: declare x86_fp80 @modfl(x86_fp80, x86_fp80*) [[NOT_READNONE]]
+// HAS_ERRNO: declare double @modf(double, double*) [[NOT_READNONE]]
+/

r317031 - [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set

2017-10-31 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Tue Oct 31 13:19:39 2017
New Revision: 317031

URL: http://llvm.org/viewvc/llvm-project?rev=317031&view=rev
Log:
[CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set

The LLVM sqrt intrinsic definition changed with:
D28797
...so we don't have to use any relaxed FP settings other than errno handling.

This patch sidesteps a question raised in PR27435:
https://bugs.llvm.org/show_bug.cgi?id=27435

Is a programmer using __builtin_sqrt() invoking the compiler's intrinsic 
definition of sqrt or the mathlib definition of sqrt?

But we have an answer now: the builtin should match the behavior of the libm 
function including errno handling.

Differential Revision: https://reviews.llvm.org/D39204

Added:
cfe/trunk/test/CodeGen/builtin-sqrt.c
  - copied, changed from r317030, 
cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c
Removed:
cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c
Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/libcalls.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=317031&r1=317030&r2=317031&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Tue Oct 31 13:19:39 2017
@@ -2072,24 +2072,21 @@ RValue CodeGenFunction::EmitBuiltinExpr(
 return RValue::get(nullptr);
   }
 
-// Library functions with special handling.
   case Builtin::BIsqrt:
   case Builtin::BIsqrtf:
-  case Builtin::BIsqrtl: {
-// Transform a call to sqrt* into a @llvm.sqrt.* intrinsic call, but only
-// in finite- or unsafe-math mode (the intrinsic has different semantics
-// for handling negative numbers compared to the library function, so
-// -fmath-errno=0 is not enough).
-if (!FD->hasAttr<ConstAttr>())
-  break;
-if (!(CGM.getCodeGenOpts().UnsafeFPMath ||
-  CGM.getCodeGenOpts().NoNaNsFPMath))
-  break;
-Value *Arg0 = EmitScalarExpr(E->getArg(0));
-llvm::Type *ArgType = Arg0->getType();
-Value *F = CGM.getIntrinsic(Intrinsic::sqrt, ArgType);
-return RValue::get(Builder.CreateCall(F, Arg0));
-  }
+  case Builtin::BIsqrtl:
+// Builtins have the same semantics as library functions. The LLVM 
intrinsic
+// has the same semantics as the library function except it does not set
+// errno. Thus, we can transform either sqrt or __builtin_sqrt to 
@llvm.sqrt
+// if the call is 'const' (the call must not set errno).
+//
+// FIXME: The builtin cases are not here because they are marked 'const' in
+// Builtins.def. So that means they are wrongly defined to have different
+// semantics than the library functions. If we included them here, we would
+// turn them into LLVM intrinsics regardless of whether -fmath-errno was 
on.
+if (FD->hasAttr<ConstAttr>())
+  return RValue::get(emitUnaryBuiltin(*this, E, Intrinsic::sqrt));
+break;
 
   case Builtin::BI__builtin_pow:
   case Builtin::BI__builtin_powf:

Removed: cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c?rev=317030&view=auto
==
--- cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c (original)
+++ cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c (removed)
@@ -1,10 +0,0 @@
-// RUN: %clang_cc1 -triple x86_64-apple-darwin %s -emit-llvm -o - | FileCheck 
%s
-// llvm.sqrt has undefined behavior on negative inputs, so it is
-// inappropriate to translate C/C++ sqrt to this.
-float sqrtf(float x);
-float foo(float X) {
-  // CHECK: foo
-  // CHECK: call float @sqrtf(float %
-  // Check that this is marked readonly when errno is ignored.
-  return sqrtf(X);
-}

Copied: cfe/trunk/test/CodeGen/builtin-sqrt.c (from r317030, 
cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c)
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/builtin-sqrt.c?p2=cfe/trunk/test/CodeGen/builtin-sqrt.c&p1=cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c&r1=317030&r2=317031&rev=317031&view=diff
==
--- cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c (original)
+++ cfe/trunk/test/CodeGen/builtin-sqrt.c Tue Oct 31 13:19:39 2017
@@ -1,10 +1,19 @@
-// RUN: %clang_cc1 -triple x86_64-apple-darwin %s -emit-llvm -o - | FileCheck 
%s
-// llvm.sqrt has undefined behavior on negative inputs, so it is
-// inappropriate to translate C/C++ sqrt to this.
-float sqrtf(float x);
+// RUN: %clang_cc1 -fmath-errno -triple x86_64-apple-darwin %s -emit-llvm -o - 
| FileCheck %s --check-prefix=HAS_ERRNO
+// RUN: %clang_cc1  -triple x86_64-apple-darwin %s -emit-llvm -o - 
| FileCheck %s --check-prefix=NO_ERRNO
+
+// FIXME: If a builtin is supposed to have identical semantics to its libm 
twin, then it
+// should not be marked "cons

r316250 - [CodeGen] add tests for __builtin_sqrt*; NFC

2017-10-20 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Oct 20 16:32:41 2017
New Revision: 316250

URL: http://llvm.org/viewvc/llvm-project?rev=316250&view=rev
Log:
[CodeGen] add tests for __builtin_sqrt*; NFC

I don't know if this is correct, but this is what we currently do.
More discussion in PR27108 and PR27435 and D27618.

Modified:
cfe/trunk/test/CodeGen/builtins.c

Modified: cfe/trunk/test/CodeGen/builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/builtins.c?rev=316250&r1=316249&r2=316250&view=diff
==
--- cfe/trunk/test/CodeGen/builtins.c (original)
+++ cfe/trunk/test/CodeGen/builtins.c Fri Oct 20 16:32:41 2017
@@ -317,6 +317,15 @@ void test_float_builtin_ops(float F, dou
   resld = __builtin_floorl(LD);
   // CHECK: call x86_fp80 @llvm.floor.f80
 
+  resf = __builtin_sqrtf(F);
+  // CHECK: call float @sqrtf(
+
+  resd = __builtin_sqrt(D);
+  // CHECK: call double @sqrt(
+
+  resld = __builtin_sqrtl(LD);
+  // CHECK: call x86_fp80 @sqrtl(
+
   resf = __builtin_truncf(F);
   // CHECK: call float @llvm.trunc.f32
 




r314159 - [x86] make assertions less strict in avx512f test file

2017-09-25 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Mon Sep 25 14:31:08 2017
New Revision: 314159

URL: http://llvm.org/viewvc/llvm-project?rev=314159&view=rev
Log:
[x86] make assertions less strict in avx512f test file

Missed a line in r314158.

Modified:
cfe/trunk/test/CodeGen/avx512f-builtins.c

Modified: cfe/trunk/test/CodeGen/avx512f-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx512f-builtins.c?rev=314159&r1=314158&r2=314159&view=diff
==
--- cfe/trunk/test/CodeGen/avx512f-builtins.c (original)
+++ cfe/trunk/test/CodeGen/avx512f-builtins.c Mon Sep 25 14:31:08 2017
@@ -8351,7 +8351,7 @@ __m128d test_mm_maskz_move_sd (__mmask8
   // CHECK-LABEL: @test_mm_maskz_move_sd
   // CHECK:  extractelement <2 x double> %{{.*}}, i32 0
   // CHECK:  phi double [ %{{.*}}, %{{.*}} ], [ 0.00e+00, %{{.*}} ]
-  // CHECK:  insertelement <2 x double> %6, double %cond.i, i32 0
+  // CHECK:  insertelement <2 x double> %{{.*}}, double %{{.*}}, i32 0
   return _mm_maskz_move_sd (__U, __A, __B);
 }
 




r314158 - [x86] make assertions less strict in avx512f test file

2017-09-25 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Mon Sep 25 14:27:37 2017
New Revision: 314158

URL: http://llvm.org/viewvc/llvm-project?rev=314158&view=rev
Log:
[x86] make assertions less strict in avx512f test file

I'm not sure why yet, but there may be differences depending on the host?
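The weakening (completed in r314159) swaps concrete SSA value names for FileCheck regex placeholders, schematically:

```
// Brittle: depends on the exact value numbering the host build produces
// CHECK: %vecins.i = insertelement <4 x float> %8, float %cond.i, i32 0

// Robust: matches any SSA value name
// CHECK: insertelement <4 x float> %{{.*}}, float %{{.*}}, i32 0
```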

Modified:
cfe/trunk/test/CodeGen/avx512f-builtins.c

Modified: cfe/trunk/test/CodeGen/avx512f-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx512f-builtins.c?rev=314158&r1=314157&r2=314158&view=diff
==
--- cfe/trunk/test/CodeGen/avx512f-builtins.c (original)
+++ cfe/trunk/test/CodeGen/avx512f-builtins.c Mon Sep 25 14:27:37 2017
@@ -8306,52 +8306,52 @@ __m512d test_mm512_setzero_pd()
 __mmask16 test_mm512_int2mask(int __a)
 {
   // CHECK-LABEL: test_mm512_int2mask
-  // CHECK: trunc i32 %1 to i16
+  // CHECK: trunc i32 %{{.*}} to i16
   return _mm512_int2mask(__a);
 }
 
 int test_mm512_mask2int(__mmask16 __a)
 {
   // CHECK-LABEL: test_mm512_mask2int
-  // CHECK: zext i16 %1 to i32
+  // CHECK: zext i16 %{{.*}} to i32
   return _mm512_mask2int(__a);
 }
 
 __m128 test_mm_mask_move_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B)
 {
   // CHECK-LABEL: @test_mm_mask_move_ss
-  // CHECK: %vecext.i = extractelement <4 x float> %6, i32 0
-  // CHECK: %vecext1.i = extractelement <4 x float> %7, i32 0
-  // CHECK: %cond.i = phi float [ %vecext.i, %cond.true.i ], [ %vecext1.i, 
%cond.false.i ]
-  // CHECK: %vecins.i = insertelement <4 x float> %8, float %cond.i, i32 0
+  // CHECK:  extractelement <4 x float> %{{.*}}, i32 0
+  // CHECK:  extractelement <4 x float> %{{.*}}, i32 0
+  // CHECK:  phi float [ %{{.*}}, %{{.*}} ], [ %{{.*}}, %{{.*}} ]
+  // CHECK:  insertelement <4 x float> %{{.*}}, float %cond.i, i32 0
   return _mm_mask_move_ss ( __W,  __U,  __A,  __B);
 }
 
 __m128 test_mm_maskz_move_ss (__mmask8 __U, __m128 __A, __m128 __B)
 {
   // CHECK-LABEL: @test_mm_maskz_move_ss
-  // CHECK: %vecext.i = extractelement <4 x float> %5, i32 0
-  // CHECK: %cond.i = phi float [ %vecext.i, %cond.true.i ], [ 0.00e+00, 
%cond.false.i ]
-  // CHECK: %vecins.i = insertelement <4 x float> %6, float %cond.i, i32 0
+  // CHECK:  extractelement <4 x float> %{{.*}}, i32 0
+  // CHECK:  phi float [ %{{.*}}, %{{.*}} ], [ 0.00e+00, %{{.*}} ]
+  // CHECK:  insertelement <4 x float> %{{.*}}, float %{{.*}}, i32 0
   return _mm_maskz_move_ss (__U, __A, __B);
 }
 
 __m128d test_mm_mask_move_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d 
__B)
 {
   // CHECK-LABEL: @test_mm_mask_move_sd
-  // CHECK: %vecext.i = extractelement <2 x double> %6, i32 0
-  // CHECK: %vecext1.i = extractelement <2 x double> %7, i32 0
-  // CHECK: %cond.i = phi double [ %vecext.i, %cond.true.i ], [ %vecext1.i, 
%cond.false.i ]
-  // CHECK: %vecins.i = insertelement <2 x double> %8, double %cond.i, i32 0
+  // CHECK:  extractelement <2 x double> %{{.*}}, i32 0
+  // CHECK:  extractelement <2 x double> %{{.*}}, i32 0
+  // CHECK:  phi double [ %{{.*}}, %{{.*}} ], [ %{{.*}}, %{{.*}} ]
+  // CHECK:  insertelement <2 x double> %{{.*}}, double %{{.*}}, i32 0
   return _mm_mask_move_sd ( __W,  __U,  __A,  __B);
 }
 
 __m128d test_mm_maskz_move_sd (__mmask8 __U, __m128d __A, __m128d __B)
 {
   // CHECK-LABEL: @test_mm_maskz_move_sd
-  // CHECK: %vecext.i = extractelement <2 x double> %5, i32 0
-  // CHECK: %cond.i = phi double [ %vecext.i, %cond.true.i ], [ 0.00e+00, 
%cond.false.i ]
-  // CHECK: %vecins.i = insertelement <2 x double> %6, double %cond.i, i32 0
+  // CHECK:  extractelement <2 x double> %{{.*}}, i32 0
+  // CHECK:  phi double [ %{{.*}}, %{{.*}} ], [ 0.00e+00, %{{.*}} ]
+  // CHECK:  insertelement <2 x double> %6, double %cond.i, i32 0
   return _mm_maskz_move_sd (__U, __A, __B);
 }
 




r306433 - [x86] weaken test checks that shouldn't be here in the first place

2017-06-27 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Tue Jun 27 10:39:46 2017
New Revision: 306433

URL: http://llvm.org/viewvc/llvm-project?rev=306433&view=rev
Log:
[x86] weaken test checks that shouldn't be here in the first place

This test would fail after the proposed change in:
https://reviews.llvm.org/D34242

Modified:
cfe/trunk/test/CodeGen/avx512f-builtins.c

Modified: cfe/trunk/test/CodeGen/avx512f-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx512f-builtins.c?rev=306433&r1=306432&r2=306433&view=diff
==
--- cfe/trunk/test/CodeGen/avx512f-builtins.c (original)
+++ cfe/trunk/test/CodeGen/avx512f-builtins.c Tue Jun 27 10:39:46 2017
@@ -1,4 +1,7 @@
 // RUN: %clang_cc1 -ffreestanding %s -triple=x86_64-apple-darwin 
-target-feature +avx512f -emit-llvm -o - -Wall -Werror | FileCheck %s
+
+// FIXME: It's wrong to check LLVM IR transformations from clang. This run 
should be removed and tests added to the appropriate LLVM pass.
+
 // RUN: %clang_cc1 -ffreestanding %s -triple=x86_64-apple-darwin 
-target-feature +avx512f -O2 -emit-llvm -o - -Wall -Werror | FileCheck %s 
-check-prefix=O2
 
 #include 
@@ -8240,10 +8243,10 @@ __m128 test_mm_mask_move_ss (__m128 __W,
 {
   // O2-LABEL: @test_mm_mask_move_ss
   // O2: %[[M:.*]] = and i8 %__U, 1
-  // O2: %[[M2:.*]] = icmp ne i8 %[[M]], 0
-  // O2: %[[ELM1:.*]] = extractelement <4 x float> %__B, i32 0
-  // O2: %[[ELM2:.*]] = extractelement <4 x float> %__W, i32 0
-  // O2: %[[SEL:.*]] = select i1 %[[M2]], float %[[ELM1]], float %[[ELM2]]
+  // O2: %[[M2:.*]] = icmp 
+  // O2: %[[ELM1:.*]] = extractelement <4 x float> 
+  // O2: %[[ELM2:.*]] = extractelement <4 x float> 
+  // O2: %[[SEL:.*]] = select i1 %[[M2]]
   // O2: %[[RES:.*]] = insertelement <4 x float> %__A, float %[[SEL]], i32 0
   // O2: ret <4 x float> %[[RES]]
   return _mm_mask_move_ss ( __W,  __U,  __A,  __B);
@@ -8253,9 +8256,9 @@ __m128 test_mm_maskz_move_ss (__mmask8 _
 {
   // O2-LABEL: @test_mm_maskz_move_ss
   // O2: %[[M:.*]] = and i8 %__U, 1
-  // O2: %[[M2:.*]] = icmp ne i8 %[[M]], 0
+  // O2: %[[M2:.*]] = icmp
   // O2: %[[ELM1:.*]] = extractelement <4 x float> %__B, i32 0
-  // O2: %[[SEL:.*]] = select i1 %[[M2]], float %[[ELM1]], float 0.0 
+  // O2: %[[SEL:.*]] = select i1 %[[M2]] 
   // O2: %[[RES:.*]] = insertelement <4 x float> %__A, float %[[SEL]], i32 0
   // O2: ret <4 x float> %[[RES]]
   return _mm_maskz_move_ss (__U, __A, __B);
@@ -8265,10 +8268,10 @@ __m128d test_mm_mask_move_sd (__m128d __
 {
   // O2-LABEL: @test_mm_mask_move_sd
   // O2: %[[M:.*]] = and i8 %__U, 1
-  // O2: %[[M2:.*]] = icmp ne i8 %[[M]], 0
-  // O2: %[[ELM1:.*]] = extractelement <2 x double> %__B, i32 0
-  // O2: %[[ELM2:.*]] = extractelement <2 x double> %__W, i32 0
-  // O2: %[[SEL:.*]] = select i1 %[[M2]], double %[[ELM1]], double %[[ELM2]]
+  // O2: %[[M2:.*]] = icmp
+  // O2: %[[ELM1:.*]] = extractelement <2 x double>
+  // O2: %[[ELM2:.*]] = extractelement <2 x double>
+  // O2: %[[SEL:.*]] = select i1 %[[M2]]
   // O2: %[[RES:.*]] = insertelement <2 x double> %__A, double %[[SEL]], i32 0
   // O2: ret <2 x double> %[[RES]]
   return _mm_mask_move_sd ( __W,  __U,  __A,  __B);
@@ -8278,9 +8281,9 @@ __m128d test_mm_maskz_move_sd (__mmask8
 {
   // O2-LABEL: @test_mm_maskz_move_sd
   // O2: %[[M:.*]] = and i8 %__U, 1
-  // O2: %[[M2:.*]] = icmp ne i8 %[[M]], 0
+  // O2: %[[M2:.*]] = icmp
   // O2: %[[ELM1:.*]] = extractelement <2 x double> %__B, i32 0
-  // O2: %[[SEL:.*]] = select i1 %[[M2]], double %[[ELM1]], double 0.0
+  // O2: %[[SEL:.*]] = select i1 %[[M2]]
   // O2: %[[RES:.*]] = insertelement <2 x double> %__A, double %[[SEL]], i32 0
   // O2: ret <2 x double> %[[RES]]
   return _mm_maskz_move_sd (__U, __A, __B);





r301928 - [CodeGen] remove/fix checks that will fail when r301923 is recommitted

2017-05-02 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Tue May  2 10:20:18 2017
New Revision: 301928

URL: http://llvm.org/viewvc/llvm-project?rev=301928&view=rev
Log:
[CodeGen] remove/fix checks that will fail when r301923 is recommitted

Don't test the optimizer as part of front-end verification.

Modified:
cfe/trunk/test/CodeGen/atomic-ops-libcall.c

Modified: cfe/trunk/test/CodeGen/atomic-ops-libcall.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/atomic-ops-libcall.c?rev=301928&r1=301927&r2=301928&view=diff
==
--- cfe/trunk/test/CodeGen/atomic-ops-libcall.c (original)
+++ cfe/trunk/test/CodeGen/atomic-ops-libcall.c Tue May  2 10:20:18 2017
@@ -1,5 +1,8 @@
 // RUN: %clang_cc1 < %s -triple armv5e-none-linux-gnueabi -emit-llvm -O1 | 
FileCheck %s
 
+// FIXME: This file should not be checking -O1 output.
+// Ie, it is testing many IR optimizer passes as part of front-end 
verification.
+
 enum memory_order {
   memory_order_relaxed, memory_order_consume, memory_order_acquire,
   memory_order_release, memory_order_acq_rel, memory_order_seq_cst
@@ -110,7 +113,8 @@ int test_atomic_xor_fetch(int *p) {
 int test_atomic_nand_fetch(int *p) {
   // CHECK: test_atomic_nand_fetch
   // CHECK: [[CALL:%[^ ]*]] = tail call i32 @__atomic_fetch_nand_4(i8* 
{{%[0-9]+}}, i32 55, i32 5)
-  // CHECK: [[OR:%[^ ]*]] = or i32 [[CALL]], -56
-  // CHECK: {{%[^ ]*}} = xor i32 [[OR]], 55
+  // FIXME: We should not be checking optimized IR. It changes independently 
of clang.
+  // FIXME-CHECK: [[AND:%[^ ]*]] = and i32 [[CALL]], 55
+  // FIXME-CHECK: {{%[^ ]*}} = xor i32 [[AND]], -1
   return __atomic_nand_fetch(p, 55, memory_order_seq_cst);
 }
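For reference, the retired checks and the FIXME-CHECK lines describe the same
value: `(x | -56) ^ 55` and `(x & 55) ^ -1` both compute `~(x & 55)`, which is
exactly why pinning a front-end test to one optimized IR shape is brittle. A
standalone C check of that identity (not part of the clang test):

```c
#include <stdint.h>

/* Old checked form: or i32 %call, -56 ; xor i32 %or, 55 */
static uint32_t nand_or_form(uint32_t x) {
    return (x | ~55u) ^ 55u;   /* ~55u == (uint32_t)-56 */
}

/* Canonical form in the FIXME-CHECK lines: and i32 %call, 55 ; xor i32 %and, -1 */
static uint32_t nand_and_form(uint32_t x) {
    return (x & 55u) ^ ~0u;
}

/* Reference NAND semantics of __atomic_nand_fetch: ~(old & val). */
static uint32_t nand_ref(uint32_t x) {
    return ~(x & 55u);
}
```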




r300068 - [x86] fix AVX FP cmp intrinsic documentation (PR28110)

2017-04-12 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Wed Apr 12 10:19:08 2017
New Revision: 300068

URL: http://llvm.org/viewvc/llvm-project?rev=300068&view=rev
Log:
[x86] fix AVX FP cmp intrinsic documentation (PR28110)

This copies the text used in the #define statements to the code comments. 
The conflicting text comes from AMD manuals, but those are wrong. Sadly, 
that FP cmp text has not been updated even after some docs were updated 
for Zen:
http://support.amd.com/en-us/search/tech-docs 
( AMD64 Architecture Programmer's Manual Volume 4 )

See PR28110 for more discussion:
https://bugs.llvm.org/show_bug.cgi?id=28110

Differential Revision: https://reviews.llvm.org/D31428

Modified:
cfe/trunk/lib/Headers/avxintrin.h

Modified: cfe/trunk/lib/Headers/avxintrin.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/avxintrin.h?rev=300068&r1=300067&r2=300068&view=diff
==
--- cfe/trunk/lib/Headers/avxintrin.h (original)
+++ cfe/trunk/lib/Headers/avxintrin.h Wed Apr 12 10:19:08 2017
@@ -1613,9 +1613,9 @@ _mm256_blendv_ps(__m256 __a, __m256 __b,
 #define _CMP_NEQ_UQ   0x04 /* Not-equal (unordered, non-signaling)  */
 #define _CMP_NLT_US   0x05 /* Not-less-than (unordered, signaling)  */
 #define _CMP_NLE_US   0x06 /* Not-less-than-or-equal (unordered, signaling)  */
-#define _CMP_ORD_Q0x07 /* Ordered (nonsignaling)   */
+#define _CMP_ORD_Q0x07 /* Ordered (non-signaling)   */
 #define _CMP_EQ_UQ0x08 /* Equal (unordered, non-signaling)  */
-#define _CMP_NGE_US   0x09 /* Not-greater-than-or-equal (unord, signaling)  */
+#define _CMP_NGE_US   0x09 /* Not-greater-than-or-equal (unordered, signaling) 
 */
 #define _CMP_NGT_US   0x0a /* Not-greater-than (unordered, signaling)  */
 #define _CMP_FALSE_OQ 0x0b /* False (ordered, non-signaling)  */
 #define _CMP_NEQ_OQ   0x0c /* Not-equal (ordered, non-signaling)  */
@@ -1628,10 +1628,10 @@ _mm256_blendv_ps(__m256 __a, __m256 __b,
 #define _CMP_UNORD_S  0x13 /* Unordered (signaling)  */
 #define _CMP_NEQ_US   0x14 /* Not-equal (unordered, signaling)  */
 #define _CMP_NLT_UQ   0x15 /* Not-less-than (unordered, non-signaling)  */
-#define _CMP_NLE_UQ   0x16 /* Not-less-than-or-equal (unord, non-signaling)  */
+#define _CMP_NLE_UQ   0x16 /* Not-less-than-or-equal (unordered, 
non-signaling)  */
 #define _CMP_ORD_S0x17 /* Ordered (signaling)  */
 #define _CMP_EQ_US0x18 /* Equal (unordered, signaling)  */
-#define _CMP_NGE_UQ   0x19 /* Not-greater-than-or-equal (unord, non-sign)  */
+#define _CMP_NGE_UQ   0x19 /* Not-greater-than-or-equal (unordered, 
non-signaling)  */
 #define _CMP_NGT_UQ   0x1a /* Not-greater-than (unordered, non-signaling)  */
 #define _CMP_FALSE_OS 0x1b /* False (ordered, signaling)  */
 #define _CMP_NEQ_OS   0x1c /* Not-equal (ordered, signaling)  */
@@ -1660,17 +1660,38 @@ _mm256_blendv_ps(__m256 __a, __m256 __b,
 /// \param c
 ///An immediate integer operand, with bits [4:0] specifying which 
comparison
 ///operation to use: \n
-///00h, 08h, 10h, 18h: Equal \n
-///01h, 09h, 11h, 19h: Less than \n
-///02h, 0Ah, 12h, 1Ah: Less than or equal / Greater than or equal
-///(swapped operands) \n
-///03h, 0Bh, 13h, 1Bh: Unordered \n
-///04h, 0Ch, 14h, 1Ch: Not equal \n
-///05h, 0Dh, 15h, 1Dh: Not less than / Not greater than
-///(swapped operands) \n
-///06h, 0Eh, 16h, 1Eh: Not less than or equal / Not greater than or equal
-///(swapped operands) \n
-///07h, 0Fh, 17h, 1Fh: Ordered
+///0x00 : Equal (ordered, non-signaling)
+///0x01 : Less-than (ordered, signaling)
+///0x02 : Less-than-or-equal (ordered, signaling)
+///0x03 : Unordered (non-signaling)
+///0x04 : Not-equal (unordered, non-signaling)
+///0x05 : Not-less-than (unordered, signaling)
+///0x06 : Not-less-than-or-equal (unordered, signaling)
+///0x07 : Ordered (non-signaling)
+///0x08 : Equal (unordered, non-signaling)
+///0x09 : Not-greater-than-or-equal (unordered, signaling)
+///0x0a : Not-greater-than (unordered, signaling)
+///0x0b : False (ordered, non-signaling)
+///0x0c : Not-equal (ordered, non-signaling)
+///0x0d : Greater-than-or-equal (ordered, signaling)
+///0x0e : Greater-than (ordered, signaling)
+///0x0f : True (unordered, non-signaling)
+///0x10 : Equal (ordered, signaling)
+///0x11 : Less-than (ordered, non-signaling)
+///0x12 : Less-than-or-equal (ordered, non-signaling)
+///0x13 : Unordered (signaling)
+///0x14 : Not-equal (unordered, signaling)
+///0x15 : Not-less-than (unordered, non-signaling)
+///0x16 : Not-less-than-or-equal (unordered, non-signaling)
+///0x17 : Ordered (signaling)
+///0x18 : Equal (unordered, signaling)
+///0x19 : Not-greater-than-or-equal (unordered, non-signaling)
+///0x1a : Not-greater-than (unordered, non-signaling)
+///0x1b : False (ordered, signaling)
+//

r297588 - [x86] these aren't the undefs you're looking for (PR32176)

2017-03-12 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Sun Mar 12 14:15:10 2017
New Revision: 297588

URL: http://llvm.org/viewvc/llvm-project?rev=297588&view=rev
Log:
[x86] these aren't the undefs you're looking for (PR32176)

x86 has undef SSE/AVX intrinsics that should represent a bogus register 
operand. 
This is not the same as LLVM's undef value which can take on multiple bit 
patterns.

There are better solutions / follow-ups to this discussed here:
https://bugs.llvm.org/show_bug.cgi?id=32176
...but this should prevent miscompiles with a one-line code change.

Differential Revision: https://reviews.llvm.org/D30834

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/avx-builtins.c
cfe/trunk/test/CodeGen/avx2-builtins.c
cfe/trunk/test/CodeGen/avx512bw-builtins.c
cfe/trunk/test/CodeGen/avx512dq-builtins.c
cfe/trunk/test/CodeGen/avx512f-builtins.c
cfe/trunk/test/CodeGen/avx512vl-builtins.c
cfe/trunk/test/CodeGen/avx512vldq-builtins.c
cfe/trunk/test/CodeGen/sse-builtins.c
cfe/trunk/test/CodeGen/sse2-builtins.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=297588&r1=297587&r2=297588&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Sun Mar 12 14:15:10 2017
@@ -7381,7 +7381,12 @@ Value *CodeGenFunction::EmitX86BuiltinEx
   case X86::BI__builtin_ia32_undef128:
   case X86::BI__builtin_ia32_undef256:
   case X86::BI__builtin_ia32_undef512:
-return UndefValue::get(ConvertType(E->getType()));
+// The x86 definition of "undef" is not the same as the LLVM definition
+// (PR32176). We leave optimizing away an unnecessary zero constant to the
+// IR optimizer and backend.
+// TODO: If we had a "freeze" IR instruction to generate a fixed undef
+// value, we should use that here instead of a zero.
+return llvm::Constant::getNullValue(ConvertType(E->getType()));
   case X86::BI__builtin_ia32_vec_init_v8qi:
   case X86::BI__builtin_ia32_vec_init_v4hi:
   case X86::BI__builtin_ia32_vec_init_v2si:

Modified: cfe/trunk/test/CodeGen/avx-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx-builtins.c?rev=297588&r1=297587&r2=297588&view=diff
==
--- cfe/trunk/test/CodeGen/avx-builtins.c (original)
+++ cfe/trunk/test/CodeGen/avx-builtins.c Sun Mar 12 14:15:10 2017
@@ -346,19 +346,19 @@ long long test_mm256_extract_epi64(__m25
 
 __m128d test_mm256_extractf128_pd(__m256d A) {
   // CHECK-LABEL: test_mm256_extractf128_pd
-  // CHECK: shufflevector <4 x double> %{{.*}}, <4 x double> undef, <2 x i32> 

+  // CHECK: shufflevector <4 x double> %{{.*}}, <4 x double> zeroinitializer, 
<2 x i32> 
   return _mm256_extractf128_pd(A, 1);
 }
 
 __m128 test_mm256_extractf128_ps(__m256 A) {
   // CHECK-LABEL: test_mm256_extractf128_ps
-  // CHECK: shufflevector <8 x float> %{{.*}}, <8 x float> undef, <4 x i32> 

+  // CHECK: shufflevector <8 x float> %{{.*}}, <8 x float> zeroinitializer, <4 
x i32> 
   return _mm256_extractf128_ps(A, 1);
 }
 
 __m128i test_mm256_extractf128_si256(__m256i A) {
   // CHECK-LABEL: test_mm256_extractf128_si256
-  // CHECK: shufflevector <4 x i64> %{{.*}}, <4 x i64> undef, <2 x i32> 
+  // CHECK: shufflevector <4 x i64> %{{.*}}, <4 x i64> zeroinitializer, <2 x 
i32> 
   return _mm256_extractf128_si256(A, 1);
 }
 
@@ -647,32 +647,32 @@ __m256 test_mm256_or_ps(__m256 A, __m256
 
 __m128d test_mm_permute_pd(__m128d A) {
   // CHECK-LABEL: test_mm_permute_pd
-  // CHECK: shufflevector <2 x double> %{{.*}}, <2 x double> undef, <2 x i32> 

+  // CHECK: shufflevector <2 x double> %{{.*}}, <2 x double> zeroinitializer, 
<2 x i32> 
   return _mm_permute_pd(A, 1);
 }
 
 __m256d test_mm256_permute_pd(__m256d A) {
   // CHECK-LABEL: test_mm256_permute_pd
-  // CHECK: shufflevector <4 x double> %{{.*}}, <4 x double> undef, <4 x i32> 

+  // CHECK: shufflevector <4 x double> %{{.*}}, <4 x double> zeroinitializer, 
<4 x i32> 
   return _mm256_permute_pd(A, 5);
 }
 
 __m128 test_mm_permute_ps(__m128 A) {
   // CHECK-LABEL: test_mm_permute_ps
-  // CHECK: shufflevector <4 x float> %{{.*}}, <4 x float> undef, <4 x i32> 

+  // CHECK: shufflevector <4 x float> %{{.*}}, <4 x float> zeroinitializer, <4 
x i32> 
   return _mm_permute_ps(A, 0x1b);
 }
 
 // Test case for PR12401
 __m128 test2_mm_permute_ps(__m128 a) {
   // CHECK-LABEL: test2_mm_permute_ps
-  // CHECK: shufflevector <4 x float> %{{.*}}, <4 x float> undef, <4 x i32> 

+  // CHECK: shufflevector <4 x float> %{{.*}}, <4 x float> zeroinitializer, <4 
x i32> 
   return _mm_permute_ps(a, 0xe6);
 }
 
 __m256 test_mm256_permute_ps(__m256 A) {
   // CHECK-LABEL: test_mm256_permute_ps
-  // CHECK: shufflevector <8 x float> %{{.*}}, <8 x float> undef, <8 x i32> 

+  // CHECK: shufflevector <8 x float> %{{.*}}, <8 x flo

r294058 - [x86] fix tests with wrong dependency to pass because they broke with r294049

2017-02-03 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Fri Feb  3 16:03:47 2017
New Revision: 294058

URL: http://llvm.org/viewvc/llvm-project?rev=294058&view=rev
Log:
[x86] fix tests with wrong dependency to pass because they broke with r294049

Modified:
cfe/trunk/test/CodeGen/avx512-reduceMinMaxIntrin.c

Modified: cfe/trunk/test/CodeGen/avx512-reduceMinMaxIntrin.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx512-reduceMinMaxIntrin.c?rev=294058&r1=294057&r2=294058&view=diff
==
--- cfe/trunk/test/CodeGen/avx512-reduceMinMaxIntrin.c (original)
+++ cfe/trunk/test/CodeGen/avx512-reduceMinMaxIntrin.c Fri Feb  3 16:03:47 2017
@@ -1,3 +1,5 @@
+// FIXME: We should not be testing with -O2 (ie, a dependency on the entire IR 
optimizer).
+
 // RUN: %clang_cc1 -ffreestanding %s -O2 -triple=x86_64-apple-darwin 
-target-cpu skylake-avx512 -emit-llvm -o - -Wall -Werror |opt -instnamer -S 
|FileCheck %s
 
 #include 
@@ -202,7 +204,7 @@ double test_mm512_mask_reduce_min_pd(__m
 int test_mm512_reduce_max_epi32(__m512i __W){
   // CHECK: %tmp = bitcast <8 x i64> %__W to <16 x i32>
   // CHECK: %shuffle1.i = shufflevector <16 x i32> %tmp, <16 x i32> undef, <16 
x i32> 
-  // CHECK: %tmp1 = icmp sgt <16 x i32> %tmp, %shuffle1.i
+  // CHECK: %tmp1 = icmp slt <16 x i32> %shuffle1.i, %tmp
   // CHECK: %tmp2 = select <16 x i1> %tmp1, <16 x i32> %tmp, <16 x i32> 
%shuffle1.i
   // CHECK: %shuffle3.i = shufflevector <16 x i32> %tmp2, <16 x i32> undef, 
<16 x i32> 
   // CHECK: %tmp3 = icmp sgt <16 x i32> %tmp2, %shuffle3.i
@@ -223,7 +225,7 @@ int test_mm512_reduce_max_epi32(__m512i
 unsigned int test_mm512_reduce_max_epu32(__m512i __W){
   // CHECK: %tmp = bitcast <8 x i64> %__W to <16 x i32>
   // CHECK: %shuffle1.i = shufflevector <16 x i32> %tmp, <16 x i32> undef, <16 
x i32> 
-  // CHECK: %tmp1 = icmp ugt <16 x i32> %tmp, %shuffle1.i
+  // CHECK: %tmp1 = icmp ult <16 x i32> %shuffle1.i, %tmp
   // CHECK: %tmp2 = select <16 x i1> %tmp1, <16 x i32> %tmp, <16 x i32> 
%shuffle1.i
   // CHECK: %shuffle3.i = shufflevector <16 x i32> %tmp2, <16 x i32> undef, 
<16 x i32> 
   // CHECK: %tmp3 = icmp ugt <16 x i32> %tmp2, %shuffle3.i
@@ -258,7 +260,7 @@ float test_mm512_reduce_max_ps(__m512 __
 int test_mm512_reduce_min_epi32(__m512i __W){
   // CHECK: %tmp = bitcast <8 x i64> %__W to <16 x i32>
   // CHECK: %shuffle1.i = shufflevector <16 x i32> %tmp, <16 x i32> undef, <16 
x i32> 
-  // CHECK: %tmp1 = icmp slt <16 x i32> %tmp, %shuffle1.i
+  // CHECK: %tmp1 = icmp sgt <16 x i32> %shuffle1.i, %tmp
   // CHECK: %tmp2 = select <16 x i1> %tmp1, <16 x i32> %tmp, <16 x i32> 
%shuffle1.i
   // CHECK: %shuffle3.i = shufflevector <16 x i32> %tmp2, <16 x i32> undef, 
<16 x i32> 
   // CHECK: %tmp3 = icmp slt <16 x i32> %tmp2, %shuffle3.i
@@ -279,7 +281,7 @@ int test_mm512_reduce_min_epi32(__m512i
 unsigned int test_mm512_reduce_min_epu32(__m512i __W){
   // CHECK: %tmp = bitcast <8 x i64> %__W to <16 x i32>
   // CHECK: %shuffle1.i = shufflevector <16 x i32> %tmp, <16 x i32> undef, <16 
x i32> 
-  // CHECK: %tmp1 = icmp ult <16 x i32> %tmp, %shuffle1.i
+  // CHECK: %tmp1 = icmp ugt <16 x i32> %shuffle1.i, %tmp
   // CHECK: %tmp2 = select <16 x i1> %tmp1, <16 x i32> %tmp, <16 x i32> 
%shuffle1.i
   // CHECK: %shuffle3.i = shufflevector <16 x i32> %tmp2, <16 x i32> undef, 
<16 x i32> 
   // CHECK: %tmp3 = icmp ult <16 x i32> %tmp2, %shuffle3.i
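The updated checks only commute the compare: `icmp sgt %tmp, %shuffle1.i` and
`icmp slt %shuffle1.i, %tmp` select the same elements, so either form is a
valid canonicalization. The reduction itself is a halve-and-select ladder. A C
sketch of the 16-lane max pattern, asserting the two compare forms agree
(illustrative, not the intrinsic's implementation):

```c
#include <assert.h>
#include <string.h>

/* One compare/select round, written both as the old "sgt a,b" form and the
 * commuted "slt b,a" form to show they pick the same elements. */
static void max_round(const int *a, const int *b,
                      int *out_sgt, int *out_slt, int n) {
    for (int i = 0; i < n; ++i) {
        out_sgt[i] = (a[i] > b[i]) ? a[i] : b[i]; /* icmp sgt %a, %b */
        out_slt[i] = (b[i] < a[i]) ? a[i] : b[i]; /* icmp slt %b, %a */
    }
}

/* Full reduction: rotate halves, select the max, halve the width, repeat. */
static int reduce_max16(const int *v) {
    int cur[16], shuf[16], tmp[16], tmp2[16];
    memcpy(cur, v, sizeof cur);
    for (int width = 16; width > 1; width /= 2) {
        for (int i = 0; i < width; ++i)
            shuf[i] = cur[(i + width / 2) % width]; /* swap halves */
        max_round(cur, shuf, tmp, tmp2, width);
        for (int i = 0; i < width; ++i) {
            assert(tmp[i] == tmp2[i]); /* commuted compare is equivalent */
            cur[i] = tmp[i];
        }
    }
    return cur[0];
}
```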




Re: [PATCH] D24397: Target Power9 bit counting and vector comparison instructions through builtins (front end portion)

2016-10-05 Thread Sanjay Patel via cfe-commits
You should not need to account for any nsw/nuw flags if the clang test does
not enable the optimizer.
I.e., D24955 should not affect tests running at -O0.

On Wed, Oct 5, 2016 at 1:09 PM, Nemanja Ivanovic wrote:

> OK, I get testing that I'm fine with if I remove the -O2 and the checks
> for 'select i1'.
>
> Does that change suffice for the purposes of
> https://reviews.llvm.org/D24955?
>
> Namely, do I need to account for the possible addition of nsw/nuw flags to
> the add instructions even without -O2?
>
> On Wed, Oct 5, 2016 at 8:24 PM, Sanjay Patel wrote:
>
>> spatel added a comment.
>>
>> In https://reviews.llvm.org/D24397#562469, @bjope wrote:
>>
>> > (I'm still hesitating about committing https://reviews.llvm.org/D24955
>> in llvm since that would make these clang tests fail...)
>>
>>
>> You can't do that. Bots will send you fail mail all day as they choke on
>> the clang tests - speaking from experience. :)
>> We either need to fix or revert this commit in order to let
>> https://reviews.llvm.org/D24955 proceed.
>>
>>
>> Repository:
>>   rL LLVM
>>
>> https://reviews.llvm.org/D24397
>>
>>
>>
>>
>


[PATCH] D24397: Target Power9 bit counting and vector comparison instructions through builtins (front end portion)

2016-10-05 Thread Sanjay Patel via cfe-commits
spatel added a comment.

In https://reviews.llvm.org/D24397#562469, @bjope wrote:

> (I'm still hesitating about committing https://reviews.llvm.org/D24955 in llvm 
> since that would make these clang tests fail...)


You can't do that. Bots will send you fail mail all day as they choke on the 
clang tests - speaking from experience. :)
We either need to fix or revert this commit in order to let 
https://reviews.llvm.org/D24955 proceed.


Repository:
  rL LLVM

https://reviews.llvm.org/D24397





[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-10-04 Thread Sanjay Patel via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283251: [clang] make reciprocal estimate codegen a function 
attribute (authored by spatel).

Changed prior to commit:
  https://reviews.llvm.org/D24815?vs=73364&id=73548#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D24815

Files:
  cfe/trunk/lib/CodeGen/BackendUtil.cpp
  cfe/trunk/lib/CodeGen/CGCall.cpp
  cfe/trunk/test/CodeGen/attr-mrecip.c


Index: cfe/trunk/test/CodeGen/attr-mrecip.c
===
--- cfe/trunk/test/CodeGen/attr-mrecip.c
+++ cfe/trunk/test/CodeGen/attr-mrecip.c
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -mrecip=!sqrtf,vec-divf:3 -emit-llvm %s -o - | FileCheck %s
+
+int baz(int a) { return 4; }
+
+// CHECK: baz{{.*}} #0
+// CHECK: #0 = {{.*}}"reciprocal-estimates"="!sqrtf,vec-divf:3"
+
Index: cfe/trunk/lib/CodeGen/CGCall.cpp
===
--- cfe/trunk/lib/CodeGen/CGCall.cpp
+++ cfe/trunk/lib/CodeGen/CGCall.cpp
@@ -1730,6 +1730,9 @@
 
 FuncAttrs.addAttribute("no-trapping-math",
llvm::toStringRef(CodeGenOpts.NoTrappingMath));
+
+// TODO: Are these all needed?
+// unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
 FuncAttrs.addAttribute("no-infs-fp-math",
llvm::toStringRef(CodeGenOpts.NoInfsFPMath));
 FuncAttrs.addAttribute("no-nans-fp-math",
@@ -1746,6 +1749,12 @@
 "correctly-rounded-divide-sqrt-fp-math",
 llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt));
 
+// TODO: Reciprocal estimate codegen options should apply to instructions?
+std::vector &Recips = getTarget().getTargetOpts().Reciprocals;
+if (!Recips.empty())
+  FuncAttrs.addAttribute("reciprocal-estimates",
+ llvm::join(Recips.begin(), Recips.end(), ","));
+
 if (CodeGenOpts.StackRealignment)
   FuncAttrs.addAttribute("stackrealign");
 if (CodeGenOpts.Backchain)
Index: cfe/trunk/lib/CodeGen/BackendUtil.cpp
===
--- cfe/trunk/lib/CodeGen/BackendUtil.cpp
+++ cfe/trunk/lib/CodeGen/BackendUtil.cpp
@@ -533,9 +533,6 @@
 
   llvm::TargetOptions Options;
 
-  if (!TargetOpts.Reciprocals.empty())
-Options.Reciprocals = TargetRecip(TargetOpts.Reciprocals);
-
   Options.ThreadModel =
 llvm::StringSwitch(CodeGenOpts.ThreadModel)
   .Case("posix", llvm::ThreadModel::POSIX)




r283251 - [clang] make reciprocal estimate codegen a function attribute

2016-10-04 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Tue Oct  4 15:44:05 2016
New Revision: 283251

URL: http://llvm.org/viewvc/llvm-project?rev=283251&view=rev
Log:
[clang] make reciprocal estimate codegen a function attribute

The motivation for the change is that we can't have pseudo-global settings
for codegen living in TargetOptions because that doesn't work with LTO.

Ideally, these reciprocal attributes will be moved to the instruction-level
via FMF, metadata, or something else. But making them function attributes is
at least an improvement over the current state.

I'm committing this patch ahead of the related LLVM patch to avoid bot failures,
but if that patch needs to be reverted, then this should be reverted too.

Differential Revision: https://reviews.llvm.org/D24815

Added:
cfe/trunk/test/CodeGen/attr-mrecip.c
Modified:
cfe/trunk/lib/CodeGen/BackendUtil.cpp
cfe/trunk/lib/CodeGen/CGCall.cpp

Modified: cfe/trunk/lib/CodeGen/BackendUtil.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/BackendUtil.cpp?rev=283251&r1=283250&r2=283251&view=diff
==
--- cfe/trunk/lib/CodeGen/BackendUtil.cpp (original)
+++ cfe/trunk/lib/CodeGen/BackendUtil.cpp Tue Oct  4 15:44:05 2016
@@ -533,9 +533,6 @@ void EmitAssemblyHelper::CreateTargetMac
 
   llvm::TargetOptions Options;
 
-  if (!TargetOpts.Reciprocals.empty())
-Options.Reciprocals = TargetRecip(TargetOpts.Reciprocals);
-
   Options.ThreadModel =
 llvm::StringSwitch(CodeGenOpts.ThreadModel)
   .Case("posix", llvm::ThreadModel::POSIX)

Modified: cfe/trunk/lib/CodeGen/CGCall.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCall.cpp?rev=283251&r1=283250&r2=283251&view=diff
==
--- cfe/trunk/lib/CodeGen/CGCall.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGCall.cpp Tue Oct  4 15:44:05 2016
@@ -1730,6 +1730,9 @@ void CodeGenModule::ConstructAttributeLi
 
 FuncAttrs.addAttribute("no-trapping-math",
llvm::toStringRef(CodeGenOpts.NoTrappingMath));
+
+// TODO: Are these all needed?
+// unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
 FuncAttrs.addAttribute("no-infs-fp-math",
llvm::toStringRef(CodeGenOpts.NoInfsFPMath));
 FuncAttrs.addAttribute("no-nans-fp-math",
@@ -1746,6 +1749,12 @@ void CodeGenModule::ConstructAttributeLi
 "correctly-rounded-divide-sqrt-fp-math",
 llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt));
 
+// TODO: Reciprocal estimate codegen options should apply to instructions?
+std::vector &Recips = getTarget().getTargetOpts().Reciprocals;
+if (!Recips.empty())
+  FuncAttrs.addAttribute("reciprocal-estimates",
+ llvm::join(Recips.begin(), Recips.end(), ","));
+
 if (CodeGenOpts.StackRealignment)
   FuncAttrs.addAttribute("stackrealign");
 if (CodeGenOpts.Backchain)

Added: cfe/trunk/test/CodeGen/attr-mrecip.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/attr-mrecip.c?rev=283251&view=auto
==
--- cfe/trunk/test/CodeGen/attr-mrecip.c (added)
+++ cfe/trunk/test/CodeGen/attr-mrecip.c Tue Oct  4 15:44:05 2016
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -mrecip=!sqrtf,vec-divf:3 -emit-llvm %s -o - | FileCheck %s
+
+int baz(int a) { return 4; }
+
+// CHECK: baz{{.*}} #0
+// CHECK: #0 = {{.*}}"reciprocal-estimates"="!sqrtf,vec-divf:3"
+
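The attribute carries the -mrecip string through to the backend unchanged, as
the test above shows ("reciprocal-estimates"="!sqrtf,vec-divf:3"). A rough C
sketch of decoding one entry such as `!sqrtf` or `vec-divf:3` (the struct and
parser here are illustrative; LLVM's actual parsing lives in the backend):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parsed form of one "reciprocal-estimates" entry:
 * "!sqrtf"     -> estimate disabled for sqrtf
 * "vec-divf:3" -> estimate enabled with 3 refinement steps */
struct recip_entry {
    int enabled;    /* 0 when the entry starts with '!' */
    char name[32];  /* e.g. "sqrtf", "vec-divf" */
    int steps;      /* refinement steps after ':', -1 if unspecified */
};

/* Returns 1 on success, 0 on a malformed entry. */
static int parse_recip_entry(const char *s, struct recip_entry *e) {
    e->enabled = (*s != '!');
    if (*s == '!')
        ++s;
    const char *colon = strchr(s, ':');
    size_t len = colon ? (size_t)(colon - s) : strlen(s);
    if (len == 0 || len >= sizeof e->name)
        return 0;
    memcpy(e->name, s, len);
    e->name[len] = '\0';
    e->steps = -1;
    if (colon && sscanf(colon + 1, "%d", &e->steps) != 1)
        return 0;
    return 1;
}
```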




[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-10-04 Thread Sanjay Patel via cfe-commits
spatel added inline comments.


> mehdi_amini wrote in CGCall.cpp:1735
> I think I remember folks being against FMF on calls (Chris Lattner?), I'll 
> try to find the relevant discussion.
> Otherwise your plan seems fine to me!

Yes - Chris was opposed to FMF on intrinsics (preferring parameters/metadata as 
the delivery mechanism instead) in 2012:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20121217/159446.html

...which you mentioned and I replied to in the post-commit thread when I added 
FMF to any FPMathOperator calls earlier this year:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160104/323154.html

There were no replies to that thread on llvm-dev since my January post. I will 
re-post the question to llvm-dev before proceeding.

https://reviews.llvm.org/D24815





[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-10-04 Thread Sanjay Patel via cfe-commits
spatel added inline comments.


> mehdi_amini wrote in CGCall.cpp:1735
> I wonder if we couldn’t have this part of the bitcode/IR auto-upgrade: when 
> we load a function with this attribute, we automatically add the individual 
> flag on every instruction.

Auto-upgrading is part of the solution. Based on how we've been doing this with 
vector intrinsics that get converted to IR, it's a ~3-step process:

1. Prepare the backend (DAG) to handle the expected new IR patterns and add 
tests for those.
2. Auto-upgrade the IR, remove deprecated handling of the old IR patterns, and 
change/remove existing tests.
3. Update clang to not produce the deprecated patterns.

The extra step for FMF in the DAG is that we still don't allow FMF on all 
SDNode subclasses. The DAG plumbing for FMF only applies to binops because 
that's all that FMF on IR worked on at the time (fmul/fadd/fsub/fdiv/frem). 
Later, I added FMF to IR calls so we could have that functionality on sqrt, fma 
and other calls. Assuming that is ok (and I realize that it may be 
controversial), we can now extend FMF in the DAG to all SDNodes and have a full 
node-level FMF solution for the DAG layer.
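The attribute-to-flag auto-upgrade suggested earlier in the thread (load a
function carrying the deprecated attribute, stamp the equivalent flag on each
FP instruction, drop the attribute) can be modeled with toy types; the structs
below are stand-ins, not LLVM's AutoUpgrade API:

```c
#include <assert.h>

enum { FMF_FAST = 1 }; /* toy per-instruction fast-math flag */

struct toy_inst { int is_fp_op; int flags; };
struct toy_func { int has_unsafe_fp_attr; struct toy_inst *insts; int n; };

/* Upgrade sketch: when the deprecated function-level attribute is present,
 * propagate it to every FP instruction as a per-instruction flag, then
 * clear the function-level setting. */
static void upgrade_fast_math(struct toy_func *f) {
    if (!f->has_unsafe_fp_attr)
        return;
    for (int i = 0; i < f->n; ++i)
        if (f->insts[i].is_fp_op)
            f->insts[i].flags |= FMF_FAST;
    f->has_unsafe_fp_attr = 0;
}
```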

https://reviews.llvm.org/D24815





[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-10-03 Thread Sanjay Patel via cfe-commits
spatel added inline comments.


> mehdi_amini wrote in CGCall.cpp:1735
> I agree with getting on a path to remove these function attributes that have 
> an equivalent on per-instruction flag.
> 
> I wonder what is the status of these flags in SelectionDAG though? We still 
> have a variant of the flags on the TargetOptions I believe. Are all the uses 
> migrated to per-node flags?

Good point - I think we have to convert all codegen tests that have these 
function-level attributes to IR FMF and make sure that the output doesn't 
change. Definitely not part of this patch, but hopefully something that can be 
done incrementally, test-by-test.

https://reviews.llvm.org/D24815





[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-10-03 Thread Sanjay Patel via cfe-commits
spatel updated this revision to Diff 73364.
spatel added a comment.

Patch updated as suggested by Eric:

1. The attribute is named "reciprocal-estimates".
2. Remove unnecessary -disable-llvm-optzns flag from test file.

Quick fixes, but this will not go in until the LLVM side 
(https://reviews.llvm.org/D24816) is updated, and we've answered any remaining 
questions there.


https://reviews.llvm.org/D24815

Files:
  lib/CodeGen/BackendUtil.cpp
  lib/CodeGen/CGCall.cpp
  test/CodeGen/attr-mrecip.c


Index: test/CodeGen/attr-mrecip.c
===
--- test/CodeGen/attr-mrecip.c
+++ test/CodeGen/attr-mrecip.c
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -mrecip=!sqrtf,vec-divf:3 -emit-llvm %s -o - | FileCheck %s
+
+int baz(int a) { return 4; }
+
+// CHECK: baz{{.*}} #0
+// CHECK: #0 = {{.*}}"reciprocal-estimates"="!sqrtf,vec-divf:3"
+
Index: lib/CodeGen/CGCall.cpp
===
--- lib/CodeGen/CGCall.cpp
+++ lib/CodeGen/CGCall.cpp
@@ -1730,6 +1730,9 @@
 
 FuncAttrs.addAttribute("no-trapping-math",
llvm::toStringRef(CodeGenOpts.NoTrappingMath));
+
+// TODO: Are these all needed?
+// unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
 FuncAttrs.addAttribute("no-infs-fp-math",
llvm::toStringRef(CodeGenOpts.NoInfsFPMath));
 FuncAttrs.addAttribute("no-nans-fp-math",
@@ -1746,6 +1749,12 @@
 "correctly-rounded-divide-sqrt-fp-math",
 llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt));
 
+// TODO: Reciprocal estimate codegen options should apply to instructions?
+std::vector<std::string> &Recips = getTarget().getTargetOpts().Reciprocals;
+if (!Recips.empty())
+  FuncAttrs.addAttribute("reciprocal-estimates",
+ llvm::join(Recips.begin(), Recips.end(), ","));
+
 if (CodeGenOpts.StackRealignment)
   FuncAttrs.addAttribute("stackrealign");
 if (CodeGenOpts.Backchain)
Index: lib/CodeGen/BackendUtil.cpp
===
--- lib/CodeGen/BackendUtil.cpp
+++ lib/CodeGen/BackendUtil.cpp
@@ -529,9 +529,6 @@
 
   llvm::TargetOptions Options;
 
-  if (!TargetOpts.Reciprocals.empty())
-Options.Reciprocals = TargetRecip(TargetOpts.Reciprocals);
-
   Options.ThreadModel =
 llvm::StringSwitch<llvm::ThreadModel::Model>(CodeGenOpts.ThreadModel)
   .Case("posix", llvm::ThreadModel::POSIX)




[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-10-03 Thread Sanjay Patel via cfe-commits
spatel added inline comments.


> echristo wrote in CGCall.cpp:1735
> Would be nice to get these pulled into a single fast-math string that's set 
> and then used all over for sure. :)

I'm probably overlooking some use case, but I was hoping that we can just 
delete the 4 attributes (fast/inf/nan/nsz) that are already covered by 
instruction-level FMF. An auto-upgrade might be needed within LLVM, and/or a 
big pile of regression test changes?

https://reviews.llvm.org/D24815





[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-10-03 Thread Sanjay Patel via cfe-commits
spatel marked 2 inline comments as done.
spatel added a comment.

Thanks, Eric. I actually drafted this with the name "recip-estimates", but 
thought there might be value in reusing the programmer-visible flag name. I'm 
good with "reciprocal-estimates" too.


https://reviews.llvm.org/D24815





[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-09-30 Thread Sanjay Patel via cfe-commits
spatel added a comment.

Ping.


https://reviews.llvm.org/D24815





Re: [PATCH] D24397: Target Power9 bit counting and vector comparison instructions through builtins (front end portion)

2016-09-28 Thread Sanjay Patel via cfe-commits
spatel added a comment.

Should also mention:
https://reviews.llvm.org/D17999
has scripts attached that could make this kind of test generation a lot easier. 
:)


Repository:
  rL LLVM

https://reviews.llvm.org/D24397





Re: [PATCH] D24397: Target Power9 bit counting and vector comparison instructions through builtins (front end portion)

2016-09-28 Thread Sanjay Patel via cfe-commits
spatel added a comment.

In https://reviews.llvm.org/D24397#52, @nemanjai wrote:

> In https://reviews.llvm.org/D24397#555470, @spatel wrote:
>
> > Having a clang regression/unit test that depends on optimizer behavior is 
> > generally viewed as wrong. Can the tests be split into front-end (clang) 
> > tests and separate tests for the IR optimizer? Both x86 and AArch64 have 
> > done something like that in the last few months for testing of 
> > builtins/intrinsics.
>
>
> Yeah, that sounds reasonable. I'll remove the -O2 from the test case and 
> remove the checks for the select instructions. That's really the only major 
> difference. So am I to understand the nsw/nuw flags will not be added without 
> -O2 and the aforementioned changes will suffice?


Changing to -O0 or using -disable-llvm-optzns should keep the clang tests from 
breaking due to underlying changes in the IR optimizer. That may lead to a lot 
of bloat though. In 
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20160307/152324.html , 
it was viewed as ok, if not ideal, to pipe the clang IR output through "opt -S 
-mem2reg".

Note that clang itself uses APIs like IRBuilder::CreateNUWSub(), so I think 
it's possible to see no-wrap IR even without the IR optimizer kicking in (but 
probably isn't a concern in this case?).
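As a hypothetical side note (not part of the review): the nuw/nsw flags assert that the result does not wrap. Modeled with explicit 32-bit arithmetic in Python, the wrap that a plain `sub` permits looks like this:

```python
def sub_u32(a, b):
    """32-bit unsigned subtraction with wraparound, the behavior a plain
    'sub' allows; a 'sub nuw' asserts the wrap below can never happen."""
    return (a - b) % 2**32

print(sub_u32(5, 3))  # 2: no wrap, so 'nuw' would be a valid flag here
print(sub_u32(3, 5))  # 4294967294: wraps, so 'sub nuw' would yield poison
```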


Repository:
  rL LLVM

https://reviews.llvm.org/D24397





Re: [PATCH] D24397: Target Power9 bit counting and vector comparison instructions through builtins (front end portion)

2016-09-28 Thread Sanjay Patel via cfe-commits
spatel added a subscriber: spatel.
spatel added a comment.

Having a clang regression/unit test that depends on optimizer behavior is 
generally viewed as wrong. Can the tests be split into front-end (clang) tests 
and separate tests for the IR optimizer? Both x86 and AArch64 have done 
something like that in the last few months for testing of builtins/intrinsics.


Repository:
  rL LLVM

https://reviews.llvm.org/D24397





[PATCH] D24815: [clang] make reciprocal estimate codegen a function attribute

2016-09-21 Thread Sanjay Patel via cfe-commits
spatel created this revision.
spatel added reviewers: echristo, evandro, hfinkel.
spatel added a subscriber: cfe-commits.
Herald added subscribers: mehdi_amini, mcrosier.

Technically, I suppose this patch is independent of the upcoming llvm sibling 
patch because we can still pass 'check-all' with this alone. But this patch 
should be tightly coupled with that patch when committed because there's 
nothing currently in llvm to read this new function attribute string.

The motivation for the change is that we can't have pseudo-global settings for 
codegen living in TargetOptions because that doesn't work with LTO. And yet, 
there are so many others there...

Ideally, these reciprocal attributes will be moved to the instruction-level via 
FMF, metadata, or something else. But making them function attributes is at 
least an improvement over the current mess.
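For context on what the attribute controls (a sketch of the general technique, not of LLVM's backend code): reciprocal-estimate codegen replaces a full-precision divide or square root with a cheap hardware estimate plus Newton-Raphson refinement steps, and the attribute string says which operations get that treatment and how many refinement steps to use. The refinement itself is just:

```python
def refine_recip(d, x0, steps=1):
    """Newton-Raphson for 1/d: x_{n+1} = x_n * (2 - d * x_n).
    Each step roughly doubles the number of correct bits in x0."""
    x = x0
    for _ in range(steps):
        x = x * (2.0 - d * x)
    return x

# A deliberately crude estimate of 1/3 converges quickly:
print(refine_recip(3.0, 0.3, steps=3))  # ~0.33333333
```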

https://reviews.llvm.org/D24815

Files:
  lib/CodeGen/BackendUtil.cpp
  lib/CodeGen/CGCall.cpp
  test/CodeGen/attr-mrecip.c

Index: test/CodeGen/attr-mrecip.c
===
--- test/CodeGen/attr-mrecip.c
+++ test/CodeGen/attr-mrecip.c
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -mrecip=!sqrtf,vec-divf:3 -disable-llvm-optzns -emit-llvm %s -o - | FileCheck %s
+
+int baz(int a) { return 4; }
+
+// CHECK: baz{{.*}} #0
+// CHECK: #0 = {{.*}}"mrecip"="!sqrtf,vec-divf:3"
+
Index: lib/CodeGen/CGCall.cpp
===
--- lib/CodeGen/CGCall.cpp
+++ lib/CodeGen/CGCall.cpp
@@ -1730,6 +1730,9 @@
 
 FuncAttrs.addAttribute("no-trapping-math",
llvm::toStringRef(CodeGenOpts.NoTrappingMath));
+
+// TODO: Are these all needed?
+// unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
 FuncAttrs.addAttribute("no-infs-fp-math",
llvm::toStringRef(CodeGenOpts.NoInfsFPMath));
 FuncAttrs.addAttribute("no-nans-fp-math",
@@ -1746,6 +1749,12 @@
 "correctly-rounded-divide-sqrt-fp-math",
 llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt));
 
+// TODO: Reciprocal estimate codegen options should apply to instructions?
+std::vector<std::string> &Recips = getTarget().getTargetOpts().Reciprocals;
+if (!Recips.empty())
+  FuncAttrs.addAttribute("mrecip",
+ llvm::join(Recips.begin(), Recips.end(), ","));
+
 if (CodeGenOpts.StackRealignment)
   FuncAttrs.addAttribute("stackrealign");
 if (CodeGenOpts.Backchain)
Index: lib/CodeGen/BackendUtil.cpp
===
--- lib/CodeGen/BackendUtil.cpp
+++ lib/CodeGen/BackendUtil.cpp
@@ -529,9 +529,6 @@
 
   llvm::TargetOptions Options;
 
-  if (!TargetOpts.Reciprocals.empty())
-Options.Reciprocals = TargetRecip(TargetOpts.Reciprocals);
-
   Options.ThreadModel =
 llvm::StringSwitch<llvm::ThreadModel::Model>(CodeGenOpts.ThreadModel)
   .Case("posix", llvm::ThreadModel::POSIX)



Re: [PATCH] D19544: Pass for translating math intrinsics to math library calls.

2016-07-19 Thread Sanjay Patel via cfe-commits
spatel added a subscriber: davide.
spatel added a comment.

In https://reviews.llvm.org/D19544#488589, @mmasten wrote:

> In the process of writing test cases, I noticed that a loop with a call to 
> llvm.log.f32 was not getting vectorized due to cost modeling. When forcing 
> vectorization on the loop and throwing -fveclib=SVML, the loop was vectorized 
> with a widened intrinsic instead of the svml call. Is this correct? I would 
> have expected to get the svml call. In light of this, wouldn't it be better 
> to represent the math calls with vector intrinsics and let CodeGenPrepare or 
> the backends decide how to lower them?


I don't know the answer, but I'm curious about this too for an unrelated change 
in LibCallSimplifier (cc @davide).

The LangRef has this boilerplate for all target-independent math intrinsics:
"Not all targets support all types however."

Is that only intended for the weird types (x86_fp80, ppc_fp128, fp128?), or 
does it mean that we shouldn't create these intrinsics for vectors with 
standard FP types (eg, v4f32)?


https://reviews.llvm.org/D19544





Re: [PATCH] D19544: Pass for translating math intrinsics to math library calls.

2016-07-14 Thread Sanjay Patel via cfe-commits
spatel added a comment.

Hi Matt -

This looks like the right first step in the path that Hal suggested, except I 
think we need a test case for each function that you want to enable. Please see 
test/Transforms/LoopVectorize/X86/veclib-calls.ll as a reference for how to do 
that.


https://reviews.llvm.org/D19544





r274278 - fix typo; NFC

2016-06-30 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Thu Jun 30 16:02:40 2016
New Revision: 274278

URL: http://llvm.org/viewvc/llvm-project?rev=274278&view=rev
Log:
fix typo; NFC

Modified:
cfe/trunk/lib/CodeGen/CodeGenModule.cpp

Modified: cfe/trunk/lib/CodeGen/CodeGenModule.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.cpp?rev=274278&r1=274277&r2=274278&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenModule.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenModule.cpp Thu Jun 30 16:02:40 2016
@@ -2610,7 +2610,7 @@ static bool isVarDeclStrongDefinition(co
   if (shouldBeInCOMDAT(CGM, *D))
 return true;
 
-  // Declarations with a required alignment do not have common linakge in MSVC
+  // Declarations with a required alignment do not have common linkage in MSVC
   // mode.
   if (Context.getTargetInfo().getCXXABI().isMicrosoft()) {
 if (D->hasAttr<AlignedAttr>())




Re: [PATCH] D21306: [x86] AVX FP compare builtins should require AVX target feature (PR28112)

2016-06-21 Thread Sanjay Patel via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL273311: [x86] AVX FP compare builtins should require AVX 
target feature (PR28112) (authored by spatel).

Changed prior to commit:
  http://reviews.llvm.org/D21306?vs=60596&id=61437#toc

Repository:
  rL LLVM

http://reviews.llvm.org/D21306

Files:
  cfe/trunk/include/clang/Basic/BuiltinsX86.def
  cfe/trunk/test/CodeGen/target-features-error-2.c

Index: cfe/trunk/include/clang/Basic/BuiltinsX86.def
===
--- cfe/trunk/include/clang/Basic/BuiltinsX86.def
+++ cfe/trunk/include/clang/Basic/BuiltinsX86.def
@@ -219,16 +219,14 @@
 TARGET_BUILTIN(__builtin_ia32_ucomisdge, "iV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_ucomisdneq, "iV2dV2d", "", "sse2")
 
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpeqps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpltps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpleps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpunordps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpneqps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpnltps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpnleps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpordps, "V4fV4fV4f", "", "sse")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpeqss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpltss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpless, "V4fV4fV4f", "", "sse")
@@ -242,16 +240,14 @@
 TARGET_BUILTIN(__builtin_ia32_minss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_maxss, "V4fV4fV4f", "", "sse")
 
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpeqpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpltpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmplepd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpunordpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpneqpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnltpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnlepd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpordpd, "V2dV2dV2d", "", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpeqsd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpltsd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmplesd, "V2dV2dV2d", "", "sse2")
@@ -453,8 +449,12 @@
 TARGET_BUILTIN(__builtin_ia32_blendvpd256, "V4dV4dV4dV4d", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_blendvps256, "V8fV8fV8fV8f", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cvtdq2ps256, "V8fV8i", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cvtpd2ps256, "V4fV4d", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cvtps2dq256, "V8iV8f", "", "avx")
Index: cfe/trunk/test/CodeGen/target-features-error-2.c
===
--- cfe/trunk/test/CodeGen/target-features-error-2.c
+++ cfe/trunk/test/CodeGen/target-features-error-2.c
@@ -1,7 +1,38 @@
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o -
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_SSE42
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_1
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+
 #define __MM_MALLOC_H
#include <immintrin.h>
 
+#if NEED_SSE42
 int baz(__m256i a) {
   return _mm256_extract_epi32(a, 3); // expected-error {{always_inline 
function '_mm256_extract_epi32' requires target feature 'sse4.2', but would be 
inlined into function 'baz' that is compiled without support for 'sse4.2'}}
 }
+#endif
+
+#if NEED_AVX_1
+__m128 need_avx(__m128 a, __m128 b) {
+  return _mm_cmp_ps(a, b, 0); // expected-error {{'__builtin_ia32_cmpps' needs 
target feature avx}}
+}
+#endif
+
+#if NEED_AVX_2
+__m128 need_avx(__m128 a, __m128 b) {
+  return _mm_cmp_ss(a, b, 0); // expected-error {{'__builtin_ia32_cmpss' needs 
target feature avx}}
+}
+#endif
+
+#if NEED_AVX_3
+__m128d need_avx(__m128d a, __m128d b) {
+  return _mm_cmp_pd(a, b, 0); // expected-error {{'__builtin_ia32_cmppd' needs 
target feature avx}}
+}
+#endif

r273311 - [x86] AVX FP compare builtins should require AVX target feature (PR28112)

2016-06-21 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Tue Jun 21 15:22:55 2016
New Revision: 273311

URL: http://llvm.org/viewvc/llvm-project?rev=273311&view=rev
Log:
[x86] AVX FP compare builtins should require AVX target feature (PR28112)

This is a fix for PR28112:
https://llvm.org/bugs/show_bug.cgi?id=28112

The FP comparison intrinsics that take an immediate parameter (rather than 
specifying
a comparison predicate in the function name) were added with AVX; these are 
macros in
avxintrin.h. This patch makes clang behavior match gcc (error if a program 
tries to use 
these without -mavx) and matches the Intel documentation, eg:
VCMPPS: m128 _mm_cmp_ps(m128 a, __m128 b, const int imm)

'V' means this is intended to only work with the AVX form of the instruction.

Differential Revision: http://reviews.llvm.org/D21306


Modified:
cfe/trunk/include/clang/Basic/BuiltinsX86.def
cfe/trunk/test/CodeGen/target-features-error-2.c

Modified: cfe/trunk/include/clang/Basic/BuiltinsX86.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/BuiltinsX86.def?rev=273311&r1=273310&r2=273311&view=diff
==
--- cfe/trunk/include/clang/Basic/BuiltinsX86.def (original)
+++ cfe/trunk/include/clang/Basic/BuiltinsX86.def Tue Jun 21 15:22:55 2016
@@ -219,7 +219,6 @@ TARGET_BUILTIN(__builtin_ia32_ucomisdgt,
 TARGET_BUILTIN(__builtin_ia32_ucomisdge, "iV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_ucomisdneq, "iV2dV2d", "", "sse2")
 
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpeqps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpltps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpleps, "V4fV4fV4f", "", "sse")
@@ -228,7 +227,6 @@ TARGET_BUILTIN(__builtin_ia32_cmpneqps,
 TARGET_BUILTIN(__builtin_ia32_cmpnltps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpnleps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpordps, "V4fV4fV4f", "", "sse")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpeqss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpltss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpless, "V4fV4fV4f", "", "sse")
@@ -242,7 +240,6 @@ TARGET_BUILTIN(__builtin_ia32_maxps, "V4
 TARGET_BUILTIN(__builtin_ia32_minss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_maxss, "V4fV4fV4f", "", "sse")
 
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpeqpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpltpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmplepd, "V2dV2dV2d", "", "sse2")
@@ -251,7 +248,6 @@ TARGET_BUILTIN(__builtin_ia32_cmpneqpd,
 TARGET_BUILTIN(__builtin_ia32_cmpnltpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnlepd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpordpd, "V2dV2dV2d", "", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpeqsd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpltsd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmplesd, "V2dV2dV2d", "", "sse2")
@@ -453,8 +449,12 @@ TARGET_BUILTIN(__builtin_ia32_vpermilvar
 TARGET_BUILTIN(__builtin_ia32_blendvpd256, "V4dV4dV4dV4d", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_blendvps256, "V8fV8fV8fV8f", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmpps256, "V8fV8fV8fIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cvtdq2ps256, "V8fV8i", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cvtpd2ps256, "V4fV4d", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cvtps2dq256, "V8iV8f", "", "avx")

Modified: cfe/trunk/test/CodeGen/target-features-error-2.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/target-features-error-2.c?rev=273311&r1=273310&r2=273311&view=diff
==
--- cfe/trunk/test/CodeGen/target-features-error-2.c (original)
+++ cfe/trunk/test/CodeGen/target-features-error-2.c Tue Jun 21 15:22:55 2016
@@ -1,7 +1,38 @@
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o -
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_SSE42
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_1
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+
 #define __MM_MALLOC_H

r272933 - [x86] generate IR for AVX2 integer min/max builtins

2016-06-16 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Thu Jun 16 13:45:01 2016
New Revision: 272933

URL: http://llvm.org/viewvc/llvm-project?rev=272933&view=rev
Log:
[x86] generate IR for AVX2 integer min/max builtins
Sibling patch to r272932:
http://reviews.llvm.org/rL272932

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/avx2-builtins.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=272933&r1=272932&r2=272933&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Thu Jun 16 13:45:01 2016
@@ -6826,28 +6826,40 @@ Value *CodeGenFunction::EmitX86BuiltinEx
   case X86::BI__builtin_ia32_pcmpgtq512_mask:
 return EmitX86MaskedCompare(*this, ICmpInst::ICMP_SGT, Ops);
 
-  // TODO: Handle 64/256/512-bit vector widths of min/max.
+  // TODO: Handle 64/512-bit vector widths of min/max.
   case X86::BI__builtin_ia32_pmaxsb128:
   case X86::BI__builtin_ia32_pmaxsw128:
-  case X86::BI__builtin_ia32_pmaxsd128: {
+  case X86::BI__builtin_ia32_pmaxsd128:
+  case X86::BI__builtin_ia32_pmaxsb256:
+  case X86::BI__builtin_ia32_pmaxsw256:
+  case X86::BI__builtin_ia32_pmaxsd256: {
 Value *Cmp = Builder.CreateICmp(ICmpInst::ICMP_SGT, Ops[0], Ops[1]);
 return Builder.CreateSelect(Cmp, Ops[0], Ops[1]);
   }
   case X86::BI__builtin_ia32_pmaxub128:
   case X86::BI__builtin_ia32_pmaxuw128:
-  case X86::BI__builtin_ia32_pmaxud128: {
+  case X86::BI__builtin_ia32_pmaxud128:
+  case X86::BI__builtin_ia32_pmaxub256:
+  case X86::BI__builtin_ia32_pmaxuw256:
+  case X86::BI__builtin_ia32_pmaxud256: {
 Value *Cmp = Builder.CreateICmp(ICmpInst::ICMP_UGT, Ops[0], Ops[1]);
 return Builder.CreateSelect(Cmp, Ops[0], Ops[1]);
   }
   case X86::BI__builtin_ia32_pminsb128:
   case X86::BI__builtin_ia32_pminsw128:
-  case X86::BI__builtin_ia32_pminsd128: {
+  case X86::BI__builtin_ia32_pminsd128:
+  case X86::BI__builtin_ia32_pminsb256:
+  case X86::BI__builtin_ia32_pminsw256:
+  case X86::BI__builtin_ia32_pminsd256: {
 Value *Cmp = Builder.CreateICmp(ICmpInst::ICMP_SLT, Ops[0], Ops[1]);
 return Builder.CreateSelect(Cmp, Ops[0], Ops[1]);
   }
   case X86::BI__builtin_ia32_pminub128:
   case X86::BI__builtin_ia32_pminuw128:
-  case X86::BI__builtin_ia32_pminud128: {
+  case X86::BI__builtin_ia32_pminud128:
+  case X86::BI__builtin_ia32_pminub256:
+  case X86::BI__builtin_ia32_pminuw256:
+  case X86::BI__builtin_ia32_pminud256: {
 Value *Cmp = Builder.CreateICmp(ICmpInst::ICMP_ULT, Ops[0], Ops[1]);
 return Builder.CreateSelect(Cmp, Ops[0], Ops[1]);
   }

Modified: cfe/trunk/test/CodeGen/avx2-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx2-builtins.c?rev=272933&r1=272932&r2=272933&view=diff
==
--- cfe/trunk/test/CodeGen/avx2-builtins.c (original)
+++ cfe/trunk/test/CodeGen/avx2-builtins.c Thu Jun 16 13:45:01 2016
@@ -717,73 +717,85 @@ void test_mm256_maskstore_epi64(long lon
 
 __m256i test_mm256_max_epi8(__m256i a, __m256i b) {
   // CHECK-LABEL: test_mm256_max_epi8
-  // CHECK: call <32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8> %{{.*}}, <32 x i8> 
%{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp sgt <32 x i8> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <32 x i1> [[CMP]], <32 x i8> [[X]], <32 x i8> [[Y]]
   return _mm256_max_epi8(a, b);
 }
 
 __m256i test_mm256_max_epi16(__m256i a, __m256i b) {
   // CHECK-LABEL: test_mm256_max_epi16
-  // CHECK: call <16 x i16> @llvm.x86.avx2.pmaxs.w(<16 x i16> %{{.*}}, <16 x 
i16> %{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp sgt <16 x i16> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <16 x i1> [[CMP]], <16 x i16> [[X]], <16 x i16> [[Y]]
   return _mm256_max_epi16(a, b);
 }
 
 __m256i test_mm256_max_epi32(__m256i a, __m256i b) {
   // CHECK-LABEL: test_mm256_max_epi32
-  // CHECK: call <8 x i32> @llvm.x86.avx2.pmaxs.d(<8 x i32> %{{.*}}, <8 x i32> 
%{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp sgt <8 x i32> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <8 x i1> [[CMP]], <8 x i32> [[X]], <8 x i32> [[Y]]
   return _mm256_max_epi32(a, b);
 }
 
 __m256i test_mm256_max_epu8(__m256i a, __m256i b) {
   // CHECK-LABEL: test_mm256_max_epu8
-  // CHECK: call <32 x i8> @llvm.x86.avx2.pmaxu.b(<32 x i8> %{{.*}}, <32 x i8> 
%{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp ugt <32 x i8> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <32 x i1> [[CMP]], <32 x i8> [[X]], <32 x i8> [[Y]]
   return _mm256_max_epu8(a, b);
 }
 
 __m256i test_mm256_max_epu16(__m256i a, __m256i b) {
   // CHECK-LABEL: test_mm256_max_epu16
-  // CHECK: call <16 x i16> @llvm.x86.avx2.pmaxu.w(<16 x i16> %{{.*}}, <16 x 
i16> %{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp ugt <16 x i16> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <16 x i1> [[CMP]], <16 x i16> [[X]], <16 x i16> [[Y]]
   return _mm256_max_epu16(a, b);
 }
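The compare-plus-select lowering that r272933 emits for these builtins can be sketched elementwise (an illustration in Python, not the clang code):

```python
def pmax_signed(xs, ys):
    """Elementwise signed max, mirroring the emitted IR:
    an 'icmp sgt' produces a per-lane mask, then 'select' picks per lane."""
    mask = [x > y for x, y in zip(xs, ys)]                   # icmp sgt
    return [x if m else y for m, x, y in zip(mask, xs, ys)]  # select

print(pmax_signed([1, -5, 7], [2, -9, 7]))  # [2, -5, 7]
```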

Re: [PATCH] D21268: [x86] translate SSE packed FP comparison builtins to IR

2016-06-15 Thread Sanjay Patel via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL272840: [x86] translate SSE packed FP comparison builtins to 
IR (authored by spatel).

Changed prior to commit:
  http://reviews.llvm.org/D21268?vs=60473&id=60905#toc

Repository:
  rL LLVM

http://reviews.llvm.org/D21268

Files:
  cfe/trunk/lib/CodeGen/CGBuiltin.cpp
  cfe/trunk/test/CodeGen/avx2-builtins.c
  cfe/trunk/test/CodeGen/sse-builtins.c
  cfe/trunk/test/CodeGen/sse2-builtins.c

Index: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
===
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp
@@ -6419,6 +6419,36 @@
 Ops.push_back(llvm::ConstantInt::get(getLLVMContext(), Result));
   }
 
+  // These exist so that the builtin that takes an immediate can be bounds
+  // checked by clang to avoid passing bad immediates to the backend. Since
+  // AVX has a larger immediate than SSE we would need separate builtins to
+  // do the different bounds checking. Rather than create a clang specific
+  // SSE only builtin, this implements eight separate builtins to match gcc
+  // implementation.
+  auto getCmpIntrinsicCall = [this, &Ops](Intrinsic::ID ID, unsigned Imm) {
+Ops.push_back(llvm::ConstantInt::get(Int8Ty, Imm));
+llvm::Function *F = CGM.getIntrinsic(ID);
+return Builder.CreateCall(F, Ops);
+  };
+
+  // For the vector forms of FP comparisons, translate the builtins directly to
+  // IR.
+  // TODO: The builtins could be removed if the SSE header files used vector
+  // extension comparisons directly (vector ordered/unordered may need
+  // additional support via __builtin_isnan()).
+  llvm::VectorType *V2F64 =
+  llvm::VectorType::get(llvm::Type::getDoubleTy(getLLVMContext()), 2);
+  llvm::VectorType *V4F32 =
+  llvm::VectorType::get(llvm::Type::getFloatTy(getLLVMContext()), 4);
+
+  auto getVectorFCmpIR = [this, &Ops](CmpInst::Predicate Pred,
+  llvm::VectorType *FPVecTy) {
+Value *Cmp = Builder.CreateFCmp(Pred, Ops[0], Ops[1]);
+llvm::VectorType *IntVecTy = llvm::VectorType::getInteger(FPVecTy);
+Value *Sext = Builder.CreateSExt(Cmp, IntVecTy);
+return Builder.CreateBitCast(Sext, FPVecTy);
+  };
+
   switch (BuiltinID) {
   default: return nullptr;
   case X86::BI__builtin_cpu_supports: {
@@ -6857,154 +6887,74 @@
   Ops[0]);
 return Builder.CreateExtractValue(Call, 1);
   }
-  // SSE comparison intrisics
+
+  // SSE packed comparison intrinsics
   case X86::BI__builtin_ia32_cmpeqps:
+return getVectorFCmpIR(CmpInst::FCMP_OEQ, V4F32);
   case X86::BI__builtin_ia32_cmpltps:
+return getVectorFCmpIR(CmpInst::FCMP_OLT, V4F32);
   case X86::BI__builtin_ia32_cmpleps:
+return getVectorFCmpIR(CmpInst::FCMP_OLE, V4F32);
   case X86::BI__builtin_ia32_cmpunordps:
+return getVectorFCmpIR(CmpInst::FCMP_UNO, V4F32);
   case X86::BI__builtin_ia32_cmpneqps:
+return getVectorFCmpIR(CmpInst::FCMP_UNE, V4F32);
   case X86::BI__builtin_ia32_cmpnltps:
+return getVectorFCmpIR(CmpInst::FCMP_UGE, V4F32);
   case X86::BI__builtin_ia32_cmpnleps:
+return getVectorFCmpIR(CmpInst::FCMP_UGT, V4F32);
   case X86::BI__builtin_ia32_cmpordps:
-  case X86::BI__builtin_ia32_cmpeqss:
-  case X86::BI__builtin_ia32_cmpltss:
-  case X86::BI__builtin_ia32_cmpless:
-  case X86::BI__builtin_ia32_cmpunordss:
-  case X86::BI__builtin_ia32_cmpneqss:
-  case X86::BI__builtin_ia32_cmpnltss:
-  case X86::BI__builtin_ia32_cmpnless:
-  case X86::BI__builtin_ia32_cmpordss:
+return getVectorFCmpIR(CmpInst::FCMP_ORD, V4F32);
   case X86::BI__builtin_ia32_cmpeqpd:
+return getVectorFCmpIR(CmpInst::FCMP_OEQ, V2F64);
   case X86::BI__builtin_ia32_cmpltpd:
+return getVectorFCmpIR(CmpInst::FCMP_OLT, V2F64);
   case X86::BI__builtin_ia32_cmplepd:
+return getVectorFCmpIR(CmpInst::FCMP_OLE, V2F64);
   case X86::BI__builtin_ia32_cmpunordpd:
+return getVectorFCmpIR(CmpInst::FCMP_UNO, V2F64);
   case X86::BI__builtin_ia32_cmpneqpd:
+return getVectorFCmpIR(CmpInst::FCMP_UNE, V2F64);
   case X86::BI__builtin_ia32_cmpnltpd:
+return getVectorFCmpIR(CmpInst::FCMP_UGE, V2F64);
   case X86::BI__builtin_ia32_cmpnlepd:
+return getVectorFCmpIR(CmpInst::FCMP_UGT, V2F64);
   case X86::BI__builtin_ia32_cmpordpd:
+return getVectorFCmpIR(CmpInst::FCMP_ORD, V2F64);
+
+  // SSE scalar comparison intrinsics
+  case X86::BI__builtin_ia32_cmpeqss:
+return getCmpIntrinsicCall(Intrinsic::x86_sse_cmp_ss, 0);
+  case X86::BI__builtin_ia32_cmpltss:
+return getCmpIntrinsicCall(Intrinsic::x86_sse_cmp_ss, 1);
+  case X86::BI__builtin_ia32_cmpless:
+return getCmpIntrinsicCall(Intrinsic::x86_sse_cmp_ss, 2);
+  case X86::BI__builtin_ia32_cmpunordss:
+return getCmpIntrinsicCall(Intrinsic::x86_sse_cmp_ss, 3);
+  case X86::BI__builtin_ia32_cmpneqss:
+return getCmpIntrinsicCall(Intrinsic::x86_sse_cmp_ss, 4);
+  case X86::BI__builtin_ia

r272840 - [x86] translate SSE packed FP comparison builtins to IR

2016-06-15 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Wed Jun 15 16:20:04 2016
New Revision: 272840

URL: http://llvm.org/viewvc/llvm-project?rev=272840&view=rev
Log:
[x86] translate SSE packed FP comparison builtins to IR

As noted in the code comment, a potential follow-on would be to remove
the builtins themselves. Other than ord/unord, this already works as 
expected. Eg:

  typedef float v4sf __attribute__((__vector_size__(16)));
  v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }

Differential Revision: http://reviews.llvm.org/D21268

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/avx2-builtins.c
cfe/trunk/test/CodeGen/sse-builtins.c
cfe/trunk/test/CodeGen/sse2-builtins.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=272840&r1=272839&r2=272840&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Wed Jun 15 16:20:04 2016
@@ -6419,6 +6419,36 @@ Value *CodeGenFunction::EmitX86BuiltinEx
 Ops.push_back(llvm::ConstantInt::get(getLLVMContext(), Result));
   }
 
+  // These exist so that the builtin that takes an immediate can be bounds
+  // checked by clang to avoid passing bad immediates to the backend. Since
+  // AVX has a larger immediate than SSE we would need separate builtins to
+  // do the different bounds checking. Rather than create a clang specific
+  // SSE only builtin, this implements eight separate builtins to match gcc
+  // implementation.
+  auto getCmpIntrinsicCall = [this, &Ops](Intrinsic::ID ID, unsigned Imm) {
+Ops.push_back(llvm::ConstantInt::get(Int8Ty, Imm));
+llvm::Function *F = CGM.getIntrinsic(ID);
+return Builder.CreateCall(F, Ops);
+  };
+
+  // For the vector forms of FP comparisons, translate the builtins directly to
+  // IR.
+  // TODO: The builtins could be removed if the SSE header files used vector
+  // extension comparisons directly (vector ordered/unordered may need
+  // additional support via __builtin_isnan()).
+  llvm::VectorType *V2F64 =
+  llvm::VectorType::get(llvm::Type::getDoubleTy(getLLVMContext()), 2);
+  llvm::VectorType *V4F32 =
+  llvm::VectorType::get(llvm::Type::getFloatTy(getLLVMContext()), 4);
+
+  auto getVectorFCmpIR = [this, &Ops](CmpInst::Predicate Pred,
+  llvm::VectorType *FPVecTy) {
+Value *Cmp = Builder.CreateFCmp(Pred, Ops[0], Ops[1]);
+llvm::VectorType *IntVecTy = llvm::VectorType::getInteger(FPVecTy);
+Value *Sext = Builder.CreateSExt(Cmp, IntVecTy);
+return Builder.CreateBitCast(Sext, FPVecTy);
+  };
+
   switch (BuiltinID) {
   default: return nullptr;
   case X86::BI__builtin_cpu_supports: {
@@ -6857,154 +6887,74 @@ Value *CodeGenFunction::EmitX86BuiltinEx
   Ops[0]);
 return Builder.CreateExtractValue(Call, 1);
   }
-  // SSE comparison intrisics
+
+  // SSE packed comparison intrinsics
   case X86::BI__builtin_ia32_cmpeqps:
+return getVectorFCmpIR(CmpInst::FCMP_OEQ, V4F32);
   case X86::BI__builtin_ia32_cmpltps:
+return getVectorFCmpIR(CmpInst::FCMP_OLT, V4F32);
   case X86::BI__builtin_ia32_cmpleps:
+return getVectorFCmpIR(CmpInst::FCMP_OLE, V4F32);
   case X86::BI__builtin_ia32_cmpunordps:
+return getVectorFCmpIR(CmpInst::FCMP_UNO, V4F32);
   case X86::BI__builtin_ia32_cmpneqps:
+return getVectorFCmpIR(CmpInst::FCMP_UNE, V4F32);
   case X86::BI__builtin_ia32_cmpnltps:
+return getVectorFCmpIR(CmpInst::FCMP_UGE, V4F32);
   case X86::BI__builtin_ia32_cmpnleps:
+return getVectorFCmpIR(CmpInst::FCMP_UGT, V4F32);
   case X86::BI__builtin_ia32_cmpordps:
-  case X86::BI__builtin_ia32_cmpeqss:
-  case X86::BI__builtin_ia32_cmpltss:
-  case X86::BI__builtin_ia32_cmpless:
-  case X86::BI__builtin_ia32_cmpunordss:
-  case X86::BI__builtin_ia32_cmpneqss:
-  case X86::BI__builtin_ia32_cmpnltss:
-  case X86::BI__builtin_ia32_cmpnless:
-  case X86::BI__builtin_ia32_cmpordss:
+return getVectorFCmpIR(CmpInst::FCMP_ORD, V4F32);
   case X86::BI__builtin_ia32_cmpeqpd:
+return getVectorFCmpIR(CmpInst::FCMP_OEQ, V2F64);
   case X86::BI__builtin_ia32_cmpltpd:
+return getVectorFCmpIR(CmpInst::FCMP_OLT, V2F64);
   case X86::BI__builtin_ia32_cmplepd:
+return getVectorFCmpIR(CmpInst::FCMP_OLE, V2F64);
   case X86::BI__builtin_ia32_cmpunordpd:
+return getVectorFCmpIR(CmpInst::FCMP_UNO, V2F64);
   case X86::BI__builtin_ia32_cmpneqpd:
+return getVectorFCmpIR(CmpInst::FCMP_UNE, V2F64);
   case X86::BI__builtin_ia32_cmpnltpd:
+return getVectorFCmpIR(CmpInst::FCMP_UGE, V2F64);
   case X86::BI__builtin_ia32_cmpnlepd:
+return getVectorFCmpIR(CmpInst::FCMP_UGT, V2F64);
   case X86::BI__builtin_ia32_cmpordpd:
+return getVectorFCmpIR(CmpInst::FCMP_ORD, V2F64);
+
+  // SSE scalar comparison intrinsics
+  case X86::BI__builtin_ia32_cmpeqss:
+return getCmpIntrinsicCall(Intrinsic::x86_

r272807 - [x86] generate IR for SSE integer min/max builtins

2016-06-15 Thread Sanjay Patel via cfe-commits
Author: spatel
Date: Wed Jun 15 12:18:50 2016
New Revision: 272807

URL: http://llvm.org/viewvc/llvm-project?rev=272807&view=rev
Log:
[x86] generate IR for SSE integer min/max builtins
Sibling patch to r272806:
http://reviews.llvm.org/rL272806

Modified:
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/test/CodeGen/sse2-builtins.c
cfe/trunk/test/CodeGen/sse41-builtins.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=272807&r1=272806&r2=272807&view=diff
==
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Wed Jun 15 12:18:50 2016
@@ -6788,6 +6788,33 @@ Value *CodeGenFunction::EmitX86BuiltinEx
   case X86::BI__builtin_ia32_pcmpgtq256_mask:
   case X86::BI__builtin_ia32_pcmpgtq512_mask:
 return EmitX86MaskedCompare(*this, ICmpInst::ICMP_SGT, Ops);
+
+  // TODO: Handle 64/256/512-bit vector widths of min/max.
+  case X86::BI__builtin_ia32_pmaxsb128:
+  case X86::BI__builtin_ia32_pmaxsw128:
+  case X86::BI__builtin_ia32_pmaxsd128: {
+Value *Cmp = Builder.CreateICmp(ICmpInst::ICMP_SGT, Ops[0], Ops[1]);
+return Builder.CreateSelect(Cmp, Ops[0], Ops[1]);
+  }
+  case X86::BI__builtin_ia32_pmaxub128:
+  case X86::BI__builtin_ia32_pmaxuw128:
+  case X86::BI__builtin_ia32_pmaxud128: {
+Value *Cmp = Builder.CreateICmp(ICmpInst::ICMP_UGT, Ops[0], Ops[1]);
+return Builder.CreateSelect(Cmp, Ops[0], Ops[1]);
+  }
+  case X86::BI__builtin_ia32_pminsb128:
+  case X86::BI__builtin_ia32_pminsw128:
+  case X86::BI__builtin_ia32_pminsd128: {
+Value *Cmp = Builder.CreateICmp(ICmpInst::ICMP_SLT, Ops[0], Ops[1]);
+return Builder.CreateSelect(Cmp, Ops[0], Ops[1]);
+  }
+  case X86::BI__builtin_ia32_pminub128:
+  case X86::BI__builtin_ia32_pminuw128:
+  case X86::BI__builtin_ia32_pminud128: {
+Value *Cmp = Builder.CreateICmp(ICmpInst::ICMP_ULT, Ops[0], Ops[1]);
+return Builder.CreateSelect(Cmp, Ops[0], Ops[1]);
+  }
+
   // 3DNow!
   case X86::BI__builtin_ia32_pswapdsf:
   case X86::BI__builtin_ia32_pswapdsi: {

Modified: cfe/trunk/test/CodeGen/sse2-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/sse2-builtins.c?rev=272807&r1=272806&r2=272807&view=diff
==
--- cfe/trunk/test/CodeGen/sse2-builtins.c (original)
+++ cfe/trunk/test/CodeGen/sse2-builtins.c Wed Jun 15 12:18:50 2016
@@ -679,13 +679,15 @@ void test_mm_maskmoveu_si128(__m128i A,
 
 __m128i test_mm_max_epi16(__m128i A, __m128i B) {
   // CHECK-LABEL: test_mm_max_epi16
-  // CHECK: call <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16> %{{.*}}, <8 x i16> %{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp sgt <8 x i16> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <8 x i1> [[CMP]], <8 x i16> [[X]], <8 x i16> [[Y]]
   return _mm_max_epi16(A, B);
 }
 
 __m128i test_mm_max_epu8(__m128i A, __m128i B) {
   // CHECK-LABEL: test_mm_max_epu8
-  // CHECK: call <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8> %{{.*}}, <16 x i8> %{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp ugt <16 x i8> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <16 x i1> [[CMP]], <16 x i8> [[X]], <16 x i8> [[Y]]
   return _mm_max_epu8(A, B);
 }
 
@@ -709,13 +711,15 @@ void test_mm_mfence() {
 
 __m128i test_mm_min_epi16(__m128i A, __m128i B) {
   // CHECK-LABEL: test_mm_min_epi16
-  // CHECK: call <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16> %{{.*}}, <8 x i16> %{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp slt <8 x i16> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <8 x i1> [[CMP]], <8 x i16> [[X]], <8 x i16> [[Y]]
   return _mm_min_epi16(A, B);
 }
 
 __m128i test_mm_min_epu8(__m128i A, __m128i B) {
   // CHECK-LABEL: test_mm_min_epu8
-  // CHECK: call <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8> %{{.*}}, <16 x i8> %{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp ult <16 x i8> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <16 x i1> [[CMP]], <16 x i8> [[X]], <16 x i8> [[Y]]
   return _mm_min_epu8(A, B);
 }
 

Modified: cfe/trunk/test/CodeGen/sse41-builtins.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/sse41-builtins.c?rev=272807&r1=272806&r2=272807&view=diff
==
--- cfe/trunk/test/CodeGen/sse41-builtins.c (original)
+++ cfe/trunk/test/CodeGen/sse41-builtins.c Wed Jun 15 12:18:50 2016
@@ -245,49 +245,57 @@ __m128 test_mm_insert_ps(__m128 x, __m12
 
 __m128i test_mm_max_epi8(__m128i x, __m128i y) {
   // CHECK-LABEL: test_mm_max_epi8
-  // CHECK: call <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8> %{{.*}}, <16 x i8> %{{.*}})
+  // CHECK:   [[CMP:%.*]] = icmp sgt <16 x i8> [[X:%.*]], [[Y:%.*]]
+  // CHECK-NEXT:  select <16 x i1> [[CMP]], <16 x i8> [[X]], <16 x i8> [[Y]]
   return _mm_max_epi8(x, y);
 }
 
 __m128i test_mm_max_epi32(__m128i x, __m128i y) {
   // CHECK-LABEL: test_mm_max_epi32
-  // CH

Re: [PATCH] D21306: [x86] AVX FP compare builtins should require AVX target feature (PR28112)

2016-06-13 Thread Sanjay Patel via cfe-commits
spatel added a comment.

In http://reviews.llvm.org/D21306#456965, @echristo wrote:

> The 128 bit versions should only be selecting for sse functions and shouldn't 
> need avx to work? What instructions are getting emitted here?


No, the 128-bit versions of these C intrinsics are strictly for AVX versions of 
the instructions (eg, vcmpps).

Example from the bug report:

  cmp = _mm_cmp_ps((__m128)a, (__m128)b, 8);

We're currently emitting illegal SSE instructions like:

  cmpps $0x8, %xmm1, %xmm0  <--- anything over '7' is reserved/undef


http://reviews.llvm.org/D21306



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D21306: [x86] AVX FP compare builtins should require AVX target feature (PR28112)

2016-06-13 Thread Sanjay Patel via cfe-commits
spatel created this revision.
spatel added reviewers: echristo, bogner, RKSimon.
spatel added a subscriber: cfe-commits.
Herald added subscribers: mehdi_amini, mcrosier.

This is a fix for PR28112:
https://llvm.org/bugs/show_bug.cgi?id=28112

The FP comparison intrinsics that take an immediate parameter rather than 
specifying a comparison predicate in the function name were added with AVX 
(these are macros in avxintrin.h). This makes clang behave more like gcc 
and matches the Intel documentation, eg:
VCMPPS: __m128 _mm_cmp_ps(__m128 a, __m128 b, const int imm)

'V' means this is intended to only work with the AVX form of the instruction.

http://reviews.llvm.org/D21306

Files:
  include/clang/Basic/BuiltinsX86.def
  test/CodeGen/target-features-error-2.c

Index: test/CodeGen/target-features-error-2.c
===
--- test/CodeGen/target-features-error-2.c
+++ test/CodeGen/target-features-error-2.c
@@ -1,7 +1,38 @@
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o -
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_SSE42
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_1
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+
 #define __MM_MALLOC_H
 #include 
 
+#if NEED_SSE42
 int baz(__m256i a) {
  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'sse4.2', but would be inlined into function 'baz' that is compiled without support for 'sse4.2'}}
 }
+#endif
+
+#if NEED_AVX_1
+__m128 need_avx(__m128 a, __m128 b) {
+  return _mm_cmp_ps(a, b, 0); // expected-error {{'__builtin_ia32_cmpps' needs target feature avx}}
+}
+#endif
+
+#if NEED_AVX_2
+__m128 need_avx(__m128 a, __m128 b) {
+  return _mm_cmp_ss(a, b, 0); // expected-error {{'__builtin_ia32_cmpss' needs target feature avx}}
+}
+#endif
+
+#if NEED_AVX_3
+__m128d need_avx(__m128d a, __m128d b) {
+  return _mm_cmp_pd(a, b, 0); // expected-error {{'__builtin_ia32_cmppd' needs target feature avx}}
+}
+#endif
+
+#if NEED_AVX_4
+__m128d need_avx(__m128d a, __m128d b) {
+  return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs target feature avx}}
+}
+#endif
Index: include/clang/Basic/BuiltinsX86.def
===
--- include/clang/Basic/BuiltinsX86.def
+++ include/clang/Basic/BuiltinsX86.def
@@ -219,16 +219,14 @@
 TARGET_BUILTIN(__builtin_ia32_ucomisdge, "iV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_ucomisdneq, "iV2dV2d", "", "sse2")
 
-TARGET_BUILTIN(__builtin_ia32_cmpps, "V4fV4fV4fIc", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpeqps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpltps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpleps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpunordps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpneqps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpnltps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpnleps, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpordps, "V4fV4fV4f", "", "sse")
-TARGET_BUILTIN(__builtin_ia32_cmpss, "V4fV4fV4fIc", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpeqss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpltss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_cmpless, "V4fV4fV4f", "", "sse")
@@ -242,16 +240,14 @@
 TARGET_BUILTIN(__builtin_ia32_minss, "V4fV4fV4f", "", "sse")
 TARGET_BUILTIN(__builtin_ia32_maxss, "V4fV4fV4f", "", "sse")
 
-TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpeqpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpltpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmplepd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpunordpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpneqpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnltpd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpnlepd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpordpd, "V2dV2dV2d", "", "sse2")
-TARGET_BUILTIN(__builtin_ia32_cmpsd, "V2dV2dV2dIc", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpeqsd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmpltsd, "V2dV2dV2d", "", "sse2")
 TARGET_BUILTIN(__builtin_ia32_cmplesd, "V2dV2dV2d", "", "sse2")
@@ -453,8 +449,12 @@
 TARGET_BUILTIN(__builtin_ia32_blendvpd256, "V4dV4dV4dV4d", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_blendvps256, "V8fV8fV8fV8f", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_dpps256, "V8fV8fV8fIc", "", "avx")
+TARGET_BUILTIN(__builtin_ia32_cmppd, "V2dV2dV2dIc", "", "avx")
 TARGET_BUILTIN(__builtin_ia32_cmppd256, "V4dV4dV4dIc", "", "avx")
+TARGET

Re: [PATCH] D21268: [x86] translate SSE packed FP comparison builtins to IR

2016-06-13 Thread Sanjay Patel via cfe-commits
spatel added a comment.

In http://reviews.llvm.org/D21268#455679, @RKSimon wrote:

> Eeep that's certainly a lot more work than just adding a few extra cases! 
> Please add a TODO explaining what we need to do?


I don't know what the answer is yet...looks like this is going to require (a 
lot of) testing to sort out.

> If there is a problem with the header documentation please can you raise a 
> bugzilla and CC Katya Romanova.


Filed PR28110:
https://llvm.org/bugs/show_bug.cgi?id=28110

The initial test says that AMD's documentation is wrong: cmpps with immediate 
'8' produces a different answer than immediate '0' running on Jaguar.


http://reviews.llvm.org/D21268





Re: [PATCH] D21268: [x86] translate SSE packed FP comparison builtins to IR

2016-06-12 Thread Sanjay Patel via cfe-commits
spatel added a comment.

In http://reviews.llvm.org/D21268#455668, @RKSimon wrote:

> Is there any reason that we shouldn't include the avxintrin.h 
> __builtin_ia32_cmppd/__builtin_ia32_cmpps/__builtin_ia32_cmppd256/__builtin_ia32_cmpps256
>  packed intrinsics in this CGBuiltin.cpp patch? Since we're heading towards 
> nixing them anyhow.


AVX is complicated by the enhancement to 32 compare ops (for Intel AVX). Note 
that avxintrin.h currently has conflicting comments about the immediate value 
meanings:

/* Compare */
#define _CMP_EQ_OQ    0x00 /* Equal (ordered, non-signaling)  */
#define _CMP_LT_OS    0x01 /* Less-than (ordered, signaling)  */
#define _CMP_LE_OS    0x02 /* Less-than-or-equal (ordered, signaling)  */
#define _CMP_UNORD_Q  0x03 /* Unordered (non-signaling)  */
#define _CMP_NEQ_UQ   0x04 /* Not-equal (unordered, non-signaling)  */
#define _CMP_NLT_US   0x05 /* Not-less-than (unordered, signaling)  */
#define _CMP_NLE_US   0x06 /* Not-less-than-or-equal (unordered, signaling)  */
#define _CMP_ORD_Q    0x07 /* Ordered (nonsignaling)   */
#define _CMP_EQ_UQ    0x08 /* Equal (unordered, non-signaling)  */
#define _CMP_NGE_US   0x09 /* Not-greater-than-or-equal (unord, signaling)  */
#define _CMP_NGT_US   0x0a /* Not-greater-than (unordered, signaling)  */
#define _CMP_FALSE_OQ 0x0b /* False (ordered, non-signaling)  */
#define _CMP_NEQ_OQ   0x0c /* Not-equal (ordered, non-signaling)  */
#define _CMP_GE_OS    0x0d /* Greater-than-or-equal (ordered, signaling)  */
#define _CMP_GT_OS    0x0e /* Greater-than (ordered, signaling)  */
#define _CMP_TRUE_UQ  0x0f /* True (unordered, non-signaling)  */
#define _CMP_EQ_OS    0x10 /* Equal (ordered, signaling)  */
#define _CMP_LT_OQ    0x11 /* Less-than (ordered, non-signaling)  */
#define _CMP_LE_OQ    0x12 /* Less-than-or-equal (ordered, non-signaling)  */
#define _CMP_UNORD_S  0x13 /* Unordered (signaling)  */
#define _CMP_NEQ_US   0x14 /* Not-equal (unordered, signaling)  */
#define _CMP_NLT_UQ   0x15 /* Not-less-than (unordered, non-signaling)  */
#define _CMP_NLE_UQ   0x16 /* Not-less-than-or-equal (unord, non-signaling)  */
#define _CMP_ORD_S    0x17 /* Ordered (signaling)  */
#define _CMP_EQ_US    0x18 /* Equal (unordered, signaling)  */
#define _CMP_NGE_UQ   0x19 /* Not-greater-than-or-equal (unord, non-sign)  */
#define _CMP_NGT_UQ   0x1a /* Not-greater-than (unordered, non-signaling)  */
#define _CMP_FALSE_OS 0x1b /* False (ordered, signaling)  */
#define _CMP_NEQ_OS   0x1c /* Not-equal (ordered, signaling)  */
#define _CMP_GE_OQ    0x1d /* Greater-than-or-equal (ordered, non-signaling)  */
#define _CMP_GT_OQ    0x1e /* Greater-than (ordered, non-signaling)  */
#define _CMP_TRUE_US  0x1f /* True (unordered, signaling)  */

/// \brief Compares each of the corresponding double-precision values of two
///128-bit vectors of [2 x double], using the operation specified by the
///immediate integer operand. Returns a [2 x double] vector consisting of
///two doubles corresponding to the two comparison results: zero if the
///comparison is false, and all 1's if the comparison is true.
///
/// \headerfile 
///
/// \code
/// __m128d _mm_cmp_pd(__m128d a, __m128d b, const int c);
/// \endcode
///
/// This intrinsic corresponds to the \c VCMPPD / CMPPD instruction.
///
/// \param a
///A 128-bit vector of [2 x double].
/// \param b
///A 128-bit vector of [2 x double].
/// \param c
///An immediate integer operand, with bits [4:0] specifying which comparison
///operation to use:
///00h, 08h, 10h, 18h: Equal
///01h, 09h, 11h, 19h: Less than
///02h, 0Ah, 12h, 1Ah: Less than or equal / Greater than or equal (swapped
///operands)
///03h, 0Bh, 13h, 1Bh: Unordered
///04h, 0Ch, 14h, 1Ch: Not equal
///05h, 0Dh, 15h, 1Dh: Not less than / Not greater than (swapped operands)
///06h, 0Eh, 16h, 1Eh: Not less than or equal / Not greater than or equal
///(swapped operands)
///07h, 0Fh, 17h, 1Fh: Ordered
/// \returns A 128-bit vector of [2 x double] containing the comparison results.


http://reviews.llvm.org/D21268




