from:"Gabor Buella via Phabricator via cfe\-commits"

[PATCH] D48921: NFC - Typo fix in test/CodeGenCXX/runtime-dllstorage.cpp

2019-12-12 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In D48921#1781447 , @thakis wrote:

> This breaks the test everywhere, e.g. http://45.33.8.238/linux/5543/step_7.txt
>
> Can you revert this while you investigate if the test was broken since it 
> landed or if something broke it later, while it wasn't really testing what it 
> was supposed to test?


Yes, I reverted the change on line 137, I recall the other changes didn't cause 
trouble.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D48921/new/

https://reviews.llvm.org/D48921



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D48921: NFC - Typo fix in test/CodeGenCXX/runtime-dllstorage.cpp

2019-12-12 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG9c48c2f9c477: [NFC] - Typo fix in 
test/CodeGenCXX/runtime-dllstorage.cpp (authored by GBuella).

Changed prior to commit:
  https://reviews.llvm.org/D48921?vs=154067&id=233543#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D48921/new/

https://reviews.llvm.org/D48921

Files:
  clang/test/CodeGenCXX/runtime-dllstorage.cpp


Index: clang/test/CodeGenCXX/runtime-dllstorage.cpp
===
--- clang/test/CodeGenCXX/runtime-dllstorage.cpp
+++ clang/test/CodeGenCXX/runtime-dllstorage.cpp
@@ -1,13 +1,13 @@
 // RUN: %clang_cc1 -triple i686-windows-msvc -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -emit-llvm -o - %s | FileCheck 
-allow-deprecated-dag-overlap %s -check-prefix CHECK-MS -check-prefix 
CHECK-MS-DYNAMIC
 // RUN: %clang_cc1 -triple i686-windows-msvc -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-emit-llvm -o - %s | FileCheck -allow-deprecated-dag-overlap %s -check-prefix 
CHECK-MS -check-prefix CHECK-MS-STATIC
 
-// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -emit-llvm -o - %s | FileCheck 
-allow-deprecated-dag-overlap %s -check-prefix CHECK-IA -check-prefix 
CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-NODECL-IA -check-prefix 
CHECK-DYANMIC-IA-CXA-ATEXIT
+// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -emit-llvm -o - %s | FileCheck 
-allow-deprecated-dag-overlap %s -check-prefix CHECK-IA -check-prefix 
CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-NODECL-IA -check-prefix 
CHECK-DYNAMIC-IA-CXA-ATEXIT
 // RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-emit-llvm -o - %s | FileCheck -allow-deprecated-dag-overlap %s -check-prefix 
CHECK-IA -check-prefix CHECK-STATIC-IA -check-prefix CHECK-STATIC-NODECL-IA 
-check-prefix CHECK-IA-STATIC-CXA-ATEXIT
-// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DIMPORT_DECLARATIONS 
-emit-llvm -o - %s | FileCheck -allow-deprecated-dag-overlap %s -check-prefix 
CHECK-IA -check-prefix CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-IMPORT-IA 
-check-prefix CHECK-DYANMIC-IA-CXA-ATEXIT
+// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DIMPORT_DECLARATIONS 
-emit-llvm -o - %s | FileCheck -allow-deprecated-dag-overlap %s -check-prefix 
CHECK-IA -check-prefix CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-IMPORT-IA 
-check-prefix CHECK-DYNAMIC-IA-CXA-ATEXIT
 // RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-DIMPORT_DECLARATIONS -emit-llvm -o - %s | FileCheck 
-allow-deprecated-dag-overlap %s -check-prefix CHECK-IA -check-prefix 
CHECK-STATIC-IA -check-prefix CHECK-STATIC-IMPORT-IA -check-prefix 
CHECK-IA-STATIC-CXA-ATEXIT
-// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DEXPORT_DECLARATIONS 
-emit-llvm -o - %s | FileCheck -allow-deprecated-dag-overlap %s -check-prefix 
CHECK-IA -check-prefix CHECK-DYNAMIC-IA -check-prefix 
CHECK-DYANMIC-IA-CXA-ATEXIT
+// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DEXPORT_DECLARATIONS 
-emit-llvm -o - %s | FileCheck -allow-deprecated-dag-overlap %s -check-prefix 
CHECK-IA -check-prefix CHECK-DYNAMIC-IA -check-prefix 
CHECK-DYNAMIC-IA-CXA-ATEXIT
 // RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-DEXPORT_DECLARATIONS -emit-llvm -o - %s | FileCheck 
-allow-deprecated-dag-overlap %s -check-prefix CHECK-IA -check-prefix 
CHECK-STATIC-IA -check-prefix CHECK-IA-STATIC-CXA-ATEXIT
-// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DDECL -emit-llvm -o - %s | 
FileCheck -allow-deprecated-dag-overlap %s -check-prefix CHECK-IA -check-prefix 
CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-DECL-IA -check-prefix 
CHECK-DYANMIC-IA-CXA-ATEXIT
+// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DDECL -emit-llvm -o - %s | 
FileCheck -allow-deprecated-dag-overlap %s -check-prefix CHECK-IA -check-prefix 
CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-DECL-IA -check-prefix 
CHECK-DYNAMIC-IA-CXA-ATEXIT
 // RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-publ

[PATCH] D48921: NFC - type fix in test/CodeGenCXX/runtime-dllstorage.cpp

2018-07-06 Thread Gabor Buella via Phabricator via cfe-commits

GBuella requested review of this revision.
GBuella added a comment.

Well, apparently the test fails with the typo fix.
There is no  `declare dllimport void @_ZSt9terminatev()` line that could be 
matched for `CHECK-DYNAMIC-IA-DAG`.


Repository:
  rC Clang

https://reviews.llvm.org/D48921



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D48715: [X86] Fix some vector cmp builtins - TRUE/FALSE predicates

2018-07-05 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rC336355: [X86] Fix some vector cmp builtins - TRUE/FALSE 
predicates (authored by GBuella, committed by ).

Repository:
  rC Clang

https://reviews.llvm.org/D48715

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: lib/CodeGen/CGBuiltin.cpp
===
--- lib/CodeGen/CGBuiltin.cpp
+++ lib/CodeGen/CGBuiltin.cpp
@@ -10158,43 +10158,38 @@
 // e.g. both _CMP_GT_OS & _CMP_GT_OQ are translated to FCMP_OGT.
 FCmpInst::Predicate Pred;
 switch (CC) {
-case 0x00: Pred = FCmpInst::FCMP_OEQ; break;
-case 0x01: Pred = FCmpInst::FCMP_OLT; break;
-case 0x02: Pred = FCmpInst::FCMP_OLE; break;
-case 0x03: Pred = FCmpInst::FCMP_UNO; break;
-case 0x04: Pred = FCmpInst::FCMP_UNE; break;
-case 0x05: Pred = FCmpInst::FCMP_UGE; break;
-case 0x06: Pred = FCmpInst::FCMP_UGT; break;
-case 0x07: Pred = FCmpInst::FCMP_ORD; break;
-case 0x08: Pred = FCmpInst::FCMP_UEQ; break;
-case 0x09: Pred = FCmpInst::FCMP_ULT; break;
-case 0x0a: Pred = FCmpInst::FCMP_ULE; break;
-case 0x0c: Pred = FCmpInst::FCMP_ONE; break;
-case 0x0d: Pred = FCmpInst::FCMP_OGE; break;
-case 0x0e: Pred = FCmpInst::FCMP_OGT; break;
-case 0x10: Pred = FCmpInst::FCMP_OEQ; break;
-case 0x11: Pred = FCmpInst::FCMP_OLT; break;
-case 0x12: Pred = FCmpInst::FCMP_OLE; break;
-case 0x13: Pred = FCmpInst::FCMP_UNO; break;
-case 0x14: Pred = FCmpInst::FCMP_UNE; break;
-case 0x15: Pred = FCmpInst::FCMP_UGE; break;
-case 0x16: Pred = FCmpInst::FCMP_UGT; break;
-case 0x17: Pred = FCmpInst::FCMP_ORD; break;
-case 0x18: Pred = FCmpInst::FCMP_UEQ; break;
-case 0x19: Pred = FCmpInst::FCMP_ULT; break;
-case 0x1a: Pred = FCmpInst::FCMP_ULE; break;
-case 0x1c: Pred = FCmpInst::FCMP_ONE; break;
-case 0x1d: Pred = FCmpInst::FCMP_OGE; break;
-case 0x1e: Pred = FCmpInst::FCMP_OGT; break;
-// _CMP_TRUE_UQ, _CMP_TRUE_US produce -1,-1... vector
-// on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produce 0, 0...
-case 0x0b: // FALSE_OQ
-case 0x1b: // FALSE_OS
-  return llvm::Constant::getNullValue(ConvertType(E->getType()));
-case 0x0f: // TRUE_UQ
-case 0x1f: // TRUE_US
-  return llvm::Constant::getAllOnesValue(ConvertType(E->getType()));
-
+case 0x00: Pred = FCmpInst::FCMP_OEQ;   break;
+case 0x01: Pred = FCmpInst::FCMP_OLT;   break;
+case 0x02: Pred = FCmpInst::FCMP_OLE;   break;
+case 0x03: Pred = FCmpInst::FCMP_UNO;   break;
+case 0x04: Pred = FCmpInst::FCMP_UNE;   break;
+case 0x05: Pred = FCmpInst::FCMP_UGE;   break;
+case 0x06: Pred = FCmpInst::FCMP_UGT;   break;
+case 0x07: Pred = FCmpInst::FCMP_ORD;   break;
+case 0x08: Pred = FCmpInst::FCMP_UEQ;   break;
+case 0x09: Pred = FCmpInst::FCMP_ULT;   break;
+case 0x0a: Pred = FCmpInst::FCMP_ULE;   break;
+case 0x0b: Pred = FCmpInst::FCMP_FALSE; break;
+case 0x0c: Pred = FCmpInst::FCMP_ONE;   break;
+case 0x0d: Pred = FCmpInst::FCMP_OGE;   break;
+case 0x0e: Pred = FCmpInst::FCMP_OGT;   break;
+case 0x0f: Pred = FCmpInst::FCMP_TRUE;  break;
+case 0x10: Pred = FCmpInst::FCMP_OEQ;   break;
+case 0x11: Pred = FCmpInst::FCMP_OLT;   break;
+case 0x12: Pred = FCmpInst::FCMP_OLE;   break;
+case 0x13: Pred = FCmpInst::FCMP_UNO;   break;
+case 0x14: Pred = FCmpInst::FCMP_UNE;   break;
+case 0x15: Pred = FCmpInst::FCMP_UGE;   break;
+case 0x16: Pred = FCmpInst::FCMP_UGT;   break;
+case 0x17: Pred = FCmpInst::FCMP_ORD;   break;
+case 0x18: Pred = FCmpInst::FCMP_UEQ;   break;
+case 0x19: Pred = FCmpInst::FCMP_ULT;   break;
+case 0x1a: Pred = FCmpInst::FCMP_ULE;   break;
+case 0x1b: Pred = FCmpInst::FCMP_FALSE; break;
+case 0x1c: Pred = FCmpInst::FCMP_ONE;   break;
+case 0x1d: Pred = FCmpInst::FCMP_OGE;   break;
+case 0x1e: Pred = FCmpInst::FCMP_OGT;   break;
+case 0x1f: Pred = FCmpInst::FCMP_TRUE;  break;
 default: llvm_unreachable("Unhandled CC");
 }
 
Index: test/CodeGen/avx-builtins.c
===
--- test/CodeGen/avx-builtins.c
+++ test/CodeGen/avx-builtins.c
@@ -280,8 +280,7 @@
 
 __m256d test_mm256_cmp_pd_false_oq(__m256d a, __m256d b) {
   // CHECK-LABEL: test_mm256_cmp_pd_false_oq
-  // CHECK-NOT: call
-  // CHECK: ret <4 x double> zeroinitializer
+  // CHECK: fcmp false <4 x double> %{{.*}}, %{{.*}}
   return _mm256_cmp_pd(a, b, _CMP_FALSE_OQ);
 }
 
@@ -305,8 +304,7 @@
 
 __m256d test_mm256_cmp_pd_true_uq(__m256d a, __m256d b) {
   // CHECK-LABEL: test_mm256_cmp_pd_true_uq
-  // CHECK-NOT: call
-  // CHECK: ret <4 x double> 
+  // CHECK: fcmp true <4 x double> %{{.*}}, %{{.*}}
   return _mm256_cmp_pd(a, b, _CMP_TRUE_UQ);
 }
 
@@ -378,8 +376,7 @@
 
 __m256d test_mm256_c

[PATCH] D48715: [X86] Fix some vector cmp builtins - TRUE/FALSE predicates

2018-07-05 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

ping @spatel


https://reviews.llvm.org/D48715



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D48715: [X86] Fix some vector cmp builtins - TRUE/FALSE predicates

2018-07-05 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 154216.
GBuella added a comment.

As suggested, I added test cases with all predicates (in r336346).


https://reviews.llvm.org/D48715

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: test/CodeGen/avx512vl-builtins.c
===
--- test/CodeGen/avx512vl-builtins.c
+++ test/CodeGen/avx512vl-builtins.c
@@ -1139,8 +1139,7 @@
 
 __mmask8 test_mm256_cmp_ps_mask_false_oq(__m256 a, __m256 b) {
   // CHECK-LABEL: test_mm256_cmp_ps_mask_false_oq
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
+  // CHECK: fcmp false <8 x float> %{{.*}}, %{{.*}}
   return _mm256_cmp_ps_mask(a, b, _CMP_FALSE_OQ);
 }
 
@@ -1164,8 +1163,7 @@
 
 __mmask8 test_mm256_cmp_ps_mask_true_uq(__m256 a, __m256 b) {
   // CHECK-LABEL: test_mm256_cmp_ps_mask_true_uq
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
+  // CHECK: fcmp true <8 x float> %{{.*}}, %{{.*}}
   return _mm256_cmp_ps_mask(a, b, _CMP_TRUE_UQ);
 }
 
@@ -1237,8 +1235,7 @@
 
 __mmask8 test_mm256_cmp_ps_mask_false_os(__m256 a, __m256 b) {
   // CHECK-LABEL: test_mm256_cmp_ps_mask_false_os
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
+  // CHECK: fcmp false <8 x float> %{{.*}}, %{{.*}}
   return _mm256_cmp_ps_mask(a, b, _CMP_FALSE_OS);
 }
 
@@ -1262,8 +1259,7 @@
 
 __mmask8 test_mm256_cmp_ps_mask_true_us(__m256 a, __m256 b) {
   // CHECK-LABEL: test_mm256_cmp_ps_mask_true_us
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
+  // CHECK: fcmp true <8 x float> %{{.*}}, %{{.*}}
   return _mm256_cmp_ps_mask(a, b, _CMP_TRUE_US);
 }
 
@@ -1346,8 +1342,8 @@
 
 __mmask8 test_mm256_mask_cmp_ps_mask_false_oq(__mmask8 m, __m256 a, __m256 b) {
   // CHECK-LABEL: test_mm256_mask_cmp_ps_mask_false_oq
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
+  // CHECK: [[CMP:%.*]] = fcmp false <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> [[CMP]], {{.*}}
   return _mm256_mask_cmp_ps_mask(m, a, b, _CMP_FALSE_OQ);
 }
 
@@ -1374,7 +1370,8 @@
 
 __mmask8 test_mm256_mask_cmp_ps_mask_true_uq(__mmask8 m, __m256 a, __m256 b) {
   // CHECK-LABEL: test_mm256_mask_cmp_ps_mask_true_uq
-  // FIXME
+  // CHECK: [[CMP:%.*]] = fcmp true <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> [[CMP]], {{.*}}
   return _mm256_mask_cmp_ps_mask(m, a, b, _CMP_TRUE_UQ);
 }
 
@@ -1457,8 +1454,8 @@
 
 __mmask8 test_mm256_mask_cmp_ps_mask_false_os(__mmask8 m, __m256 a, __m256 b) {
   // CHECK-LABEL: test_mm256_mask_cmp_ps_mask_false_os
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
+  // CHECK: [[CMP:%.*]] = fcmp false <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> [[CMP]], {{.*}}
   return _mm256_mask_cmp_ps_mask(m, a, b, _CMP_FALSE_OS);
 }
 
@@ -1485,7 +1482,8 @@
 
 __mmask8 test_mm256_mask_cmp_ps_mask_true_us(__mmask8 m, __m256 a, __m256 b) {
   // CHECK-LABEL: test_mm256_mask_cmp_ps_mask_true_us
-  // FIXME
+  // CHECK: [[CMP:%.*]] = fcmp true <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> [[CMP]], {{.*}}
   return _mm256_mask_cmp_ps_mask(m, a, b, _CMP_TRUE_US);
 }
 
@@ -1557,8 +1555,7 @@
 
 __mmask8 test_mm256_cmp_pd_mask_false_oq(__m256d a, __m256d b) {
   // CHECK-LABEL: test_mm256_cmp_pd_mask_false_oq
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
+  // CHECK: fcmp false <4 x double> %{{.*}}, %{{.*}}
   return _mm256_cmp_pd_mask(a, b, _CMP_FALSE_OQ);
 }
 
@@ -1582,8 +1579,7 @@
 
 __mmask8 test_mm256_cmp_pd_mask_true_uq(__m256d a, __m256d b) {
   // CHECK-LABEL: test_mm256_cmp_pd_mask_true_uq
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
+  // CHECK: fcmp true <4 x double> %{{.*}}, %{{.*}}
   return _mm256_cmp_pd_mask(a, b, _CMP_TRUE_UQ);
 }
 
@@ -1655,8 +1651,7 @@
 
 __mmask8 test_mm256_cmp_pd_mask_false_os(__m256d a, __m256d b) {
   // CHECK-LABEL: test_mm256_cmp_pd_mask_false_os
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
+  // CHECK: fcmp false <4 x double> %{{.*}}, %{{.*}}
   return _mm256_cmp_pd_mask(a, b, _CMP_FALSE_OS);
 }
 
@@ -1680,8 +1675,7 @@
 
 __mmask8 test_mm256_cmp_pd_mask_true_us(__m256d a, __m256d b) {
   // CHECK-LABEL: test_mm256_cmp_pd_mask_true_us
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
+  // CHECK: fcmp true <4 x double> %{{.*}}, %{{.*}}
   return _mm256_cmp_pd_mask(a, b, _CMP_TRUE_US);
 }
 
@@ -1764,8 +1758,8 @@
 
 __mmask8 test_mm256_mask_cmp_pd_mask_false_oq(__mmask8 m, __m256d a, __m256d b) {
   // CHECK-LABEL: test_mm256_mask_cmp_pd_mask_false_oq
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
+  // CHECK: [[CMP:%.*]] = fcmp false <4 x double> %{{.*}}, %{{.*}}
+  // CHECK: and <4 x i1> [[CMP]], {{.*}}
   return _mm256_mask_cmp_pd_mask(m, a, b, _CMP_FALSE_OQ);
 }
 
@@ -1792,7 +1786,8 @@
 
 __mmask8 test_mm256_mask_cmp_pd_mask_true_uq(__mmask8 m, __m256d a, __m256d b) {
   // CHECK-LABEL: test_mm256_mask_cmp_pd_mask_true_uq
-  // FIXME
+  // CHECK: [[CMP:%.*]] = fcmp true <4 x double> %{{.*}}, %{{.*}}
+  // CHECK: and <4 x i1> [[CMP]], {{.*}}
   return _mm256_mask_cmp_pd_mask(m, a, b, _CMP_TRUE_UQ);
 }
 
@@ -18

[PATCH] D48921: NFC - type fix in test/CodeGenCXX/runtime-dllstorage.cpp

2018-07-04 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added reviewers: compnerd, espindola.
Herald added a subscriber: cfe-commits.

Repository:
  rC Clang

https://reviews.llvm.org/D48921

Files:
  test/CodeGenCXX/runtime-dllstorage.cpp


Index: test/CodeGenCXX/runtime-dllstorage.cpp
===
--- test/CodeGenCXX/runtime-dllstorage.cpp
+++ test/CodeGenCXX/runtime-dllstorage.cpp
@@ -1,13 +1,13 @@
 // RUN: %clang_cc1 -triple i686-windows-msvc -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -emit-llvm -o - %s | FileCheck 
%s -check-prefix CHECK-MS -check-prefix CHECK-MS-DYNAMIC
 // RUN: %clang_cc1 -triple i686-windows-msvc -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-MS -check-prefix 
CHECK-MS-STATIC
 
-// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -emit-llvm -o - %s | FileCheck 
%s -check-prefix CHECK-IA -check-prefix CHECK-DYNAMIC-IA -check-prefix 
CHECK-DYNAMIC-NODECL-IA -check-prefix CHECK-DYANMIC-IA-CXA-ATEXIT
+// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -emit-llvm -o - %s | FileCheck 
%s -check-prefix CHECK-IA -check-prefix CHECK-DYNAMIC-IA -check-prefix 
CHECK-DYNAMIC-NODECL-IA -check-prefix CHECK-DYNAMIC-IA-CXA-ATEXIT
 // RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-IA -check-prefix 
CHECK-STATIC-IA -check-prefix CHECK-STATIC-NODECL-IA -check-prefix 
CHECK-IA-STATIC-CXA-ATEXIT
-// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DIMPORT_DECLARATIONS 
-emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-IA -check-prefix 
CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-IMPORT-IA -check-prefix 
CHECK-DYANMIC-IA-CXA-ATEXIT
+// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DIMPORT_DECLARATIONS 
-emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-IA -check-prefix 
CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-IMPORT-IA -check-prefix 
CHECK-DYNAMIC-IA-CXA-ATEXIT
 // RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-DIMPORT_DECLARATIONS -emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-IA 
-check-prefix CHECK-STATIC-IA -check-prefix CHECK-STATIC-IMPORT-IA 
-check-prefix CHECK-IA-STATIC-CXA-ATEXIT
-// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DEXPORT_DECLARATIONS 
-emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-IA -check-prefix 
CHECK-DYNAMIC-IA -check-prefix CHECK-DYANMIC-IA-CXA-ATEXIT
+// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DEXPORT_DECLARATIONS 
-emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-IA -check-prefix 
CHECK-DYNAMIC-IA -check-prefix CHECK-DYNAMIC-IA-CXA-ATEXIT
 // RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-DEXPORT_DECLARATIONS -emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-IA 
-check-prefix CHECK-STATIC-IA -check-prefix CHECK-IA-STATIC-CXA-ATEXIT
-// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DDECL -emit-llvm -o - %s | 
FileCheck %s -check-prefix CHECK-IA -check-prefix CHECK-DYNAMIC-IA 
-check-prefix CHECK-DYNAMIC-DECL-IA -check-prefix CHECK-DYANMIC-IA-CXA-ATEXIT
+// RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -DDECL -emit-llvm -o - %s | 
FileCheck %s -check-prefix CHECK-IA -check-prefix CHECK-DYNAMIC-IA 
-check-prefix CHECK-DYNAMIC-DECL-IA -check-prefix CHECK-DYNAMIC-IA-CXA-ATEXIT
 // RUN: %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -flto-visibility-public-std 
-DDECL -emit-llvm -o - %s | FileCheck %s -check-prefix CHECK-IA -check-prefix 
CHECK-STATIC-IA -check-prefix CHECK-STATIC-DECL-IA -check-prefix 
CHECK-IA-STATIC-CXA-ATEXIT
 // %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -fno-use-cxa-atexit -emit-llvm 
-o - %s | FileCheck %s -check-prefix CHECK-IA -check-prefix CHECK-DYNAMIC-IA 
-check-prefix CHECK-DYNAMIC-IA-ATEXIT
 // %clang_cc1 -triple i686-windows-itanium -std=c++11 -fdeclspec 
-fms-compatibility -fexceptions -fcxx-exceptions -fno-use-cxa-atexit -emit-llvm 
-o - %s | FileCheck %s -check-prefix CHECK-IA -check-prefix CHECK-STA

[PATCH] D48715: [X86] Fix some vector cmp builtins - TRUE/FALSE predicates

2018-06-28 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: test/CodeGen/avx-builtins.c:1423
-
-__m256 test_mm256_cmp_ps_true(__m256 a, __m256 b) {
-  // CHECK-LABEL: @test_mm256_cmp_ps_true

spatel wrote:
> GBuella wrote:
> > spatel wrote:
> > > Why are we deleting tests instead of replacing the CHECK lines with the 
> > > new output?
> > These cases were only added, when the TRUE/FALSE related special cases were 
> > added in `CGBuiltin.cpp`. Now that the special cases are removed from 
> > `CGBuiltin.cpp`, it seemed consistent to also remove the related special 
> > cases from the tests, as these are not special anymore.
> > Should we have tests with all predicates?
> Yes. IIUC, we would have caught the bug before it was committed if we had 
> proper tests for each predicate and C intrinsic.
That is true!
I'll try to write a script that generates all cases, but not today.


Repository:
  rC Clang

https://reviews.llvm.org/D48715



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D48715: [X86] Fix some vector cmp builtins - TRUE/FALSE predicates

2018-06-28 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: test/CodeGen/avx-builtins.c:1423
-
-__m256 test_mm256_cmp_ps_true(__m256 a, __m256 b) {
-  // CHECK-LABEL: @test_mm256_cmp_ps_true

spatel wrote:
> Why are we deleting tests instead of replacing the CHECK lines with the new 
> output?
These cases were only added, when the TRUE/FALSE related special cases were 
added in `CGBuiltin.cpp`. Now that the special cases are removed from 
`CGBuiltin.cpp`, it seemed consistent to also remove the related special cases 
from the tests, as these are not special anymore.
Should we have tests with all predicates?


Repository:
  rC Clang

https://reviews.llvm.org/D48715



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D48715: [X86] Fix some vector cmp builtins - TRUE/FALSE predicates

2018-06-28 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, 
spatel, scanon, efriedma.
Herald added a subscriber: cfe-commits.

This patch removes on optimization used with the TRUE/FALSE
predicates, as was suggested in https://reviews.llvm.org/D45616
for r335339.
The optimization was buggy, since r335339 used it also
for *_mask builtins, without actually applying the mask -- the
mask argument was just ignored.


Repository:
  rC Clang

https://reviews.llvm.org/D48715

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: test/CodeGen/avx512vl-builtins.c
===
--- test/CodeGen/avx512vl-builtins.c
+++ test/CodeGen/avx512vl-builtins.c
@@ -1077,34 +1077,6 @@
   return (__mmask8)_mm256_cmp_ps_mask(__A, __B, 0);
 }
 
-__mmask8 test_mm256_cmp_ps_mask_true_uq(__m256 __A, __m256 __B) {
-  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_uq
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
-  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
-}
-
-__mmask8 test_mm256_cmp_ps_mask_true_us(__m256 __A, __m256 __B) {
-  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_us
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
-  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
-}
-
-__mmask8 test_mm256_cmp_ps_mask_false_oq(__m256 __A, __m256 __B) {
-  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_oq
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
-  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
-}
-
-__mmask8 test_mm256_cmp_ps_mask_false_os(__m256 __A, __m256 __B) {
-  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_os
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
-  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
-}
-
 __mmask8 test_mm256_mask_cmp_ps_mask(__mmask8 m, __m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_mask_cmp_ps_mask
   // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
@@ -1118,34 +1090,6 @@
   return (__mmask8)_mm_cmp_ps_mask(__A, __B, 0);
 }
 
-__mmask8 test_mm_cmp_ps_mask_true_uq(__m128 __A, __m128 __B) {
-  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_uq
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
-  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
-}
-
-__mmask8 test_mm_cmp_ps_mask_true_us(__m128 __A, __m128 __B) {
-  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_us
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
-  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
-}
-
-__mmask8 test_mm_cmp_ps_mask_false_oq(__m128 __A, __m128 __B) {
-  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_oq
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
-  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
-}
-
-__mmask8 test_mm_cmp_ps_mask_false_os(__m128 __A, __m128 __B) {
-  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_os
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
-  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
-}
-
 __mmask8 test_mm_mask_cmp_ps_mask(__mmask8 m, __m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_mask_cmp_ps_mask
   // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
@@ -1160,34 +1104,6 @@
   return (__mmask8)_mm256_cmp_pd_mask(__A, __B, 0);
 }
 
-__mmask8 test_mm256_cmp_pd_mask_true_uq(__m256d __A, __m256d __B) {
-  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_uq
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
-  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_UQ);
-}
-
-__mmask8 test_mm256_cmp_pd_mask_true_us(__m256d __A, __m256d __B) {
-  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_us
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
-  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_US);
-}
-
-__mmask8 test_mm256_cmp_pd_mask_false_oq(__m256d __A, __m256d __B) {
-  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_oq
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
-  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OQ);
-}
-
-__mmask8 test_mm256_cmp_pd_mask_false_os(__m256d __A, __m256d __B) {
-  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_os
-  // CHECK-NOT: call
-  // CHECK: ret i8 0
-  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OS);
-}
-
 __mmask8 test_mm256_mask_cmp_pd_mask(__mmask8 m, __m256d __A, __m256d __B) {
   // CHECK-LABEL: @test_mm256_mask_cmp_pd_mask
   // CHECK: fcmp oeq <4 x double> %{{.*}}, %{{.*}}
@@ -1202,34 +1118,6 @@
   return (__mmask8)_mm_cmp_pd_mask(__A, __B, 0);
 }
 
-__mmask8 test_mm_cmp_pd_mask_true_uq(__m128d __A, __m128d __B) {
-  // CHECK-LABEL: @test_mm_cmp_pd_mask_true_uq
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
-  return (__mmask8)_mm_cmp_pd_mask(__A, __B, _CMP_TRUE_UQ);
-}
-
-__mmask8 test_mm_cmp_pd_mask_true_us(__m128d __A, __m128d __B) {
-  // CHECK-LABEL: @test_mm_cmp_pd_mask_true_us
-  // CHECK-NOT: call
-  // CHECK: ret i8 -1
-  return (__mmask8)_mm_cmp_pd_mask(__A, __B, _CMP_TRUE_US);
-}
-
-__mmask8 test_mm_cmp_pd_mask_false_oq(__m128d __A, __m128d __B) {
-  // CHECK-LABEL: @test_mm_cmp_pd_mas

[PATCH] D48708: NFC Build fix in RegisterCustomCheckersTest.cpp

2018-06-28 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added reviewers: alexfh, george.karpenkov.
Herald added a subscriber: cfe-commits.

`ninja-build check-clang` failed using GCC 4.8.5 with:

  unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp:64:12: error: could 
not convert ‘AnalysisConsumer’ from 
‘std::unique_ptr’ to 
‘std::unique_ptr’
   return AnalysisConsumer;
  ^


Repository:
  rC Clang

https://reviews.llvm.org/D48708

Files:
  unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp


Index: unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp
===
--- unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp
+++ unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp
@@ -61,7 +61,7 @@
 AnalysisConsumer->AddCheckerRegistrationFn([](CheckerRegistry &Registry) {
   Registry.addChecker("custom.CustomChecker", 
"Description");
 });
-return AnalysisConsumer;
+return std::unique_ptr(AnalysisConsumer.release());
   }
 };
 


Index: unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp
===
--- unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp
+++ unittests/StaticAnalyzer/RegisterCustomCheckersTest.cpp
@@ -61,7 +61,7 @@
 AnalysisConsumer->AddCheckerRegistrationFn([](CheckerRegistry &Registry) {
   Registry.addChecker("custom.CustomChecker", "Description");
 });
-return AnalysisConsumer;
+return std::unique_ptr(AnalysisConsumer.release());
   }
 };
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D48487: [X86][AVX512] Lowering _mm512_[max|min]_p[s|d] to native IR

2018-06-22 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, 
spatel, scanon, efriedma.
Herald added a subscriber: cfe-commits.

Repository:
  rC Clang

https://reviews.llvm.org/D48487

Files:
  lib/Headers/avx512fintrin.h
  test/CodeGen/avx512f-builtins.c

Index: test/CodeGen/avx512f-builtins.c
===
--- test/CodeGen/avx512f-builtins.c
+++ test/CodeGen/avx512f-builtins.c
@@ -8426,23 +8426,28 @@
 __m512d test_mm512_mask_max_pd (__m512d __W, __mmask8 __U, __m512d __A, __m512d __B)
 {
   // CHECK-LABEL: @test_mm512_mask_max_pd 
-  // CHECK: @llvm.x86.avx512.max.pd.512
+  // CHECK: fcmp uge <8 x double> %{{.*}}, %{{.*}}
+  // CHECK: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   // CHECK: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   return _mm512_mask_max_pd (__W,__U,__A,__B);
 }
 
 __m512d test_mm512_maskz_max_pd (__mmask8 __U, __m512d __A, __m512d __B)
 {
   // CHECK-LABEL: @test_mm512_maskz_max_pd 
-  // CHECK: @llvm.x86.avx512.max.pd.512
+  // CHECK: fcmp uge <8 x double> %{{.*}}, %{{.*}}
+  // CHECK: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
+  // CHECK: store <8 x double> zeroinitializer, <8 x double>* %.compoundliteral.i.i, align 64
+  // CHECK: load <8 x double>, <8 x double>* %.compoundliteral.i.i, align 64
   // CHECK: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   return _mm512_maskz_max_pd (__U,__A,__B);
 }
 
 __m512 test_mm512_mask_max_ps (__m512 __W, __mmask16 __U, __m512 __A, __m512 __B)
 {
   // CHECK-LABEL: @test_mm512_mask_max_ps 
-  // CHECK: @llvm.x86.avx512.max.ps.512
+  // CHECK: fcmp uge <16 x float> %{{.*}}, %{{.*}}
+  // CHECK: select <16 x i1> %{{.*}}, <16 x float> %{{.*}}, <16 x float> %{{.*}}
   // CHECK: select <16 x i1> %{{.*}}, <16 x float> %{{.*}}, <16 x float> %{{.*}}
   return _mm512_mask_max_ps (__W,__U,__A,__B);
 }
@@ -8473,7 +8478,10 @@
 __m512 test_mm512_maskz_max_ps (__mmask16 __U, __m512 __A, __m512 __B)
 {
   // CHECK-LABEL: @test_mm512_maskz_max_ps 
-  // CHECK: @llvm.x86.avx512.max.ps.512
+  // CHECK: fcmp uge <16 x float> %{{.*}}, %{{.*}}
+  // CHECK: select <16 x i1> %{{.*}}, <16 x float> %{{.*}}, <16 x float> %{{.*}}
+  // CHECK: store <16 x float> zeroinitializer, <16 x float>* %.compoundliteral.i.i, align 64
+  // CHECK: load <16 x float>, <16 x float>* %.compoundliteral.i.i, align 64
   // CHECK: select <16 x i1> %{{.*}}, <16 x float> %{{.*}}, <16 x float> %{{.*}}
   return _mm512_maskz_max_ps (__U,__A,__B);
 }
@@ -8504,15 +8512,20 @@
 __m512d test_mm512_mask_min_pd (__m512d __W, __mmask8 __U, __m512d __A, __m512d __B)
 {
   // CHECK-LABEL: @test_mm512_mask_min_pd 
-  // CHECK: @llvm.x86.avx512.min.pd.512
+  // CHECK: fcmp olt <8 x double> %{{.*}}, %{{.*}}
+  // CHECK: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   // CHECK: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   return _mm512_mask_min_pd (__W,__U,__A,__B);
 }
 
 __m512d test_mm512_maskz_min_pd (__mmask8 __U, __m512d __A, __m512d __B)
 {
   // CHECK-LABEL: @test_mm512_maskz_min_pd 
-  // CHECK: @llvm.x86.avx512.min.pd.512
+  // CHECK: fcmp olt <8 x double> %{{.*}}, %{{.*}}
+  // CHECK: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
+  // CHECK: store <8 x double> zeroinitializer, <8 x double>* %.compoundliteral.i.i, align 64
+  // CHECK: load <8 x double>, <8 x double>* %.compoundliteral.i.i, align 64
+  // CHECK: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   return _mm512_maskz_min_pd (__U,__A,__B);
 }
 
@@ -8542,15 +8555,19 @@
 __m512 test_mm512_mask_min_ps (__m512 __W, __mmask16 __U, __m512 __A, __m512 __B)
 {
   // CHECK-LABEL: @test_mm512_mask_min_ps 
-  // CHECK: @llvm.x86.avx512.min.ps.512
+  // CHECK: fcmp olt <16 x float> %{{.*}}, %{{.*}}
+  // CHECK: select <16 x i1> %{{.*}}, <16 x float> %{{.*}}, <16 x float> %{{.*}}
   // CHECK: select <16 x i1> %{{.*}}, <16 x float> %{{.*}}, <16 x float> %{{.*}}
   return _mm512_mask_min_ps (__W,__U,__A,__B);
 }
 
 __m512 test_mm512_maskz_min_ps (__mmask16 __U, __m512 __A, __m512 __B)
 {
   // CHECK-LABEL: @test_mm512_maskz_min_ps 
-  // CHECK: @llvm.x86.avx512.min.ps.512
+  // CHECK: fcmp olt <16 x float> %{{.*}}, %{{.*}}
+  // CHECK: select <16 x i1> %{{.*}}, <16 x float> %{{.*}}, <16 x float> %{{.*}}
+  // CHECK: store <16 x float> zeroinitializer, <16 x float>* %.compoundliteral.i.i, align 64
+  // CHECK: load <16 x float>, <16 x float>* %.compoundliteral.i.i, align 64
   // CHECK: select <16 x i1> %{{.*}}, <16 x float> %{{.*}}, <16 x float> %{{.*}}
   return _mm512_maskz_min_ps (__U,__A,__B);
 }
Index: lib/Headers/avx512fintrin.h
===
--- lib/Headers/avx512fintrin.h
+++ lib/Headers/avx512fintrin.h
@@ -818,6 +818,118 @@
   return (__m512i)((__v8du)__a ^ (__v8du)__b);
 }
 
+/* Compare */
+
+#define

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-22 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rC335339: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to 
native llvm IR (authored by GBuella, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D45616?vs=152060&id=152456#toc

Repository:
  rC Clang

https://reviews.llvm.org/D45616

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx-cmp-builtins.c
  test/CodeGen/avx2-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: lib/CodeGen/CGBuiltin.cpp
===
--- lib/CodeGen/CGBuiltin.cpp
+++ lib/CodeGen/CGBuiltin.cpp
@@ -10120,44 +10120,7 @@
 return Builder.CreateExtractValue(Call, 1);
   }
 
-  case X86::BI__builtin_ia32_cmpps128_mask:
-  case X86::BI__builtin_ia32_cmpps256_mask:
-  case X86::BI__builtin_ia32_cmpps512_mask:
-  case X86::BI__builtin_ia32_cmppd128_mask:
-  case X86::BI__builtin_ia32_cmppd256_mask:
-  case X86::BI__builtin_ia32_cmppd512_mask: {
-unsigned NumElts = Ops[0]->getType()->getVectorNumElements();
-Value *MaskIn = Ops[3];
-Ops.erase(&Ops[3]);
-
-Intrinsic::ID ID;
-switch (BuiltinID) {
-default: llvm_unreachable("Unsupported intrinsic!");
-case X86::BI__builtin_ia32_cmpps128_mask:
-  ID = Intrinsic::x86_avx512_mask_cmp_ps_128;
-  break;
-case X86::BI__builtin_ia32_cmpps256_mask:
-  ID = Intrinsic::x86_avx512_mask_cmp_ps_256;
-  break;
-case X86::BI__builtin_ia32_cmpps512_mask:
-  ID = Intrinsic::x86_avx512_mask_cmp_ps_512;
-  break;
-case X86::BI__builtin_ia32_cmppd128_mask:
-  ID = Intrinsic::x86_avx512_mask_cmp_pd_128;
-  break;
-case X86::BI__builtin_ia32_cmppd256_mask:
-  ID = Intrinsic::x86_avx512_mask_cmp_pd_256;
-  break;
-case X86::BI__builtin_ia32_cmppd512_mask:
-  ID = Intrinsic::x86_avx512_mask_cmp_pd_512;
-  break;
-}
-
-Value *Cmp = Builder.CreateCall(CGM.getIntrinsic(ID), Ops);
-return EmitX86MaskedCompareResult(*this, Cmp, NumElts, MaskIn);
-  }
-
-  // SSE packed comparison intrinsics
+  // packed comparison intrinsics
   case X86::BI__builtin_ia32_cmpeqps:
   case X86::BI__builtin_ia32_cmpeqpd:
 return getVectorFCmpIR(CmpInst::FCMP_OEQ);
@@ -10185,64 +10148,84 @@
   case X86::BI__builtin_ia32_cmpps:
   case X86::BI__builtin_ia32_cmpps256:
   case X86::BI__builtin_ia32_cmppd:
-  case X86::BI__builtin_ia32_cmppd256: {
+  case X86::BI__builtin_ia32_cmppd256:
+  case X86::BI__builtin_ia32_cmpps128_mask:
+  case X86::BI__builtin_ia32_cmpps256_mask:
+  case X86::BI__builtin_ia32_cmpps512_mask:
+  case X86::BI__builtin_ia32_cmppd128_mask:
+  case X86::BI__builtin_ia32_cmppd256_mask:
+  case X86::BI__builtin_ia32_cmppd512_mask: {
+// Lowering vector comparisons to fcmp instructions, while
+// ignoring signalling behaviour requested
+// ignoring rounding mode requested
+// This is is only possible as long as FENV_ACCESS is not implemented.
+// See also: https://reviews.llvm.org/D45616
+
+// The third argument is the comparison condition, and integer in the
+// range [0, 31]
 unsigned CC = cast(Ops[2])->getZExtValue() & 0x1f;
-// If this one of the SSE immediates, we can use native IR.
-if (CC < 8) {
-  FCmpInst::Predicate Pred;
-  switch (CC) {
-  case 0: Pred = FCmpInst::FCMP_OEQ; break;
-  case 1: Pred = FCmpInst::FCMP_OLT; break;
-  case 2: Pred = FCmpInst::FCMP_OLE; break;
-  case 3: Pred = FCmpInst::FCMP_UNO; break;
-  case 4: Pred = FCmpInst::FCMP_UNE; break;
-  case 5: Pred = FCmpInst::FCMP_UGE; break;
-  case 6: Pred = FCmpInst::FCMP_UGT; break;
-  case 7: Pred = FCmpInst::FCMP_ORD; break;
-  }
-  return getVectorFCmpIR(Pred);
+
+// Lowering to IR fcmp instruction.
+// Ignoring requested signaling behaviour,
+// e.g. both _CMP_GT_OS & _CMP_GT_OQ are translated to FCMP_OGT.
+FCmpInst::Predicate Pred;
+switch (CC) {
+case 0x00: Pred = FCmpInst::FCMP_OEQ; break;
+case 0x01: Pred = FCmpInst::FCMP_OLT; break;
+case 0x02: Pred = FCmpInst::FCMP_OLE; break;
+case 0x03: Pred = FCmpInst::FCMP_UNO; break;
+case 0x04: Pred = FCmpInst::FCMP_UNE; break;
+case 0x05: Pred = FCmpInst::FCMP_UGE; break;
+case 0x06: Pred = FCmpInst::FCMP_UGT; break;
+case 0x07: Pred = FCmpInst::FCMP_ORD; break;
+case 0x08: Pred = FCmpInst::FCMP_UEQ; break;
+case 0x09: Pred = FCmpInst::FCMP_ULT; break;
+case 0x0a: Pred = FCmpInst::FCMP_ULE; break;
+case 0x0c: Pred = FCmpInst::FCMP_ONE; break;
+case 0x0d: Pred = FCmpInst::FCMP_OGE; break;
+case 0x0e: Pred = FCmpInst::FCMP_OGT; break;
+case 0x10: Pred = FCmpInst::FCMP_OEQ; break;
+case 0x11: Pred = FCmpInst::FCMP_OLT; break;
+case 0x12: Pred = FCmpInst::FCMP_OLE; break;
+case 0x13: Pred = FCmpInst::FCMP_UNO; break;
+case 0x14: Pred = FCmpInst::FCMP_UNE; break;
+case 0x15: Pred

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-20 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

I was overzealous with the intrinsics, I lower really only the packed 
comparison now.


https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-20 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 152060.
GBuella edited the summary of this revision.

https://reviews.llvm.org/D45616

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx-cmp-builtins.c
  test/CodeGen/avx2-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: test/CodeGen/avx512vl-builtins.c
===
--- test/CodeGen/avx512vl-builtins.c
+++ test/CodeGen/avx512vl-builtins.c
@@ -1073,53 +1073,168 @@
 
 __mmask8 test_mm256_cmp_ps_mask(__m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_cmp_ps_mask
-  // CHECK: call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_ps_mask_true_uq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_true_us(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_oq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_os(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm256_mask_cmp_ps_mask(__mmask8 m, __m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
-  // CHECK: and <8 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> %{{.*}}, %{{.*}}
   return _mm256_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm_cmp_ps_mask(__m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_cmp_ps_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm_cmp_ps_mask_true_uq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_true_us(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_oq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_os(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm_mask_cmp_ps_mask(__mmask8 m, __m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
-  // CHECK: and <4 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
+  // CHECK: shufflevector <8 x i1> %{{.*}}, <8 x i1> %{{.*}}, <4 x i32> 
+  // CHECK: and <4 x i1> %{{.*}}, %{{.*}}
   return _mm_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm256_cmp_pd_mask(__m256d __A, __m256d __B) {
   // CHECK-LABEL: @test_mm256_cmp_pd_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.pd.256
+  // CHECK: fcmp oeq <4 x double> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_pd_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_pd_mask_true_uq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_true_us(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_oq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_os(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm256_mask_cmp_pd_mask(__mmask

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-19 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 151884.
GBuella added a comment.

Added `__builtin_ia32_cmpsd_mask` & `__builtin_ia32_cmpss_mask`.


https://reviews.llvm.org/D45616

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx-cmp-builtins.c
  test/CodeGen/avx2-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: test/CodeGen/avx512vl-builtins.c
===
--- test/CodeGen/avx512vl-builtins.c
+++ test/CodeGen/avx512vl-builtins.c
@@ -1073,53 +1073,168 @@
 
 __mmask8 test_mm256_cmp_ps_mask(__m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_cmp_ps_mask
-  // CHECK: call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_ps_mask_true_uq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_true_us(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_oq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_os(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm256_mask_cmp_ps_mask(__mmask8 m, __m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
-  // CHECK: and <8 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> %{{.*}}, %{{.*}}
   return _mm256_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm_cmp_ps_mask(__m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_cmp_ps_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm_cmp_ps_mask_true_uq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_true_us(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_oq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_os(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm_mask_cmp_ps_mask(__mmask8 m, __m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
-  // CHECK: and <4 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
+  // CHECK: shufflevector <8 x i1> %{{.*}}, <8 x i1> %{{.*}}, <4 x i32> 
+  // CHECK: and <4 x i1> %{{.*}}, %{{.*}}
   return _mm_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm256_cmp_pd_mask(__m256d __A, __m256d __B) {
   // CHECK-LABEL: @test_mm256_cmp_pd_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.pd.256
+  // CHECK: fcmp oeq <4 x double> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_pd_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_pd_mask_true_uq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_true_us(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_oq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_os(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OS);
+}

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-18 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 151710.

https://reviews.llvm.org/D45616

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx-cmp-builtins.c
  test/CodeGen/avx2-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: test/CodeGen/avx512vl-builtins.c
===
--- test/CodeGen/avx512vl-builtins.c
+++ test/CodeGen/avx512vl-builtins.c
@@ -1073,53 +1073,168 @@
 
 __mmask8 test_mm256_cmp_ps_mask(__m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_cmp_ps_mask
-  // CHECK: call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_ps_mask_true_uq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_true_us(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_oq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_os(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm256_mask_cmp_ps_mask(__mmask8 m, __m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
-  // CHECK: and <8 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> %{{.*}}, %{{.*}}
   return _mm256_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm_cmp_ps_mask(__m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_cmp_ps_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm_cmp_ps_mask_true_uq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_true_us(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_oq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_os(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm_mask_cmp_ps_mask(__mmask8 m, __m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
-  // CHECK: and <4 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
+  // CHECK: shufflevector <8 x i1> %{{.*}}, <8 x i1> %{{.*}}, <4 x i32> 
+  // CHECK: and <4 x i1> %{{.*}}, %{{.*}}
   return _mm_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm256_cmp_pd_mask(__m256d __A, __m256d __B) {
   // CHECK-LABEL: @test_mm256_cmp_pd_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.pd.256
+  // CHECK: fcmp oeq <4 x double> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_pd_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_pd_mask_true_uq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_true_us(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_oq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_os(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm256_mask_cmp_pd_mask(__mmask8 m, __m256d __A, __m256d __B) {
   // CHECK-

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-18 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

The question still left is, should we remove, auto upgrade the LLVM intrinsics 
not used anymore, or keep them around for when the signal behaviour is going to 
matter?


https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-14 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/CodeGen/CGBuiltin.cpp:10107-10112
+case 0x0b: // FALSE_OQ
+case 0x1b: // FALSE_OS
+  return llvm::Constant::getNullValue(ConvertType(E->getType()));
+case 0x0f: // TRUE_UQ
+case 0x1f: // TRUE_US
+  return llvm::Constant::getAllOnesValue(ConvertType(E->getType()));

spatel wrote:
> On 2nd thought, why are we optimizing when we have matching IR predicates for 
> these?
> Just translate to FCMP_TRUE / FCMP_FALSE instead of special-casing these 
> values.
> InstSimplify can handle the constant folding if optimization is on.
I don't know, these TRUE/FALSE cases were already handled here, I only 
rearranged the code.
Does this cause any problems? I mean, if it meant an extra dozen lines of code 
I would get it, but as it is, does it hurt anything?


https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-13 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 151203.
GBuella edited the summary of this revision.
GBuella added a comment.

Ignoring signaling behviour - and rounding mode with it.
Also lowering `__builtin_ia32_cmpsd` and `__builtin_ia32_cmpss`.


https://reviews.llvm.org/D45616

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx-cmp-builtins.c
  test/CodeGen/avx2-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: test/CodeGen/avx512vl-builtins.c
===
--- test/CodeGen/avx512vl-builtins.c
+++ test/CodeGen/avx512vl-builtins.c
@@ -1073,53 +1073,168 @@
 
 __mmask8 test_mm256_cmp_ps_mask(__m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_cmp_ps_mask
-  // CHECK: call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_ps_mask_true_uq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_true_us(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_oq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_os(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm256_mask_cmp_ps_mask(__mmask8 m, __m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
-  // CHECK: and <8 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> %{{.*}}, %{{.*}}
   return _mm256_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm_cmp_ps_mask(__m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_cmp_ps_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm_cmp_ps_mask_true_uq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_true_us(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_oq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_os(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm_mask_cmp_ps_mask(__mmask8 m, __m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
-  // CHECK: and <4 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
+  // CHECK: shufflevector <8 x i1> %{{.*}}, <8 x i1> %{{.*}}, <4 x i32> 
+  // CHECK: and <4 x i1> %{{.*}}, %{{.*}}
   return _mm_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm256_cmp_pd_mask(__m256d __A, __m256d __B) {
   // CHECK-LABEL: @test_mm256_cmp_pd_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.pd.256
+  // CHECK: fcmp oeq <4 x double> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_pd_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_pd_mask_true_uq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_true_us(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_oq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_os(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_os
+  // CHEC

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-13 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/CodeGen/CGBuiltin.cpp:10090-10100
+// _CMP_TRUE_UQ, _CMP_TRUE_US produce -1,-1... vector
+// on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produce 0, 0...
+if (CC == 0xf || CC == 0xb || CC == 0x1b || CC == 0x1f) {
+   llvm::Type *ResultType = ConvertType(E->getType());
+
+   Value *Constant = (CC == 0xf || CC == 0x1f) ?
+  llvm::Constant::getAllOnesValue(ResultType) :

spatel wrote:
>   llvm::Type *ResultType = ConvertType(E->getType());
>   if (CC == 0x0f || CC == 0x1f)
> return llvm::Constant::getAllOnesValue(ResultType);
>   if (CC == 0x0b || CC == 0x1b)
> return llvm::Constant::getNullValue(ResultType);
> 
> ?
> 
> Also, can we use the defined predicate names instead of hex values in this 
> code?
Well, I believe we would need to predefine them first.
I only found those in the client header `avxintrin.h`.


https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-12 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/CodeGen/CGBuiltin.cpp:10071
+  // is _MM_FROUND_CUR_DIRECTION
+  if (cast(Ops[4])->getZExtValue() != 4)
+UsesNonDefaultRounding = true;

efriedma wrote:
> Given we're ignoring floating-point exceptions, we should also ignore the 
> "rounding mode" operand (__MM_FROUND_NO_EXC only affects exceptions, and the 
> other values are irrelevant because there isn't any actual rounding involved).
Oh, alltight. @craig.topper , Do you also agree on ignoring all of these, and 
just lowering all the comparisions to fcmp?
That is the easiest path of course.


https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-11 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

Ping @efriedma


https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-06-11 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 150767.
GBuella added a comment.

I altered the code, to ignore the the signaling behaviour, as suggested.
Also, it handles more vector cmp builtins now.


https://reviews.llvm.org/D45616

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx-cmp-builtins.c
  test/CodeGen/avx2-builtins.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins.c

Index: test/CodeGen/avx512vl-builtins.c
===
--- test/CodeGen/avx512vl-builtins.c
+++ test/CodeGen/avx512vl-builtins.c
@@ -1073,53 +1073,168 @@
 
 __mmask8 test_mm256_cmp_ps_mask(__m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_cmp_ps_mask
-  // CHECK: call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_ps_mask_true_uq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_true_us(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_oq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_os(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm256_mask_cmp_ps_mask(__mmask8 m, __m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
-  // CHECK: and <8 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <8 x float> %{{.*}}, %{{.*}}
+  // CHECK: and <8 x i1> %{{.*}}, %{{.*}}
   return _mm256_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm_cmp_ps_mask(__m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_cmp_ps_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
   return (__mmask8)_mm_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm_cmp_ps_mask_true_uq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_true_us(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_oq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_os(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm_mask_cmp_ps_mask(__mmask8 m, __m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_mask_cmp_ps_mask
-  // CHECK: [[CMP:%.*]] = call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
-  // CHECK: and <4 x i1> [[CMP]], {{.*}}
+  // CHECK: fcmp oeq <4 x float> %{{.*}}, %{{.*}}
+  // CHECK: shufflevector <8 x i1> %{{.*}}, <8 x i1> %{{.*}}, <4 x i32> 
+  // CHECK: and <4 x i1> %{{.*}}, %{{.*}}
   return _mm_mask_cmp_ps_mask(m, __A, __B, 0);
 }
 
 __mmask8 test_mm256_cmp_pd_mask(__m256d __A, __m256d __B) {
   // CHECK-LABEL: @test_mm256_cmp_pd_mask
-  // CHECK: call <4 x i1> @llvm.x86.avx512.mask.cmp.pd.256
+  // CHECK: fcmp oeq <4 x double> %{{.*}}, %{{.*}}
   return (__mmask8)_mm256_cmp_pd_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_pd_mask_true_uq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_true_us(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: ret i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_oq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_os(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: ret i8 0
+  return (__

[PATCH] D47912: [CMake] Consider LLVM_APPEND_VC_REV when generating SVNVersion.inc

2018-06-07 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D47912#1125803, @smeenai wrote:

> I believe LLVM_APPEND_VC_REV controls whether the revision is appended to 
> LLVM version string (e.g. the output of `llvm-config --version`), whereas the 
> SVNRevision.inc file (which is what's causing the relink after amending) is 
> used for e.g. the `clang --version` output, so I'm not sure this is the right 
> thing to do. I would also love for an option to disable the SVNRevision stuff 
> entirely though.


Right, that is why I added the word "quickfix".
Perhaps we can add a `CLANG_APPEND_VC_REV` CMake option?


Repository:
  rC Clang

https://reviews.llvm.org/D47912



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47912: [CMake] Consider LLVM_APPEND_VC_REV when generating SVNVersion.inc

2018-06-07 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added reviewers: beanz, bogner, hintonda.
Herald added subscribers: cfe-commits, mgorny.

This is merely a quickfix, due to being fustrated with waiting
for ~70 objects to be linked each time following `git commit --amend`.
When toggling LLVM_APPEND_VC_REV between ON and OFF, the
version inc file is still updated. Still, this is arguable
an improvement.


Repository:
  rC Clang

https://reviews.llvm.org/D47912

Files:
  lib/Basic/CMakeLists.txt


Index: lib/Basic/CMakeLists.txt
===
--- lib/Basic/CMakeLists.txt
+++ lib/Basic/CMakeLists.txt
@@ -12,7 +12,7 @@
 
 set(get_svn_script "${LLVM_CMAKE_PATH}/GetSVN.cmake")
 
-if(DEFINED llvm_vc AND DEFINED clang_vc)
+if(DEFINED llvm_vc AND DEFINED clang_vc AND LLVM_APPEND_VC_REV)
   # Create custom target to generate the VC revision include.
   add_custom_command(OUTPUT "${version_inc}"
 DEPENDS "${llvm_vc}" "${clang_vc}" "${get_svn_script}"


Index: lib/Basic/CMakeLists.txt
===
--- lib/Basic/CMakeLists.txt
+++ lib/Basic/CMakeLists.txt
@@ -12,7 +12,7 @@
 
 set(get_svn_script "${LLVM_CMAKE_PATH}/GetSVN.cmake")
 
-if(DEFINED llvm_vc AND DEFINED clang_vc)
+if(DEFINED llvm_vc AND DEFINED clang_vc AND LLVM_APPEND_VC_REV)
   # Create custom target to generate the VC revision include.
   add_custom_command(OUTPUT "${version_inc}"
 DEPENDS "${llvm_vc}" "${clang_vc}" "${get_svn_script}"
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-06-07 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL334174: [CodeGen] Improve diagnostics related to target 
attributes (authored by GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D46541?vs=150018&id=150277#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D46541

Files:
  cfe/trunk/lib/CodeGen/CodeGenFunction.cpp
  cfe/trunk/lib/CodeGen/CodeGenModule.cpp
  cfe/trunk/lib/CodeGen/CodeGenModule.h
  cfe/trunk/test/CodeGen/target-features-error-2.c
  cfe/trunk/test/CodeGen/target-features-error.c

Index: cfe/trunk/test/CodeGen/target-features-error-2.c
===
--- cfe/trunk/test/CodeGen/target-features-error-2.c
+++ cfe/trunk/test/CodeGen/target-features-error-2.c
@@ -1,38 +1,46 @@
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_SSE42
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_1
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX512f
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -target-feature +movdir64b -S -verify -o - -D NEED_MOVDIRI
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -target-feature +avx512vnni -target-feature +movdiri -S -verify -o - -D NEED_CLWB
 
 #define __MM_MALLOC_H
 #include 
 
-#if NEED_SSE42
+#if NEED_AVX_1
 int baz(__m256i a) {
   return _mm256_extract_epi32(a, 3); // expected-error {{'__builtin_ia32_vec_ext_v8si' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_1
+#if NEED_AVX_2
 __m128 need_avx(__m128 a, __m128 b) {
   return _mm_cmp_ps(a, b, 0); // expected-error {{'__builtin_ia32_cmpps' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_2
-__m128 need_avx(__m128 a, __m128 b) {
-  return _mm_cmp_ss(a, b, 0); // expected-error {{'__builtin_ia32_cmpss' needs target feature avx}}
+#if NEED_AVX512f
+unsigned short need_avx512f(unsigned short a, unsigned short b) {
+  return __builtin_ia32_korhi(a, b); // expected-error {{'__builtin_ia32_korhi' needs target feature avx512f}}
 }
 #endif
 
-#if NEED_AVX_3
-__m128d need_avx(__m128d a, __m128d b) {
-  return _mm_cmp_pd(a, b, 0); // expected-error {{'__builtin_ia32_cmppd' needs target feature avx}}
+#if NEED_MOVDIRI
+void need_movdiri(unsigned int *a, unsigned int b) {
+  __builtin_ia32_directstore_u32(a, b); // expected-error {{'__builtin_ia32_directstore_u32' needs target feature movdiri}}
 }
 #endif
 
-#if NEED_AVX_4
-__m128d need_avx(__m128d a, __m128d b) {
-  return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs target feature avx}}
+#if NEED_CLWB
+static __inline__ void
+ __attribute__((__always_inline__, __nodebug__,  __target__("avx512vnni,clwb,movdiri,movdir64b")))
+ func(unsigned int *a, unsigned int b)
+{
+  __builtin_ia32_directstore_u32(a, b);
+}
+
+void need_clwb(unsigned int *a, unsigned int b) {
+  func(a, b); // expected-error {{always_inline function 'func' requires target feature 'clwb', but would be inlined into function 'need_clwb' that is compiled without support for 'clwb'}}
+
 }
 #endif
Index: cfe/trunk/test/CodeGen/target-features-error.c
===
--- cfe/trunk/test/CodeGen/target-features-error.c
+++ cfe/trunk/test/CodeGen/target-features-error.c
@@ -3,6 +3,5 @@
   return a + 4;
 }
 int bar() {
-  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'sse4.2', but would be inlined into function 'bar' that is compiled without support for 'sse4.2'}}
+  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'avx', but would be inlined into function 'bar' that is compiled without support for 'avx'}}
 }
-
Index: cfe/trunk/lib/CodeGen/CodeGenModule.cpp
===
--- cfe/trunk/lib/CodeGen/CodeGenModule.cpp
+++ cfe/trunk/lib/CodeGen/CodeGenModule.cpp
@@ -5033,22 +5033,28 @@
   }
 }
 
+TargetAttr::ParsedTargetAttr CodeGenModule::filterFunctionTargetAttrs(const TargetAttr *TD) {
+  assert(TD != nullptr);
+  TargetAttr::ParsedTargetAttr ParsedAttr = TD->parse();
+
+  ParsedAttr.Features.erase(
+  llvm::remove_if(ParsedAttr.Features,
+  [&](const std::string &Feat) {
+return !Target.isValidFeatureName(
+StringRef{Feat}.substr(1));
+  }),
+  ParsedAttr.Features.end());
+  return ParsedAttr;
+}
+
+
 // Fills in the supplied string map with the set of target features for the
 // passed in function.
 void CodeGenModule::getFunctionFeatureMap(llvm::StringMap &FeatureMap,
   const FunctionDecl *FD) {

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-06-05 Thread Gabor Buella via Phabricator via cfe-commits

GBuella marked an inline comment as done.
GBuella added inline comments.



Comment at: test/CodeGen/target-features-error-2.c:39
 __m128d need_avx(__m128d a, __m128d b) {
   return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs 
target feature avx}}
 }

craig.topper wrote:
> GBuella wrote:
> > craig.topper wrote:
> > > The 4 compare functions here are basically the same for the purpose of 
> > > this check. What value do we get by testing all 4 of them?
> > Which four more specifically?
> _mm_cmp_ps, _mm_cmp_ss, _mm_cmp_pd, _mm_cmp_sd. Or the cases for 
> NEED_AVX_2-5. They're all basically checking the same thing. There is nothing 
> different about the command lines other than the -D NEED_AVX
Yes, I guess we could just remove NEED_AVX_3,4,5


https://reviews.llvm.org/D46541



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-06-05 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 150018.

https://reviews.llvm.org/D46541

Files:
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenModule.h
  test/CodeGen/target-features-error-2.c
  test/CodeGen/target-features-error.c

Index: test/CodeGen/target-features-error.c
===
--- test/CodeGen/target-features-error.c
+++ test/CodeGen/target-features-error.c
@@ -3,6 +3,5 @@
   return a + 4;
 }
 int bar() {
-  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'sse4.2', but would be inlined into function 'bar' that is compiled without support for 'sse4.2'}}
+  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'avx', but would be inlined into function 'bar' that is compiled without support for 'avx'}}
 }
-
Index: test/CodeGen/target-features-error-2.c
===
--- test/CodeGen/target-features-error-2.c
+++ test/CodeGen/target-features-error-2.c
@@ -1,38 +1,46 @@
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_SSE42
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_1
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX512f
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -target-feature +movdir64b -S -verify -o - -D NEED_MOVDIRI
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -target-feature +avx512vnni -target-feature +movdiri -S -verify -o - -D NEED_CLWB
 
 #define __MM_MALLOC_H
 #include 
 
-#if NEED_SSE42
+#if NEED_AVX_1
 int baz(__m256i a) {
-  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'sse4.2', but would be inlined into function 'baz' that is compiled without support for 'sse4.2'}}
+  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'avx', but would be inlined into function 'baz' that is compiled without support for 'avx'}}
 }
 #endif
 
-#if NEED_AVX_1
+#if NEED_AVX_2
 __m128 need_avx(__m128 a, __m128 b) {
   return _mm_cmp_ps(a, b, 0); // expected-error {{'__builtin_ia32_cmpps' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_2
-__m128 need_avx(__m128 a, __m128 b) {
-  return _mm_cmp_ss(a, b, 0); // expected-error {{'__builtin_ia32_cmpss' needs target feature avx}}
+#if NEED_AVX512f
+unsigned short need_avx512f(unsigned short a, unsigned short b) {
+  return __builtin_ia32_korhi(a, b); // expected-error {{'__builtin_ia32_korhi' needs target feature avx512f}}
 }
 #endif
 
-#if NEED_AVX_3
-__m128d need_avx(__m128d a, __m128d b) {
-  return _mm_cmp_pd(a, b, 0); // expected-error {{'__builtin_ia32_cmppd' needs target feature avx}}
+#if NEED_MOVDIRI
+void need_movdiri(unsigned int *a, unsigned int b) {
+  __builtin_ia32_directstore_u32(a, b); // expected-error {{'__builtin_ia32_directstore_u32' needs target feature movdiri}}
 }
 #endif
 
-#if NEED_AVX_4
-__m128d need_avx(__m128d a, __m128d b) {
-  return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs target feature avx}}
+#if NEED_CLWB
+static __inline__ void
+ __attribute__((__always_inline__, __nodebug__,  __target__("avx512vnni,clwb,movdiri,movdir64b")))
+ func(unsigned int *a, unsigned int b)
+{
+  __builtin_ia32_directstore_u32(a, b);
+}
+
+void need_clwb(unsigned int *a, unsigned int b) {
+  func(a, b); // expected-error {{always_inline function 'func' requires target feature 'clwb', but would be inlined into function 'need_clwb' that is compiled without support for 'clwb'}}
+
 }
 #endif
Index: lib/CodeGen/CodeGenModule.h
===
--- lib/CodeGen/CodeGenModule.h
+++ lib/CodeGen/CodeGenModule.h
@@ -1089,6 +1089,10 @@
   /// It's up to you to ensure that this is safe.
   void AddDefaultFnAttrs(llvm::Function &F);
 
+  /// Parses the target attributes passed in, and returns only the ones that are
+  /// valid feature names.
+  TargetAttr::ParsedTargetAttr filterFunctionTargetAttrs(const TargetAttr *TD);
+
   // Fills in the supplied string map with the set of target features for the
   // passed in function.
   void getFunctionFeatureMap(llvm::StringMap &FeatureMap,
Index: lib/CodeGen/CodeGenModule.cpp
===
--- lib/CodeGen/CodeGenModule.cpp
+++ lib/CodeGen/CodeGenModule.cpp
@@ -5033,22 +5033,28 @@
   }
 }
 
+TargetAttr::ParsedTargetAttr CodeGenModule::filterFunctionTargetAttrs(const TargetAttr *TD) {
+  assert(TD != nullptr);
+  TargetAttr::ParsedTargetAttr ParsedAttr = TD->parse();
+
+  ParsedAttr.Features.erase(
+  llv

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-06-05 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: test/CodeGen/target-features-error-2.c:39
 __m128d need_avx(__m128d a, __m128d b) {
   return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs 
target feature avx}}
 }

craig.topper wrote:
> The 4 compare functions here are basically the same for the purpose of this 
> check. What value do we get by testing all 4 of them?
Which four more specifically?


https://reviews.llvm.org/D46541



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47142: [x86] invpcid intrinsic

2018-05-24 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL333256: [x86] invpcid intrinsic (authored by GBuella, 
committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D47142?vs=148341&id=148547#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D47142

Files:
  cfe/trunk/include/clang/Basic/BuiltinsX86.def
  cfe/trunk/include/clang/Driver/Options.td
  cfe/trunk/lib/Basic/Targets/X86.cpp
  cfe/trunk/lib/Basic/Targets/X86.h
  cfe/trunk/lib/Headers/CMakeLists.txt
  cfe/trunk/lib/Headers/cpuid.h
  cfe/trunk/lib/Headers/immintrin.h
  cfe/trunk/lib/Headers/invpcidintrin.h
  cfe/trunk/lib/Headers/module.modulemap
  cfe/trunk/test/CodeGen/invpcid.c
  cfe/trunk/test/Driver/x86-target-features.c
  cfe/trunk/test/Preprocessor/predefined-arch-macros.c

Index: cfe/trunk/include/clang/Driver/Options.td
===
--- cfe/trunk/include/clang/Driver/Options.td
+++ cfe/trunk/include/clang/Driver/Options.td
@@ -2685,6 +2685,8 @@
 def mno_fsgsbase : Flag<["-"], "mno-fsgsbase">, Group;
 def mfxsr : Flag<["-"], "mfxsr">, Group;
 def mno_fxsr : Flag<["-"], "mno-fxsr">, Group;
+def minvpcid : Flag<["-"], "minvpcid">, Group;
+def mno_invpcid : Flag<["-"], "mno-invpcid">, Group;
 def mgfni : Flag<["-"], "mgfni">, Group;
 def mno_gfni : Flag<["-"], "mno-gfni">, Group;
 def mlwp : Flag<["-"], "mlwp">, Group;
Index: cfe/trunk/include/clang/Basic/BuiltinsX86.def
===
--- cfe/trunk/include/clang/Basic/BuiltinsX86.def
+++ cfe/trunk/include/clang/Basic/BuiltinsX86.def
@@ -1867,6 +1867,9 @@
 // PTWRITE
 TARGET_BUILTIN(__builtin_ia32_ptwrite32, "vUi", "n", "ptwrite")
 
+// INVPCID
+TARGET_BUILTIN(__builtin_ia32_invpcid, "vUiv*", "nc", "invpcid")
+
 // MSVC
 TARGET_HEADER_BUILTIN(_BitScanForward, "UcUNi*UNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
 TARGET_HEADER_BUILTIN(_BitScanReverse, "UcUNi*UNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
Index: cfe/trunk/test/Driver/x86-target-features.c
===
--- cfe/trunk/test/Driver/x86-target-features.c
+++ cfe/trunk/test/Driver/x86-target-features.c
@@ -164,3 +164,8 @@
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-ptwrite %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PTWRITE %s
 // PTWRITE: "-target-feature" "+ptwrite"
 // NO-PTWRITE: "-target-feature" "-ptwrite"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -minvpcid %s -### -o %t.o 2>&1 | FileCheck -check-prefix=INVPCID %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-invpcid %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-INVPCID %s
+// INVPCID: "-target-feature" "+invpcid"
+// NO-INVPCID: "-target-feature" "-invpcid"
Index: cfe/trunk/test/Preprocessor/predefined-arch-macros.c
===
--- cfe/trunk/test/Preprocessor/predefined-arch-macros.c
+++ cfe/trunk/test/Preprocessor/predefined-arch-macros.c
@@ -526,6 +526,7 @@
 // CHECK_CORE_AVX2_M32: #define __BMI__ 1
 // CHECK_CORE_AVX2_M32: #define __F16C__ 1
 // CHECK_CORE_AVX2_M32: #define __FMA__ 1
+// CHECK_CORE_AVX2_M32: #define __INVPCID__ 1
 // CHECK_CORE_AVX2_M32: #define __LZCNT__ 1
 // CHECK_CORE_AVX2_M32: #define __MMX__ 1
 // CHECK_CORE_AVX2_M32: #define __PCLMUL__ 1
@@ -556,6 +557,7 @@
 // CHECK_CORE_AVX2_M64: #define __BMI__ 1
 // CHECK_CORE_AVX2_M64: #define __F16C__ 1
 // CHECK_CORE_AVX2_M64: #define __FMA__ 1
+// CHECK_CORE_AVX2_M64: #define __INVPCID__ 1
 // CHECK_CORE_AVX2_M64: #define __LZCNT__ 1
 // CHECK_CORE_AVX2_M64: #define __MMX__ 1
 // CHECK_CORE_AVX2_M64: #define __PCLMUL__ 1
@@ -590,6 +592,7 @@
 // CHECK_BROADWELL_M32: #define __BMI__ 1
 // CHECK_BROADWELL_M32: #define __F16C__ 1
 // CHECK_BROADWELL_M32: #define __FMA__ 1
+// CHECK_BROADWELL_M32: #define __INVPCID__ 1
 // CHECK_BROADWELL_M32: #define __LZCNT__ 1
 // CHECK_BROADWELL_M32: #define __MMX__ 1
 // CHECK_BROADWELL_M32: #define __PCLMUL__ 1
@@ -623,6 +626,7 @@
 // CHECK_BROADWELL_M64: #define __BMI__ 1
 // CHECK_BROADWELL_M64: #define __F16C__ 1
 // CHECK_BROADWELL_M64: #define __FMA__ 1
+// CHECK_BROADWELL_M64: #define __INVPCID__ 1
 // CHECK_BROADWELL_M64: #define __LZCNT__ 1
 // CHECK_BROADWELL_M64: #define __MMX__ 1
 // CHECK_BROADWELL_M64: #define __PCLMUL__ 1
@@ -660,6 +664,7 @@
 // CHECK_SKL_M32: #define __CLFLUSHOPT__ 1
 // CHECK_SKL_M32: #define __F16C__ 1
 // CHECK_SKL_M32: #define __FMA__ 1
+// CHECK_SKL_M32: #define __INVPCID__ 1
 // CHECK_SKL_M32: #define __LZCNT__ 1
 // CHECK_SKL_M32: #define __MMX__ 1
 // CHECK_SKL_M32: #define __MPX__ 1
@@ -694,6 +699,7 @@
 // CHECK_SKL_M64: #define __CLFLUSHOPT__ 1
 // CHECK_SKL_M64: #define __F16C__ 1
 // CHECK_SKL_M64: #define __FMA__ 1
+// CHECK_SKL_M64: #define __INVPCID__ 1
 // CHECK_SKL_M64: #define __LZCNT__ 1
 // CHECK_SKL_M64: #define __MMX__ 1
 // CHECK_SKL_M64: #define __M

[PATCH] D47142: [x86] invpcid intrinsic

2018-05-23 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 148341.
GBuella added a comment.

Relocated header inclusion from x86intrin.h -> immintrin.h
I gave up on the MSVC intrinsic idea, it wouldn't work in a way compatible with 
MSVC as long as LLVM requires the feature to be enabled anyways. If once we 
decide to remove such feature flags for such system programming instructions, 
we can get back this.


https://reviews.llvm.org/D47142

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/immintrin.h
  lib/Headers/invpcidintrin.h
  lib/Headers/module.modulemap
  test/CodeGen/invpcid.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -526,6 +526,7 @@
 // CHECK_CORE_AVX2_M32: #define __BMI__ 1
 // CHECK_CORE_AVX2_M32: #define __F16C__ 1
 // CHECK_CORE_AVX2_M32: #define __FMA__ 1
+// CHECK_CORE_AVX2_M32: #define __INVPCID__ 1
 // CHECK_CORE_AVX2_M32: #define __LZCNT__ 1
 // CHECK_CORE_AVX2_M32: #define __MMX__ 1
 // CHECK_CORE_AVX2_M32: #define __PCLMUL__ 1
@@ -556,6 +557,7 @@
 // CHECK_CORE_AVX2_M64: #define __BMI__ 1
 // CHECK_CORE_AVX2_M64: #define __F16C__ 1
 // CHECK_CORE_AVX2_M64: #define __FMA__ 1
+// CHECK_CORE_AVX2_M64: #define __INVPCID__ 1
 // CHECK_CORE_AVX2_M64: #define __LZCNT__ 1
 // CHECK_CORE_AVX2_M64: #define __MMX__ 1
 // CHECK_CORE_AVX2_M64: #define __PCLMUL__ 1
@@ -590,6 +592,7 @@
 // CHECK_BROADWELL_M32: #define __BMI__ 1
 // CHECK_BROADWELL_M32: #define __F16C__ 1
 // CHECK_BROADWELL_M32: #define __FMA__ 1
+// CHECK_BROADWELL_M32: #define __INVPCID__ 1
 // CHECK_BROADWELL_M32: #define __LZCNT__ 1
 // CHECK_BROADWELL_M32: #define __MMX__ 1
 // CHECK_BROADWELL_M32: #define __PCLMUL__ 1
@@ -623,6 +626,7 @@
 // CHECK_BROADWELL_M64: #define __BMI__ 1
 // CHECK_BROADWELL_M64: #define __F16C__ 1
 // CHECK_BROADWELL_M64: #define __FMA__ 1
+// CHECK_BROADWELL_M64: #define __INVPCID__ 1
 // CHECK_BROADWELL_M64: #define __LZCNT__ 1
 // CHECK_BROADWELL_M64: #define __MMX__ 1
 // CHECK_BROADWELL_M64: #define __PCLMUL__ 1
@@ -660,6 +664,7 @@
 // CHECK_SKL_M32: #define __CLFLUSHOPT__ 1
 // CHECK_SKL_M32: #define __F16C__ 1
 // CHECK_SKL_M32: #define __FMA__ 1
+// CHECK_SKL_M32: #define __INVPCID__ 1
 // CHECK_SKL_M32: #define __LZCNT__ 1
 // CHECK_SKL_M32: #define __MMX__ 1
 // CHECK_SKL_M32: #define __MPX__ 1
@@ -694,6 +699,7 @@
 // CHECK_SKL_M64: #define __CLFLUSHOPT__ 1
 // CHECK_SKL_M64: #define __F16C__ 1
 // CHECK_SKL_M64: #define __FMA__ 1
+// CHECK_SKL_M64: #define __INVPCID__ 1
 // CHECK_SKL_M64: #define __LZCNT__ 1
 // CHECK_SKL_M64: #define __MMX__ 1
 // CHECK_SKL_M64: #define __MPX__ 1
@@ -888,6 +894,7 @@
 // CHECK_SKX_M32: #define __CLWB__ 1
 // CHECK_SKX_M32: #define __F16C__ 1
 // CHECK_SKX_M32: #define __FMA__ 1
+// CHECK_SKX_M32: #define __INVPCID__ 1
 // CHECK_SKX_M32: #define __LZCNT__ 1
 // CHECK_SKX_M32: #define __MMX__ 1
 // CHECK_SKX_M32: #define __MPX__ 1
@@ -933,6 +940,7 @@
 // CHECK_SKX_M64: #define __CLWB__ 1
 // CHECK_SKX_M64: #define __F16C__ 1
 // CHECK_SKX_M64: #define __FMA__ 1
+// CHECK_SKX_M64: #define __INVPCID__ 1
 // CHECK_SKX_M64: #define __LZCNT__ 1
 // CHECK_SKX_M64: #define __MMX__ 1
 // CHECK_SKX_M64: #define __MPX__ 1
@@ -983,6 +991,7 @@
 // CHECK_CNL_M32-NOT: #define __CLWB__ 1
 // CHECK_CNL_M32: #define __F16C__ 1
 // CHECK_CNL_M32: #define __FMA__ 1
+// CHECK_CNL_M32: #define __INVPCID__ 1
 // CHECK_CNL_M32: #define __LZCNT__ 1
 // CHECK_CNL_M32: #define __MMX__ 1
 // CHECK_CNL_M32: #define __MPX__ 1
@@ -1031,6 +1040,7 @@
 // CHECK_CNL_M64-NOT: #define __CLWB__ 1
 // CHECK_CNL_M64: #define __F16C__ 1
 // CHECK_CNL_M64: #define __FMA__ 1
+// CHECK_CNL_M64: #define __INVPCID__ 1
 // CHECK_CNL_M64: #define __LZCNT__ 1
 // CHECK_CNL_M64: #define __MMX__ 1
 // CHECK_CNL_M64: #define __MPX__ 1
@@ -1085,6 +1095,7 @@
 // CHECK_ICL_M32: #define __F16C__ 1
 // CHECK_ICL_M32: #define __FMA__ 1
 // CHECK_ICL_M32: #define __GFNI__ 1
+// CHECK_ICL_M32: #define __INVPCID__ 1
 // CHECK_ICL_M32: #define __LZCNT__ 1
 // CHECK_ICL_M32: #define __MMX__ 1
 // CHECK_ICL_M32: #define __MPX__ 1
@@ -1142,6 +1153,7 @@
 // CHECK_ICL_M64: #define __F16C__ 1
 // CHECK_ICL_M64: #define __FMA__ 1
 // CHECK_ICL_M64: #define __GFNI__ 1
+// CHECK_ICL_M64: #define __INVPCID__ 1
 // CHECK_ICL_M64: #define __LZCNT__ 1
 // CHECK_ICL_M64: #define __MMX__ 1
 // CHECK_ICL_M64: #define __MPX__ 1
@@ -1200,6 +1212,7 @@
 // CHECK_ICX_M32: #define __F16C__ 1
 // CHECK_ICX_M32: #define __FMA__ 1
 // CHECK_ICX_M32: #define __GFNI__ 1
+// CHECK_ICX_M32: #define __INVPCID__ 1
 // CHECK_ICX_M32: #define __LZCNT__ 1
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
@@ -1258,6 +1271,7 @@
 // CHECK_ICX_M64: #define

[PATCH] D47142: [x86] invpcid intrinsic

2018-05-22 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 148011.

https://reviews.llvm.org/D47142

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/invpcidintrin.h
  lib/Headers/module.modulemap
  lib/Headers/x86intrin.h
  test/CodeGen/invpcid.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -526,6 +526,7 @@
 // CHECK_CORE_AVX2_M32: #define __BMI__ 1
 // CHECK_CORE_AVX2_M32: #define __F16C__ 1
 // CHECK_CORE_AVX2_M32: #define __FMA__ 1
+// CHECK_CORE_AVX2_M32: #define __INVPCID__ 1
 // CHECK_CORE_AVX2_M32: #define __LZCNT__ 1
 // CHECK_CORE_AVX2_M32: #define __MMX__ 1
 // CHECK_CORE_AVX2_M32: #define __PCLMUL__ 1
@@ -556,6 +557,7 @@
 // CHECK_CORE_AVX2_M64: #define __BMI__ 1
 // CHECK_CORE_AVX2_M64: #define __F16C__ 1
 // CHECK_CORE_AVX2_M64: #define __FMA__ 1
+// CHECK_CORE_AVX2_M64: #define __INVPCID__ 1
 // CHECK_CORE_AVX2_M64: #define __LZCNT__ 1
 // CHECK_CORE_AVX2_M64: #define __MMX__ 1
 // CHECK_CORE_AVX2_M64: #define __PCLMUL__ 1
@@ -590,6 +592,7 @@
 // CHECK_BROADWELL_M32: #define __BMI__ 1
 // CHECK_BROADWELL_M32: #define __F16C__ 1
 // CHECK_BROADWELL_M32: #define __FMA__ 1
+// CHECK_BROADWELL_M32: #define __INVPCID__ 1
 // CHECK_BROADWELL_M32: #define __LZCNT__ 1
 // CHECK_BROADWELL_M32: #define __MMX__ 1
 // CHECK_BROADWELL_M32: #define __PCLMUL__ 1
@@ -623,6 +626,7 @@
 // CHECK_BROADWELL_M64: #define __BMI__ 1
 // CHECK_BROADWELL_M64: #define __F16C__ 1
 // CHECK_BROADWELL_M64: #define __FMA__ 1
+// CHECK_BROADWELL_M64: #define __INVPCID__ 1
 // CHECK_BROADWELL_M64: #define __LZCNT__ 1
 // CHECK_BROADWELL_M64: #define __MMX__ 1
 // CHECK_BROADWELL_M64: #define __PCLMUL__ 1
@@ -660,6 +664,7 @@
 // CHECK_SKL_M32: #define __CLFLUSHOPT__ 1
 // CHECK_SKL_M32: #define __F16C__ 1
 // CHECK_SKL_M32: #define __FMA__ 1
+// CHECK_SKL_M32: #define __INVPCID__ 1
 // CHECK_SKL_M32: #define __LZCNT__ 1
 // CHECK_SKL_M32: #define __MMX__ 1
 // CHECK_SKL_M32: #define __MPX__ 1
@@ -694,6 +699,7 @@
 // CHECK_SKL_M64: #define __CLFLUSHOPT__ 1
 // CHECK_SKL_M64: #define __F16C__ 1
 // CHECK_SKL_M64: #define __FMA__ 1
+// CHECK_SKL_M64: #define __INVPCID__ 1
 // CHECK_SKL_M64: #define __LZCNT__ 1
 // CHECK_SKL_M64: #define __MMX__ 1
 // CHECK_SKL_M64: #define __MPX__ 1
@@ -888,6 +894,7 @@
 // CHECK_SKX_M32: #define __CLWB__ 1
 // CHECK_SKX_M32: #define __F16C__ 1
 // CHECK_SKX_M32: #define __FMA__ 1
+// CHECK_SKX_M32: #define __INVPCID__ 1
 // CHECK_SKX_M32: #define __LZCNT__ 1
 // CHECK_SKX_M32: #define __MMX__ 1
 // CHECK_SKX_M32: #define __MPX__ 1
@@ -933,6 +940,7 @@
 // CHECK_SKX_M64: #define __CLWB__ 1
 // CHECK_SKX_M64: #define __F16C__ 1
 // CHECK_SKX_M64: #define __FMA__ 1
+// CHECK_SKX_M64: #define __INVPCID__ 1
 // CHECK_SKX_M64: #define __LZCNT__ 1
 // CHECK_SKX_M64: #define __MMX__ 1
 // CHECK_SKX_M64: #define __MPX__ 1
@@ -983,6 +991,7 @@
 // CHECK_CNL_M32-NOT: #define __CLWB__ 1
 // CHECK_CNL_M32: #define __F16C__ 1
 // CHECK_CNL_M32: #define __FMA__ 1
+// CHECK_CNL_M32: #define __INVPCID__ 1
 // CHECK_CNL_M32: #define __LZCNT__ 1
 // CHECK_CNL_M32: #define __MMX__ 1
 // CHECK_CNL_M32: #define __MPX__ 1
@@ -1031,6 +1040,7 @@
 // CHECK_CNL_M64-NOT: #define __CLWB__ 1
 // CHECK_CNL_M64: #define __F16C__ 1
 // CHECK_CNL_M64: #define __FMA__ 1
+// CHECK_CNL_M64: #define __INVPCID__ 1
 // CHECK_CNL_M64: #define __LZCNT__ 1
 // CHECK_CNL_M64: #define __MMX__ 1
 // CHECK_CNL_M64: #define __MPX__ 1
@@ -1085,6 +1095,7 @@
 // CHECK_ICL_M32: #define __F16C__ 1
 // CHECK_ICL_M32: #define __FMA__ 1
 // CHECK_ICL_M32: #define __GFNI__ 1
+// CHECK_ICL_M32: #define __INVPCID__ 1
 // CHECK_ICL_M32: #define __LZCNT__ 1
 // CHECK_ICL_M32: #define __MMX__ 1
 // CHECK_ICL_M32: #define __MPX__ 1
@@ -1142,6 +1153,7 @@
 // CHECK_ICL_M64: #define __F16C__ 1
 // CHECK_ICL_M64: #define __FMA__ 1
 // CHECK_ICL_M64: #define __GFNI__ 1
+// CHECK_ICL_M64: #define __INVPCID__ 1
 // CHECK_ICL_M64: #define __LZCNT__ 1
 // CHECK_ICL_M64: #define __MMX__ 1
 // CHECK_ICL_M64: #define __MPX__ 1
@@ -1200,6 +1212,7 @@
 // CHECK_ICX_M32: #define __F16C__ 1
 // CHECK_ICX_M32: #define __FMA__ 1
 // CHECK_ICX_M32: #define __GFNI__ 1
+// CHECK_ICX_M32: #define __INVPCID__ 1
 // CHECK_ICX_M32: #define __LZCNT__ 1
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
@@ -1258,6 +1271,7 @@
 // CHECK_ICX_M64: #define __F16C__ 1
 // CHECK_ICX_M64: #define __FMA__ 1
 // CHECK_ICX_M64: #define __GFNI__ 1
+// CHECK_ICX_M64: #define __INVPCID__ 1
 // CHECK_ICX_M64: #define __LZCNT__ 1
 // CHECK_ICX_M64: #define __MMX__ 1
 // CHECK_ICX_M64: #define __MPX__ 1
Index: test/Driver/x86-target-features.c

[PATCH] D47182: [X86] Move all Intel defined intrinsic includes into immintrin.h

2018-05-22 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

What does "imm" mean anyways?


Repository:
  rC Clang

https://reviews.llvm.org/D47182



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-05-22 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D46541#1106743, @craig.topper wrote:

> I think you can pass StringRef(F).substr(1). That won't create a temporary 
> string. It will just create a StringRef pointing into the middle of an 
> existing std::string stored in the parsed attributes.


Yes, that seems to work.


https://reviews.llvm.org/D46541



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-05-22 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 147984.

https://reviews.llvm.org/D46541

Files:
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenModule.h
  test/CodeGen/target-features-error-2.c
  test/CodeGen/target-features-error.c

Index: test/CodeGen/target-features-error.c
===
--- test/CodeGen/target-features-error.c
+++ test/CodeGen/target-features-error.c
@@ -3,6 +3,5 @@
   return a + 4;
 }
 int bar() {
-  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'sse4.2', but would be inlined into function 'bar' that is compiled without support for 'sse4.2'}}
+  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'avx', but would be inlined into function 'bar' that is compiled without support for 'avx'}}
 }
-
Index: test/CodeGen/target-features-error-2.c
===
--- test/CodeGen/target-features-error-2.c
+++ test/CodeGen/target-features-error-2.c
@@ -1,38 +1,67 @@
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_SSE42
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_1
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_5
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX512f
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -target-feature +movdir64b -S -verify -o - -D NEED_MOVDIRI
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -target-feature +avx512vnni -target-feature +movdiri -S -verify -o - -D NEED_CLWB
 
 #define __MM_MALLOC_H
 #include 
 
-#if NEED_SSE42
+#if NEED_AVX_1
 int baz(__m256i a) {
-  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'sse4.2', but would be inlined into function 'baz' that is compiled without support for 'sse4.2'}}
+  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'avx', but would be inlined into function 'baz' that is compiled without support for 'avx'}}
 }
 #endif
 
-#if NEED_AVX_1
+#if NEED_AVX_2
 __m128 need_avx(__m128 a, __m128 b) {
   return _mm_cmp_ps(a, b, 0); // expected-error {{'__builtin_ia32_cmpps' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_2
+#if NEED_AVX_3
 __m128 need_avx(__m128 a, __m128 b) {
   return _mm_cmp_ss(a, b, 0); // expected-error {{'__builtin_ia32_cmpss' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_3
+#if NEED_AVX_4
 __m128d need_avx(__m128d a, __m128d b) {
   return _mm_cmp_pd(a, b, 0); // expected-error {{'__builtin_ia32_cmppd' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_4
+#if NEED_AVX_5
 __m128d need_avx(__m128d a, __m128d b) {
   return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs target feature avx}}
 }
 #endif
+
+#if NEED_AVX512f
+unsigned short need_avx512f(unsigned short a, unsigned short b) {
+  return __builtin_ia32_korhi(a, b); // expected-error {{'__builtin_ia32_korhi' needs target feature avx512f}}
+}
+#endif
+
+#if NEED_MOVDIRI
+void need_movdiri(unsigned int *a, unsigned int b) {
+  __builtin_ia32_directstore_u32(a, b); // expected-error {{'__builtin_ia32_directstore_u32' needs target feature movdiri}}
+}
+#endif
+
+#if NEED_CLWB
+static __inline__ void
+ __attribute__((__always_inline__, __nodebug__,  __target__("avx512vnni,clwb,movdiri,movdir64b")))
+ func(unsigned int *a, unsigned int b)
+{
+  __builtin_ia32_directstore_u32(a, b);
+}
+
+void need_clwb(unsigned int *a, unsigned int b) {
+  func(a, b); // expected-error {{always_inline function 'func' requires target feature 'clwb', but would be inlined into function 'need_clwb' that is compiled without support for 'clwb'}}
+
+}
+#endif
Index: lib/CodeGen/CodeGenModule.h
===
--- lib/CodeGen/CodeGenModule.h
+++ lib/CodeGen/CodeGenModule.h
@@ -1089,6 +1089,10 @@
   /// It's up to you to ensure that this is safe.
   void AddDefaultFnAttrs(llvm::Function &F);
 
+  /// Parses the target attributes passed in, and returns only the ones that are
+  /// valid feature names.
+  TargetAttr::ParsedTargetAttr filterFunctionTargetAttrs(const TargetAttr *TD);
+
   // Fills in the supplied string map with the set of target features for the
   // passed in function.
   void getFunctionFeatureMap(llvm::StringMap &FeatureMap,
Index: lib/CodeGen/CodeGenModule.cpp
===
--- lib/CodeGen/CodeGenModule.cpp
+++ lib/CodeGen/CodeGenModule.cpp
@@ -5032,22 +5032,28 @@
   }
 }
 
+TargetAttr::ParsedTargetAttr CodeGenModule::filterFunct

[PATCH] D46052: GNUstep Objective-C ABI version 2

2018-05-22 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/CodeGen/CGObjCGNU.cpp:1056
+char c = Str[i];
+if (isalpha(c) || isnumber(c))
+  StringName += c;

GBuella wrote:
> theraven wrote:
> > Ka-Ka wrote:
> > > The isnumber() function was added to cctype.h by Apple. I don't think it 
> > > can be used in llvm.
> > > 
> > > According to
> > > https://stackoverflow.com/questions/39204080/what-is-the-difference-between-isdigit-and-isnumber
> > > 
> > Ah, isnumber is from 4.4BSD, I assumed it worked everywhere.  Changing it 
> > to isdigit is fine.
> BTW `isalnum` is in ISO since 1989.
Also, you might want to `isalnum((unsigned char)c)`, as passing a negative 
value to these functions is UB.


Repository:
  rC Clang

https://reviews.llvm.org/D46052



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46052: GNUstep Objective-C ABI version 2

2018-05-22 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/CodeGen/CGObjCGNU.cpp:1056
+char c = Str[i];
+if (isalpha(c) || isnumber(c))
+  StringName += c;

theraven wrote:
> Ka-Ka wrote:
> > The isnumber() function was added to cctype.h by Apple. I don't think it 
> > can be used in llvm.
> > 
> > According to
> > https://stackoverflow.com/questions/39204080/what-is-the-difference-between-isdigit-and-isnumber
> > 
> Ah, isnumber is from 4.4BSD, I assumed it worked everywhere.  Changing it to 
> isdigit is fine.
BTW `isalnum` is in ISO since 1989.


Repository:
  rC Clang

https://reviews.llvm.org/D46052



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47142: [x86] invpcid intrinsic

2018-05-21 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/Headers/intrin.h:196
+ */
 void __cdecl _invpcid(unsigned int, void *);
+#endif

rnk wrote:
> GBuella wrote:
> > GBuella wrote:
> > > rnk wrote:
> > > > craig.topper wrote:
> > > > > @rnk, what's the right thing to do here?
> > > > What problems does this redeclaration cause?
> > > Now that I think about it, none.
> > Also, I think this could be added as TARGET_HEADER_BUILTIN..."intrin.h", 
> > ALL_MS_LANGUAGES into BuiltinsX86.def
> > And removed from here.
> > Right?
> It could be, but then you'd need to implement it as a builtin instead of 
> having invpcidintrin.h, right?
Meanwhile, I just realized, we add the `!defined(_MSC_VER)` condition around 
the inclusion of all such header files, as I did with `invpcidintrin.h`.
Does clang always define _MSC_VER on Windows? I mean, always in cases where one 
would want to include the MSVC likes`intrin.h`?
In that case, these are completely independent things. Also, I could just add 
both builtins `_invpcid` with the ALL_MS_LANGUAGES thing, and the 
`__builtin_ia32_invpcid` with the usual clang/GCC convention of adding separate 
feature flags (both would be lowered to the same LLVM intrinsic).
BTW, if we stop adding feature flags for such intrinsics, we probably have to 
convince GCC to do the same, as we love that compatibility.


Repository:
  rC Clang

https://reviews.llvm.org/D47142



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47142: [x86] invpcid intrinsic

2018-05-21 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/Headers/intrin.h:196
+ */
 void __cdecl _invpcid(unsigned int, void *);
+#endif

GBuella wrote:
> rnk wrote:
> > craig.topper wrote:
> > > @rnk, what's the right thing to do here?
> > What problems does this redeclaration cause?
> Now that I think about it, none.
Also, I think this could be added as TARGET_HEADER_BUILTIN..."intrin.h", 
ALL_MS_LANGUAGES into BuiltinsX86.def
And removed from here.
Right?


Repository:
  rC Clang

https://reviews.llvm.org/D47142



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47142: [x86] invpcid intrinsic

2018-05-21 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/Headers/intrin.h:196
+ */
 void __cdecl _invpcid(unsigned int, void *);
+#endif

rnk wrote:
> craig.topper wrote:
> > @rnk, what's the right thing to do here?
> What problems does this redeclaration cause?
Now that I think about it, none.


Repository:
  rC Clang

https://reviews.llvm.org/D47142



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47142: [x86] invpcid intrinsic

2018-05-21 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D47142#1106664, @rnk wrote:

> Why do we need a feature flag for this in the first place? The MSVC model for 
> most "instruction" intrinsics is that the exact instruction is emitted 
> regardless of the feature enabled. The target attribute seems like it would 
> get in the way of that.

I'm fine with getting rid of the feature flag, here, and then probably also for 
other instructions in similar situations (pconfig, wbnoinvd, clwb, etc...).
But as long as we keep those other feature flags, I guess we should just be 
consistent.

Repository:
  rC Clang

https://reviews.llvm.org/D47142

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47142: [x86] invpcid intrinsic

2018-05-21 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added a reviewer: craig.topper.
Herald added subscribers: cfe-commits, mgorny.

An intrinsic for an old instruction, as described in the Intel SDM.


Repository:
  rC Clang

https://reviews.llvm.org/D47142

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/intrin.h
  lib/Headers/invpcidintrin.h
  lib/Headers/module.modulemap
  lib/Headers/x86intrin.h
  test/CodeGen/invpcid.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -526,6 +526,7 @@
 // CHECK_CORE_AVX2_M32: #define __BMI__ 1
 // CHECK_CORE_AVX2_M32: #define __F16C__ 1
 // CHECK_CORE_AVX2_M32: #define __FMA__ 1
+// CHECK_CORE_AVX2_M32: #define __INVPCID__ 1
 // CHECK_CORE_AVX2_M32: #define __LZCNT__ 1
 // CHECK_CORE_AVX2_M32: #define __MMX__ 1
 // CHECK_CORE_AVX2_M32: #define __PCLMUL__ 1
@@ -556,6 +557,7 @@
 // CHECK_CORE_AVX2_M64: #define __BMI__ 1
 // CHECK_CORE_AVX2_M64: #define __F16C__ 1
 // CHECK_CORE_AVX2_M64: #define __FMA__ 1
+// CHECK_CORE_AVX2_M64: #define __INVPCID__ 1
 // CHECK_CORE_AVX2_M64: #define __LZCNT__ 1
 // CHECK_CORE_AVX2_M64: #define __MMX__ 1
 // CHECK_CORE_AVX2_M64: #define __PCLMUL__ 1
@@ -590,6 +592,7 @@
 // CHECK_BROADWELL_M32: #define __BMI__ 1
 // CHECK_BROADWELL_M32: #define __F16C__ 1
 // CHECK_BROADWELL_M32: #define __FMA__ 1
+// CHECK_BROADWELL_M32: #define __INVPCID__ 1
 // CHECK_BROADWELL_M32: #define __LZCNT__ 1
 // CHECK_BROADWELL_M32: #define __MMX__ 1
 // CHECK_BROADWELL_M32: #define __PCLMUL__ 1
@@ -623,6 +626,7 @@
 // CHECK_BROADWELL_M64: #define __BMI__ 1
 // CHECK_BROADWELL_M64: #define __F16C__ 1
 // CHECK_BROADWELL_M64: #define __FMA__ 1
+// CHECK_BROADWELL_M64: #define __INVPCID__ 1
 // CHECK_BROADWELL_M64: #define __LZCNT__ 1
 // CHECK_BROADWELL_M64: #define __MMX__ 1
 // CHECK_BROADWELL_M64: #define __PCLMUL__ 1
@@ -660,6 +664,7 @@
 // CHECK_SKL_M32: #define __CLFLUSHOPT__ 1
 // CHECK_SKL_M32: #define __F16C__ 1
 // CHECK_SKL_M32: #define __FMA__ 1
+// CHECK_SKL_M32: #define __INVPCID__ 1
 // CHECK_SKL_M32: #define __LZCNT__ 1
 // CHECK_SKL_M32: #define __MMX__ 1
 // CHECK_SKL_M32: #define __MPX__ 1
@@ -694,6 +699,7 @@
 // CHECK_SKL_M64: #define __CLFLUSHOPT__ 1
 // CHECK_SKL_M64: #define __F16C__ 1
 // CHECK_SKL_M64: #define __FMA__ 1
+// CHECK_SKL_M64: #define __INVPCID__ 1
 // CHECK_SKL_M64: #define __LZCNT__ 1
 // CHECK_SKL_M64: #define __MMX__ 1
 // CHECK_SKL_M64: #define __MPX__ 1
@@ -888,6 +894,7 @@
 // CHECK_SKX_M32: #define __CLWB__ 1
 // CHECK_SKX_M32: #define __F16C__ 1
 // CHECK_SKX_M32: #define __FMA__ 1
+// CHECK_SKX_M32: #define __INVPCID__ 1
 // CHECK_SKX_M32: #define __LZCNT__ 1
 // CHECK_SKX_M32: #define __MMX__ 1
 // CHECK_SKX_M32: #define __MPX__ 1
@@ -933,6 +940,7 @@
 // CHECK_SKX_M64: #define __CLWB__ 1
 // CHECK_SKX_M64: #define __F16C__ 1
 // CHECK_SKX_M64: #define __FMA__ 1
+// CHECK_SKX_M64: #define __INVPCID__ 1
 // CHECK_SKX_M64: #define __LZCNT__ 1
 // CHECK_SKX_M64: #define __MMX__ 1
 // CHECK_SKX_M64: #define __MPX__ 1
@@ -983,6 +991,7 @@
 // CHECK_CNL_M32-NOT: #define __CLWB__ 1
 // CHECK_CNL_M32: #define __F16C__ 1
 // CHECK_CNL_M32: #define __FMA__ 1
+// CHECK_CNL_M32: #define __INVPCID__ 1
 // CHECK_CNL_M32: #define __LZCNT__ 1
 // CHECK_CNL_M32: #define __MMX__ 1
 // CHECK_CNL_M32: #define __MPX__ 1
@@ -1031,6 +1040,7 @@
 // CHECK_CNL_M64-NOT: #define __CLWB__ 1
 // CHECK_CNL_M64: #define __F16C__ 1
 // CHECK_CNL_M64: #define __FMA__ 1
+// CHECK_CNL_M64: #define __INVPCID__ 1
 // CHECK_CNL_M64: #define __LZCNT__ 1
 // CHECK_CNL_M64: #define __MMX__ 1
 // CHECK_CNL_M64: #define __MPX__ 1
@@ -1085,6 +1095,7 @@
 // CHECK_ICL_M32: #define __F16C__ 1
 // CHECK_ICL_M32: #define __FMA__ 1
 // CHECK_ICL_M32: #define __GFNI__ 1
+// CHECK_ICL_M32: #define __INVPCID__ 1
 // CHECK_ICL_M32: #define __LZCNT__ 1
 // CHECK_ICL_M32: #define __MMX__ 1
 // CHECK_ICL_M32: #define __MPX__ 1
@@ -1142,6 +1153,7 @@
 // CHECK_ICL_M64: #define __F16C__ 1
 // CHECK_ICL_M64: #define __FMA__ 1
 // CHECK_ICL_M64: #define __GFNI__ 1
+// CHECK_ICL_M64: #define __INVPCID__ 1
 // CHECK_ICL_M64: #define __LZCNT__ 1
 // CHECK_ICL_M64: #define __MMX__ 1
 // CHECK_ICL_M64: #define __MPX__ 1
@@ -1200,6 +1212,7 @@
 // CHECK_ICX_M32: #define __F16C__ 1
 // CHECK_ICX_M32: #define __FMA__ 1
 // CHECK_ICX_M32: #define __GFNI__ 1
+// CHECK_ICX_M32: #define __INVPCID__ 1
 // CHECK_ICX_M32: #define __LZCNT__ 1
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
@@ -1258,6 +1271,7 @@
 // CHECK_ICX_M64: #define __F16C__ 1
 // CHECK_ICX_M64: #define __FMA__ 1
 // CHECK_ICX_M64: #define __GFNI__ 1
+// CHECK_ICX_M64: #define __INVPCID__ 1
 // CHECK_ICX_M64: #define __

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-05-21 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 147775.
GBuella added a comment.

Fixed a horrible bug in the patch. Adding a ref to temporary string is not a 
wise thing to do, so I had to remove this line:

  ReqFeatures.push_back(F.substr(1));

Now the ReqFeatures vector can also refer to strings starting with '+'

Also, added a few tests and comments.


https://reviews.llvm.org/D46541

Files:
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenModule.h
  test/CodeGen/target-features-error-2.c
  test/CodeGen/target-features-error.c

Index: test/CodeGen/target-features-error.c
===
--- test/CodeGen/target-features-error.c
+++ test/CodeGen/target-features-error.c
@@ -3,6 +3,5 @@
   return a + 4;
 }
 int bar() {
-  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'sse4.2', but would be inlined into function 'bar' that is compiled without support for 'sse4.2'}}
+  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'avx', but would be inlined into function 'bar' that is compiled without support for 'avx'}}
 }
-
Index: test/CodeGen/target-features-error-2.c
===
--- test/CodeGen/target-features-error-2.c
+++ test/CodeGen/target-features-error-2.c
@@ -1,38 +1,67 @@
-// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_SSE42
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_1
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_5
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX512f
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -target-feature +movdir64b -S -verify -o - -D NEED_MOVDIRI
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -target-feature +avx512vnni -target-feature +movdiri -S -verify -o - -D NEED_CLWB
 
 #define __MM_MALLOC_H
 #include 
 
-#if NEED_SSE42
+#if NEED_AVX_1
 int baz(__m256i a) {
-  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'sse4.2', but would be inlined into function 'baz' that is compiled without support for 'sse4.2'}}
+  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'avx', but would be inlined into function 'baz' that is compiled without support for 'avx'}}
 }
 #endif
 
-#if NEED_AVX_1
+#if NEED_AVX_2
 __m128 need_avx(__m128 a, __m128 b) {
   return _mm_cmp_ps(a, b, 0); // expected-error {{'__builtin_ia32_cmpps' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_2
+#if NEED_AVX_3
 __m128 need_avx(__m128 a, __m128 b) {
   return _mm_cmp_ss(a, b, 0); // expected-error {{'__builtin_ia32_cmpss' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_3
+#if NEED_AVX_4
 __m128d need_avx(__m128d a, __m128d b) {
   return _mm_cmp_pd(a, b, 0); // expected-error {{'__builtin_ia32_cmppd' needs target feature avx}}
 }
 #endif
 
-#if NEED_AVX_4
+#if NEED_AVX_5
 __m128d need_avx(__m128d a, __m128d b) {
   return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs target feature avx}}
 }
 #endif
+
+#if NEED_AVX512f
+unsigned short need_avx512f(unsigned short a, unsigned short b) {
+  return __builtin_ia32_korhi(a, b); // expected-error {{'__builtin_ia32_korhi' needs target feature avx512f}}
+}
+#endif
+
+#if NEED_MOVDIRI
+void need_movdiri(unsigned int *a, unsigned int b) {
+  __builtin_ia32_directstore_u32(a, b); // expected-error {{'__builtin_ia32_directstore_u32' needs target feature movdiri}}
+}
+#endif
+
+#if NEED_CLWB
+static __inline__ void
+ __attribute__((__always_inline__, __nodebug__,  __target__("avx512vnni,clwb,movdiri,movdir64b")))
+ func(unsigned int *a, unsigned int b)
+{
+  __builtin_ia32_directstore_u32(a, b);
+}
+
+void need_clwb(unsigned int *a, unsigned int b) {
+  func(a, b); // expected-error {{always_inline function 'func' requires target feature 'clwb', but would be inlined into function 'need_clwb' that is compiled without support for 'clwb'}}
+
+}
+#endif
Index: lib/CodeGen/CodeGenModule.h
===
--- lib/CodeGen/CodeGenModule.h
+++ lib/CodeGen/CodeGenModule.h
@@ -1089,6 +1089,10 @@
   /// It's up to you to ensure that this is safe.
   void AddDefaultFnAttrs(llvm::Function &F);
 
+  /// Parses the target attributes passed in, and returns only the ones that are
+  /// valid feature names.
+  TargetAttr::ParsedTargetAttr filterFunctionTargetAttrs(const TargetAttr *TD);
+
   // Fills in the supplied string map with the set of target features for the
   // passed in function.
   void getFunction

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-05-21 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

Ping @efriedma


Repository:
  rC Clang

https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47125: [X86] Remove masking from pternlog llvm intrinsics and use a select instruction instead.

2018-05-21 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

So do these really need to be implemented as macros?


Repository:
  rC Clang

https://reviews.llvm.org/D47125



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46742: [X86] Use __builtin_convertvector to replace some of the avx512 truncate builtins.

2018-05-14 Thread Gabor Buella via Phabricator via cfe-commits

GBuella accepted this revision.
GBuella added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rC Clang

https://reviews.llvm.org/D46742



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-05-14 Thread Gabor Buella via Phabricator via cfe-commits

GBuella requested review of this revision.
GBuella added a comment.

Oops, accepted this by accident.


Repository:
  rC Clang

https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-05-14 Thread Gabor Buella via Phabricator via cfe-commits

GBuella accepted this revision.
GBuella added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rC Clang

https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46683: [X86] Assume alignment of movdir64b dst argument

2018-05-11 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL332091: [X86] Assume alignment of movdir64b dst argument 
(authored by GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Repository:
  rL LLVM

https://reviews.llvm.org/D46683

Files:
  cfe/trunk/lib/Headers/movdirintrin.h
  cfe/trunk/test/CodeGen/builtin-movdir.c


Index: cfe/trunk/test/CodeGen/builtin-movdir.c
===
--- cfe/trunk/test/CodeGen/builtin-movdir.c
+++ cfe/trunk/test/CodeGen/builtin-movdir.c
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple 
x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s 
-emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
-// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown 
-target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | 
FileCheck %s --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown 
-target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | 
FileCheck %s --check-prefix=X86 --check-prefix=CHECK
 
 #include 
 #include 
@@ -22,6 +22,11 @@
 
 void test_dir64b(void *dst, const void *src) {
   // CHECK-LABEL: test_dir64b
+  // CHECK: [[PTRINT1:%.+]] = ptrtoint
+  // X86: [[MASKEDPTR1:%.+]] = and i32 [[PTRINT1]], 63
+  // X86: [[MASKCOND1:%.+]] = icmp eq i32 [[MASKEDPTR1]], 0
+  // X86_64: [[MASKEDPTR1:%.+]] = and i64 [[PTRINT1]], 63
+  // X86_64: [[MASKCOND1:%.+]] = icmp eq i64 [[MASKEDPTR1]], 0
   // CHECK: call void @llvm.x86.movdir64b
   _movdir64b(dst, src);
 }
Index: cfe/trunk/lib/Headers/movdirintrin.h
===
--- cfe/trunk/lib/Headers/movdirintrin.h
+++ cfe/trunk/lib/Headers/movdirintrin.h
@@ -47,10 +47,15 @@
 
 #endif /* __x86_64__ */
 
-// Move 64 bytes as direct store
+/*
+ * movdir64b - Move 64 bytes as direct store.
+ * The destination must be 64 byte aligned, and the store is atomic.
+ * The source address has no alignment requirement, and the load from
+ * the source address is not atomic.
+ */
 static __inline__ void
 __attribute__((__always_inline__, __nodebug__,  __target__("movdir64b")))
-_movdir64b (void *__dst, const void *__src)
+_movdir64b (void *__dst __attribute__((align_value(64))), const void *__src)
 {
   __builtin_ia32_movdir64b(__dst, __src);
 }


Index: cfe/trunk/test/CodeGen/builtin-movdir.c
===
--- cfe/trunk/test/CodeGen/builtin-movdir.c
+++ cfe/trunk/test/CodeGen/builtin-movdir.c
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
-// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86 --check-prefix=CHECK
 
 #include 
 #include 
@@ -22,6 +22,11 @@
 
 void test_dir64b(void *dst, const void *src) {
   // CHECK-LABEL: test_dir64b
+  // CHECK: [[PTRINT1:%.+]] = ptrtoint
+  // X86: [[MASKEDPTR1:%.+]] = and i32 [[PTRINT1]], 63
+  // X86: [[MASKCOND1:%.+]] = icmp eq i32 [[MASKEDPTR1]], 0
+  // X86_64: [[MASKEDPTR1:%.+]] = and i64 [[PTRINT1]], 63
+  // X86_64: [[MASKCOND1:%.+]] = icmp eq i64 [[MASKEDPTR1]], 0
   // CHECK: call void @llvm.x86.movdir64b
   _movdir64b(dst, src);
 }
Index: cfe/trunk/lib/Headers/movdirintrin.h
===
--- cfe/trunk/lib/Headers/movdirintrin.h
+++ cfe/trunk/lib/Headers/movdirintrin.h
@@ -47,10 +47,15 @@
 
 #endif /* __x86_64__ */
 
-// Move 64 bytes as direct store
+/*
+ * movdir64b - Move 64 bytes as direct store.
+ * The destination must be 64 byte aligned, and the store is atomic.
+ * The source address has no alignment requirement, and the load from
+ * the source address is not atomic.
+ */
 static __inline__ void
 __attribute__((__always_inline__, __nodebug__,  __target__("movdir64b")))
-_movdir64b (void *__dst, const void *__src)
+_movdir64b (void *__dst __attribute__((align_value(64))), const void *__src)
 {
   __builtin_ia32_movdir64b(__dst, __src);
 }
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46742: [X86] Use __builtin_convertvector to replace some of the avx512 truncate builtins.

2018-05-11 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D46742#1095658, @tkrupa wrote:

> There are four other similar intrinsics which convert to 128/256-bit vectors:
>
> __m128i _mm256_cvtepi32_epi8 (__m256i a)
>  __m128i _mm256_cvtepi64_epi16 (__m256i a)
>  __m128i _mm256_cvtepi64_epi8 (__m256i a)
>  __m128i _mm512_cvtepi64_epi8 (__m512i a)
>
> Can you also include them?


Probably these should be possible, but e.g. with the _mm256_cvtepi32_epi8 case, 
I can only get this far:

  vpmovdw %ymm0, %xmm0
  vpshufb .LCPI2_0(%rip), %xmm0, %xmm0 # xmm0 = 
xmm0[0,2,4,6,8,10,12,14],zero,zero,zero,zero,zero,zero,zero,zero
  vzeroupper
  retq

While the expected result is a `vpmovdb` instruction, without the extra 
shuffling.


Repository:
  rC Clang

https://reviews.llvm.org/D46742



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46683: [X86] Assume alignment of movdir64b dst argument

2018-05-10 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D46683#1094662, @craig.topper wrote:

> What effect does this have?


Nothing important really, I just guessed it doesn't cost.
One contrived example I could come up with in 2 minutes:

  #include 
  
  void x(char *restrict a __attribute__((align_value(64))), char *restrict b, 
const char *c)
  {
  _movdir64b(b, c);
  
  for (int i = 0; i < 32; ++i)
  a[i] = b[i];
  }

The silly memcpy loop above becomes 2 `movaps` instructions, instead of two 
`movups` instructions.


Repository:
  rC Clang

https://reviews.llvm.org/D46683



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46683: [X86] Assume alignment of movdir64b dst argument

2018-05-10 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added a reviewer: craig.topper.
Herald added a subscriber: cfe-commits.

Repository:
  rC Clang

https://reviews.llvm.org/D46683

Files:
  lib/Headers/movdirintrin.h
  test/CodeGen/builtin-movdir.c


Index: test/CodeGen/builtin-movdir.c
===
--- test/CodeGen/builtin-movdir.c
+++ test/CodeGen/builtin-movdir.c
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple 
x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s 
-emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
-// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown 
-target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | 
FileCheck %s --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown 
-target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | 
FileCheck %s --check-prefix=X86 --check-prefix=CHECK
 
 #include 
 #include 
@@ -22,6 +22,11 @@
 
 void test_dir64b(void *dst, const void *src) {
   // CHECK-LABEL: test_dir64b
+  // CHECK: [[PTRINT1:%.+]] = ptrtoint
+  // X86: [[MASKEDPTR1:%.+]] = and i32 [[PTRINT1]], 63
+  // X86: [[MASKCOND1:%.+]] = icmp eq i32 [[MASKEDPTR1]], 0
+  // X86_64: [[MASKEDPTR1:%.+]] = and i64 [[PTRINT1]], 63
+  // X86_64: [[MASKCOND1:%.+]] = icmp eq i64 [[MASKEDPTR1]], 0
   // CHECK: call void @llvm.x86.movdir64b
   _movdir64b(dst, src);
 }
Index: lib/Headers/movdirintrin.h
===
--- lib/Headers/movdirintrin.h
+++ lib/Headers/movdirintrin.h
@@ -47,10 +47,15 @@
 
 #endif /* __x86_64__ */
 
-// Move 64 bytes as direct store
+/*
+ * movdir64b - Move 64 bytes as direct store.
+ * The destination must be 64 byte aligned, and the store is atomic.
+ * The source address has no alignment requirement, and the load from
+ * the source address is not atomic.
+ */
 static __inline__ void
 __attribute__((__always_inline__, __nodebug__,  __target__("movdir64b")))
-_movdir64b (void *__dst, const void *__src)
+_movdir64b (void *__dst __attribute__((align_value(64))), const void *__src)
 {
   __builtin_ia32_movdir64b(__dst, __src);
 }


Index: test/CodeGen/builtin-movdir.c
===
--- test/CodeGen/builtin-movdir.c
+++ test/CodeGen/builtin-movdir.c
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
-// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86 --check-prefix=CHECK
 
 #include 
 #include 
@@ -22,6 +22,11 @@
 
 void test_dir64b(void *dst, const void *src) {
   // CHECK-LABEL: test_dir64b
+  // CHECK: [[PTRINT1:%.+]] = ptrtoint
+  // X86: [[MASKEDPTR1:%.+]] = and i32 [[PTRINT1]], 63
+  // X86: [[MASKCOND1:%.+]] = icmp eq i32 [[MASKEDPTR1]], 0
+  // X86_64: [[MASKEDPTR1:%.+]] = and i64 [[PTRINT1]], 63
+  // X86_64: [[MASKCOND1:%.+]] = icmp eq i64 [[MASKEDPTR1]], 0
   // CHECK: call void @llvm.x86.movdir64b
   _movdir64b(dst, src);
 }
Index: lib/Headers/movdirintrin.h
===
--- lib/Headers/movdirintrin.h
+++ lib/Headers/movdirintrin.h
@@ -47,10 +47,15 @@
 
 #endif /* __x86_64__ */
 
-// Move 64 bytes as direct store
+/*
+ * movdir64b - Move 64 bytes as direct store.
+ * The destination must be 64 byte aligned, and the store is atomic.
+ * The source address has no alignment requirement, and the load from
+ * the source address is not atomic.
+ */
 static __inline__ void
 __attribute__((__always_inline__, __nodebug__,  __target__("movdir64b")))
-_movdir64b (void *__dst, const void *__src)
+_movdir64b (void *__dst __attribute__((align_value(64))), const void *__src)
 {
   __builtin_ia32_movdir64b(__dst, __src);
 }
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-05-10 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D45616#1093514, @efriedma wrote:

> There is no difference between "signalling" and "non-signalling" unless 
> you're using "#pragma STDC FENV_ACCESS", which is currently not supported.  
> Presumably the work to implement that will include some LLVM IR intrinsic 
> which can encode the difference, but for now we can ignore it.

Does that mean, it is OK to generate the `vcmpltpd` instruction for both of 
these intrinsic calls:

  _mm_cmp_ps_mask(a, b, _CMP_EQ_OQ);
  _mm_cmp_ps_mask(a, b, _CMP_EQ_OS);

?
In that case we can lower both of these to `fcmp olt`.
I'm still not sure, if this is what a user would expect...

Repository:
  rC Clang

https://reviews.llvm.org/D45616

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46540: [X86] ptwrite intrinsic

2018-05-10 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rC331962: [X86] ptwrite intrinsic (authored by GBuella, 
committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D46540?vs=145681&id=146081#toc

Repository:
  rC Clang

https://reviews.llvm.org/D46540

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Basic/BuiltinsX86_64.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/ptwriteintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/ptwrite.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: lib/Basic/Targets/X86.h
===
--- lib/Basic/Targets/X86.h
+++ lib/Basic/Targets/X86.h
@@ -106,6 +106,7 @@
   bool HasWAITPKG = false;
   bool HasMOVDIRI = false;
   bool HasMOVDIR64B = false;
+  bool HasPTWRITE = false;
 
 protected:
   /// Enumeration of all of the X86 CPUs supported by Clang.
Index: lib/Basic/Targets/X86.cpp
===
--- lib/Basic/Targets/X86.cpp
+++ lib/Basic/Targets/X86.cpp
@@ -253,6 +253,7 @@
 setFeatureEnabledImpl(Features, "waitpkg", true);
 LLVM_FALLTHROUGH;
   case CK_GoldmontPlus:
+setFeatureEnabledImpl(Features, "ptwrite", true);
 setFeatureEnabledImpl(Features, "rdpid", true);
 setFeatureEnabledImpl(Features, "sgx", true);
 LLVM_FALLTHROUGH;
@@ -830,6 +831,8 @@
   HasMOVDIR64B = true;
 } else if (Feature == "+pconfig") {
   HasPCONFIG = true;
+} else if (Feature == "+ptwrite") {
+  HasPTWRITE = true;
 }
 
 X86SSEEnum Level = llvm::StringSwitch(Feature)
@@ -1192,6 +1195,8 @@
 Builder.defineMacro("__MOVDIR64B__");
   if (HasPCONFIG)
 Builder.defineMacro("__PCONFIG__");
+  if (HasPTWRITE)
+Builder.defineMacro("__PTWRITE__");
 
   // Each case falls through to the previous one here.
   switch (SSELevel) {
@@ -1326,6 +1331,7 @@
   .Case("popcnt", true)
   .Case("prefetchwt1", true)
   .Case("prfchw", true)
+  .Case("ptwrite", true)
   .Case("rdpid", true)
   .Case("rdrnd", true)
   .Case("rdseed", true)
@@ -1405,6 +1411,7 @@
   .Case("popcnt", HasPOPCNT)
   .Case("prefetchwt1", HasPREFETCHWT1)
   .Case("prfchw", HasPRFCHW)
+  .Case("ptwrite", HasPTWRITE)
   .Case("rdpid", HasRDPID)
   .Case("rdrnd", HasRDRND)
   .Case("rdseed", HasRDSEED)
Index: lib/Headers/cpuid.h
===
--- lib/Headers/cpuid.h
+++ lib/Headers/cpuid.h
@@ -202,6 +202,9 @@
 #define bit_XSAVEC  0x0002
 #define bit_XSAVES  0x0008
 
+/* Features in %eax for leaf 0x14 sub-leaf 0 */
+#define bit_PTWRITE 0x0010
+
 /* Features in %ecx for leaf 0x8001 */
 #define bit_LAHF_LM 0x0001
 #define bit_ABM 0x0020
Index: lib/Headers/module.modulemap
===
--- lib/Headers/module.modulemap
+++ lib/Headers/module.modulemap
@@ -69,6 +69,7 @@
 textual header "movdirintrin.h"
 textual header "pconfigintrin.h"
 textual header "sgxintrin.h"
+textual header "ptwriteintrin.h"
 
 explicit module mm_malloc {
   requires !freestanding
Index: lib/Headers/ptwriteintrin.h
===
--- lib/Headers/ptwriteintrin.h
+++ lib/Headers/ptwriteintrin.h
@@ -0,0 +1,51 @@
+/*=== ptwriteintrin.h - PTWRITE intrinsic ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef __PTWRITEINTRIN_H
+#define __PTWRITEINTRIN_H
+
+/* Define

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-05-09 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D45616#1067492, @efriedma wrote:

> > The fcmp opcode has no defined behavior with NaN operands in the 
> > comparisions handled in this patch.
>
> Could you describe the problem here in a bit more detail?  As far as I know, 
> the LLVM IR fcmp should return the same result as the x86 CMPPD, even without 
> fast-math.

So, I'm still looking into this.
What I see is, yes, fcmp just so happens to work the same as x86 CMPPD.
An example:

  fcmp olt <2 x double> %x, %y

becomes vcmpltpd.

But this only holds for condition codes 0 - 7.

Where LLVM IR has a condition "olt" <- ordered less-than, x86 cmppd has two 
corresponding condition codes: 0x01->"less-than (ordered, signaling)", which is 
"vcmpltpd" and 0x11->"less-than (ordered, nonsignaling)" which is  "vcmplt_oqps"

Now, if the builtin's CC argument is 1 (which refers to vcmpltps), we lower it 
to "fcmp olt", which then results in "vcmpltps", we are ok, yes.
But in the IR, there is no information about the user expecting "vcmpltps" vs 
"vcmplt_oqps".

Do I understand these tricks right?
If we are ok with this (hard to understand) approach, I can just lower these 
without fast-math as well, as long as CC < 8, by modifying this condition:

  if (CC < 8 && !UsesNonDefaultRounding && getLangOpts().FastMath) {

Although, I'm still looked at what happens with sNaN, and with qNaN constants, 
once these comparisons are lowered to fcmp.

Repository:
  rC Clang

https://reviews.llvm.org/D45616

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46540: [X86] ptwrite intrinsic

2018-05-09 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D46540#1091625, @Hahnfeld wrote:

> Could you maybe add some short summaries to your patches? It's hard for 
> non-Intel employees to guess what all these instructions do...

Well, I was thinking I could copy-paste this from 
https://software.intel.com/en-us/articles/intel-sdm :
"This instruction reads data in the source operand and sends it to the Intel 
Processor Trace hardware to be encoded
in a PTW packet if TriggerEn, ContextEn, FilterEn, and PTWEn are all set to 1. 
For more details on these values, see
Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C, 
Section 35.2.2, “Software Trace
Instrumentation with PTWRITE”."

Do you think this would really help anyone? It appears to be just meaningless 
without larger context.
Those who ever need this, need to read a lot of these manuals anyways, I think 
noone in practice is going to be enlightened by such a short description.

That of course makes a lot more sense with simpler instructions, e.g. movdir64b 
- I can just describe that as something like "atomically moving 64 bytes".

https://reviews.llvm.org/D46540

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-05-09 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 145876.

https://reviews.llvm.org/D46541

Files:
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenModule.h
  test/CodeGen/target-features-error-2.c
  test/CodeGen/target-features-error.c

Index: test/CodeGen/target-features-error.c
===
--- test/CodeGen/target-features-error.c
+++ test/CodeGen/target-features-error.c
@@ -3,6 +3,5 @@
   return a + 4;
 }
 int bar() {
-  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'sse4.2', but would be inlined into function 'bar' that is compiled without support for 'sse4.2'}}
+  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'avx', but would be inlined into function 'bar' that is compiled without support for 'avx'}}
 }
-
Index: test/CodeGen/target-features-error-2.c
===
--- test/CodeGen/target-features-error-2.c
+++ test/CodeGen/target-features-error-2.c
@@ -3,13 +3,14 @@
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX512f
 
 #define __MM_MALLOC_H
 #include 
 
 #if NEED_SSE42
 int baz(__m256i a) {
-  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'sse4.2', but would be inlined into function 'baz' that is compiled without support for 'sse4.2'}}
+  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'avx', but would be inlined into function 'baz' that is compiled without support for 'avx'}}
 }
 #endif
 
@@ -36,3 +37,9 @@
   return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs target feature avx}}
 }
 #endif
+
+#if NEED_AVX512f
+unsigned short need_avx512f(unsigned short a, unsigned short b) {
+  return __builtin_ia32_korhi(a, b); // expected-error {{'__builtin_ia32_korhi' needs target feature avx512f}}
+}
+#endif
Index: lib/CodeGen/CodeGenModule.h
===
--- lib/CodeGen/CodeGenModule.h
+++ lib/CodeGen/CodeGenModule.h
@@ -1082,6 +1082,8 @@
   /// It's up to you to ensure that this is safe.
   void AddDefaultFnAttrs(llvm::Function &F);
 
+  TargetAttr::ParsedTargetAttr filterFunctionTargetAttrs(const TargetAttr *TD);
+
   // Fills in the supplied string map with the set of target features for the
   // passed in function.
   void getFunctionFeatureMap(llvm::StringMap &FeatureMap,
Index: lib/CodeGen/CodeGenModule.cpp
===
--- lib/CodeGen/CodeGenModule.cpp
+++ lib/CodeGen/CodeGenModule.cpp
@@ -4995,22 +4995,28 @@
   }
 }
 
+TargetAttr::ParsedTargetAttr CodeGenModule::filterFunctionTargetAttrs(const TargetAttr *TD) {
+  assert(TD != nullptr);
+  TargetAttr::ParsedTargetAttr ParsedAttr = TD->parse();
+
+  ParsedAttr.Features.erase(
+  llvm::remove_if(ParsedAttr.Features,
+  [&](const std::string &Feat) {
+return !Target.isValidFeatureName(
+StringRef{Feat}.substr(1));
+  }),
+  ParsedAttr.Features.end());
+  return ParsedAttr;
+}
+
+
 // Fills in the supplied string map with the set of target features for the
 // passed in function.
 void CodeGenModule::getFunctionFeatureMap(llvm::StringMap &FeatureMap,
   const FunctionDecl *FD) {
   StringRef TargetCPU = Target.getTargetOpts().CPU;
   if (const auto *TD = FD->getAttr()) {
-// If we have a TargetAttr build up the feature map based on that.
-TargetAttr::ParsedTargetAttr ParsedAttr = TD->parse();
-
-ParsedAttr.Features.erase(
-llvm::remove_if(ParsedAttr.Features,
-[&](const std::string &Feat) {
-  return !Target.isValidFeatureName(
-  StringRef{Feat}.substr(1));
-}),
-ParsedAttr.Features.end());
+TargetAttr::ParsedTargetAttr ParsedAttr = filterFunctionTargetAttrs(TD);
 
 // Make a copy of the features as passed on the command line into the
 // beginning of the additional features from the function to override.
Index: lib/CodeGen/CodeGenFunction.cpp
===
--- lib/CodeGen/CodeGenFunction.cpp
+++ lib/CodeGen/CodeGenFunction.cpp
@@ -2330,9 +2330,19 @@
 
   } else if (TargetDecl->hasAttr()) {
 // Get the required features for the callee.
+
+const TargetAttr *TD = TargetDecl->getAttr();
+TargetAttr::ParsedTargetAttr ParsedAttr = CGM.filterFunctionTargetAttrs(

[PATCH] D46540: [X86] ptwrite intrinsic

2018-05-08 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 145681.
GBuella added a comment.

Rebased.


https://reviews.llvm.org/D46540

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Basic/BuiltinsX86_64.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/ptwriteintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/ptwrite.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1402,6 +1402,7 @@
 // CHECK_GLMP_M32: #define __PCLMUL__ 1
 // CHECK_GLMP_M32: #define __POPCNT__ 1
 // CHECK_GLMP_M32: #define __PRFCHW__ 1
+// CHECK_GLMP_M32: #define __PTWRITE__ 1
 // CHECK_GLMP_M32: #define __RDPID__ 1
 // CHECK_GLMP_M32: #define __RDRND__ 1
 // CHECK_GLMP_M32: #define __RDSEED__ 1
@@ -1437,6 +1438,7 @@
 // CHECK_GLMP_M64: #define __PCLMUL__ 1
 // CHECK_GLMP_M64: #define __POPCNT__ 1
 // CHECK_GLMP_M64: #define __PRFCHW__ 1
+// CHECK_GLMP_M64: #define __PTWRITE__ 1
 // CHECK_GLMP_M64: #define __RDPID__ 1
 // CHECK_GLMP_M64: #define __RDRND__ 1
 // CHECK_GLMP_M64: #define __RDSEED__ 1
@@ -1474,6 +1476,7 @@
 // CHECK_TRM_M32: #define __PCLMUL__ 1
 // CHECK_TRM_M32: #define __POPCNT__ 1
 // CHECK_TRM_M32: #define __PRFCHW__ 1
+// CHECK_TRM_M32: #define __PTWRITE__ 1
 // CHECK_TRM_M32: #define __RDPID__ 1
 // CHECK_TRM_M32: #define __RDRND__ 1
 // CHECK_TRM_M32: #define __RDSEED__ 1
@@ -1514,6 +1517,7 @@
 // CHECK_TRM_M64: #define __PCLMUL__ 1
 // CHECK_TRM_M64: #define __POPCNT__ 1
 // CHECK_TRM_M64: #define __PRFCHW__ 1
+// CHECK_TRM_M64: #define __PTWRITE__ 1
 // CHECK_TRM_M64: #define __RDPID__ 1
 // CHECK_TRM_M64: #define __RDRND__ 1
 // CHECK_TRM_M64: #define __RDSEED__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -164,3 +164,8 @@
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-pconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PCONFIG %s
 // PCONFIG: "-target-feature" "+pconfig"
 // NO-PCONFIG: "-target-feature" "-pconfig"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mptwrite %s -### -o %t.o 2>&1 | FileCheck -check-prefix=PTWRITE %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-ptwrite %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PTWRITE %s
+// PTWRITE: "-target-feature" "+ptwrite"
+// NO-PTWRITE: "-target-feature" "-ptwrite"
Index: test/CodeGen/ptwrite.c
===
--- /dev/null
+++ test/CodeGen/ptwrite.c
@@ -0,0 +1,22 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -target-feature +ptwrite -emit-llvm -o - -Wall -Werror -pedantic | FileCheck %s --check-prefix=X86 --check-prefix=X86_64
+// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown -target-feature +ptwrite -emit-llvm -o - -Wall -Werror -pedantic | FileCheck %s --check-prefix=X86
+
+#include 
+
+#include 
+
+void test_ptwrite32(uint32_t value) {
+  //X86-LABEL: @test_ptwrite32
+  //X86: call void @llvm.x86.ptwrite32(i32 %{{.*}})
+  _ptwrite32(value);
+}
+
+#ifdef __x86_64__
+
+void test_ptwrite64(uint64_t value) {
+  //X86_64-LABEL: @test_ptwrite64
+  //X86_64: call void @llvm.x86.ptwrite64(i64 %{{.*}})
+  _ptwrite64(value);
+}
+
+#endif /* __x86_64__ */
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -113,4 +113,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__PTWRITE__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/ptwriteintrin.h
===
--- /dev/null
+++ lib/Headers/ptwriteintrin.h
@@ -0,0 +1,51 @@
+/*=== ptwriteintrin.h - PTWRITE intrinsic ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND N

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-05-08 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 145680.

https://reviews.llvm.org/D46541

Files:
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenModule.h
  test/CodeGen/target-features-error-2.c
  test/CodeGen/target-features-error.c

Index: test/CodeGen/target-features-error.c
===
--- test/CodeGen/target-features-error.c
+++ test/CodeGen/target-features-error.c
@@ -3,6 +3,5 @@
   return a + 4;
 }
 int bar() {
-  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'sse4.2', but would be inlined into function 'bar' that is compiled without support for 'sse4.2'}}
+  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'avx', but would be inlined into function 'bar' that is compiled without support for 'avx'}}
 }
-
Index: test/CodeGen/target-features-error-2.c
===
--- test/CodeGen/target-features-error-2.c
+++ test/CodeGen/target-features-error-2.c
@@ -3,13 +3,14 @@
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX512f
 
 #define __MM_MALLOC_H
 #include 
 
 #if NEED_SSE42
 int baz(__m256i a) {
-  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'sse4.2', but would be inlined into function 'baz' that is compiled without support for 'sse4.2'}}
+  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'avx', but would be inlined into function 'baz' that is compiled without support for 'avx'}}
 }
 #endif
 
@@ -36,3 +37,9 @@
   return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs target feature avx}}
 }
 #endif
+
+#if NEED_AVX512f
+unsigned short need_avx512f(unsigned short a, unsigned short b) {
+  return __builtin_ia32_korhi(a, b); // expected-error {{'__builtin_ia32_korhi' needs target feature avx512f}}
+}
+#endif
Index: lib/CodeGen/CodeGenModule.h
===
--- lib/CodeGen/CodeGenModule.h
+++ lib/CodeGen/CodeGenModule.h
@@ -1082,6 +1082,8 @@
   /// It's up to you to ensure that this is safe.
   void AddDefaultFnAttrs(llvm::Function &F);
 
+  TargetAttr::ParsedTargetAttr filterFunctionTargetAttrs(const TargetAttr *TD);
+
   // Fills in the supplied string map with the set of target features for the
   // passed in function.
   void getFunctionFeatureMap(llvm::StringMap &FeatureMap,
Index: lib/CodeGen/CodeGenModule.cpp
===
--- lib/CodeGen/CodeGenModule.cpp
+++ lib/CodeGen/CodeGenModule.cpp
@@ -4995,22 +4995,28 @@
   }
 }
 
+TargetAttr::ParsedTargetAttr CodeGenModule::filterFunctionTargetAttrs(const TargetAttr *TD) {
+  assert(TD != nullptr);
+  TargetAttr::ParsedTargetAttr ParsedAttr = TD->parse();
+
+  ParsedAttr.Features.erase(
+  llvm::remove_if(ParsedAttr.Features,
+  [&](const std::string &Feat) {
+return !Target.isValidFeatureName(
+StringRef{Feat}.substr(1));
+  }),
+  ParsedAttr.Features.end());
+  return ParsedAttr;
+}
+
+
 // Fills in the supplied string map with the set of target features for the
 // passed in function.
 void CodeGenModule::getFunctionFeatureMap(llvm::StringMap &FeatureMap,
   const FunctionDecl *FD) {
   StringRef TargetCPU = Target.getTargetOpts().CPU;
   if (const auto *TD = FD->getAttr()) {
-// If we have a TargetAttr build up the feature map based on that.
-TargetAttr::ParsedTargetAttr ParsedAttr = TD->parse();
-
-ParsedAttr.Features.erase(
-llvm::remove_if(ParsedAttr.Features,
-[&](const std::string &Feat) {
-  return !Target.isValidFeatureName(
-  StringRef{Feat}.substr(1));
-}),
-ParsedAttr.Features.end());
+TargetAttr::ParsedTargetAttr ParsedAttr = filterFunctionTargetAttrs(TD);
 
 // Make a copy of the features as passed on the command line into the
 // beginning of the additional features from the function to override.
Index: lib/CodeGen/CodeGenFunction.cpp
===
--- lib/CodeGen/CodeGenFunction.cpp
+++ lib/CodeGen/CodeGenFunction.cpp
@@ -2330,13 +2330,23 @@
 
   } else if (TargetDecl->hasAttr()) {
 // Get the required features for the callee.
+
+const TargetAttr *TD = TargetDecl->getAttr();
+TargetAttr::ParsedTargetAttr ParsedAttr = CGM.filterFunctionTargetAttrs

[PATCH] D46435: [x86] Introduce the encl[u|s|v] intrinsics

2018-05-08 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rC331743: [x86] Introduce the encl[u|s|v] intrinsics (authored 
by GBuella, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D46435?vs=145205&id=145638#toc

Repository:
  rC Clang

https://reviews.llvm.org/D46435

Files:
  lib/Headers/CMakeLists.txt
  lib/Headers/module.modulemap
  lib/Headers/sgxintrin.h
  lib/Headers/x86intrin.h
  test/Headers/sgxintrin.c

Index: test/Headers/sgxintrin.c
===
--- test/Headers/sgxintrin.c
+++ test/Headers/sgxintrin.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -target-feature +sgx -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386 -target-feature +sgx -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-32
+
+#include 
+#include 
+#include 
+
+uint32_t test_encls(uint32_t leaf, size_t data[3]) {
+// CHECK-64: call { i32, i64, i64, i64 } asm "encls", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i64 %{{.*}}, i64 %{{.*}}, i64 %{{.*}})
+// CHECK-32: call { i32, i32, i32, i32 } asm "encls", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+
+  return _encls_u32(leaf, data);
+}
+
+uint32_t test_enclu(uint32_t leaf, size_t data[3]) {
+// CHECK-64: call { i32, i64, i64, i64 } asm "enclu", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i64 %{{.*}}, i64 %{{.*}}, i64 %{{.*}})
+// CHECK-32: call { i32, i32, i32, i32 } asm "enclu", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+
+  return _enclu_u32(leaf, data);
+}
+
+uint32_t test_enclv(uint32_t leaf, size_t data[3]) {
+// CHECK-64: call { i32, i64, i64, i64 } asm "enclv", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i64 %{{.*}}, i64 %{{.*}}, i64 %{{.*}})
+// CHECK-32: call { i32, i32, i32, i32 } asm "enclv", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+
+  return _enclv_u32(leaf, data);
+}
Index: lib/Headers/sgxintrin.h
===
--- lib/Headers/sgxintrin.h
+++ lib/Headers/sgxintrin.h
@@ -0,0 +1,70 @@
+/*=== sgxintrin.h - X86 SGX intrinsics configuration ---===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef __SGXINTRIN_H
+#define __SGXINTRIN_H
+
+/* Define the default attributes for the functions in this file. */
+#define __DEFAULT_FN_ATTRS \
+  __attribute__((__always_inline__, __nodebug__,  __target__("sgx")))
+
+static __inline unsigned int __DEFAULT_FN_ATTRS
+_enclu_u32(unsigned int __leaf, __SIZE_TYPE__ __d[])
+{
+  unsigned int __result;
+  __asm__ ("enclu"
+   : "=a" (__result), "=b" (__d[0]), "=c" (__d[1]), "=d" (__d[2])
+   : "a" (__leaf), "b" (__d[0]), "c" (__d[1]), "d" (__d[2])
+   : "cc");
+  return __result;
+}
+
+static __inline unsigned int __DEFAULT_FN_ATTRS
+_encls_u32(unsigned int __leaf, __SIZE_TYPE__ __d[])
+{
+  unsigned int __result;
+  __asm__ ("encls"
+   : "=a" (__result), "=b" (__d[0]), "=c" (__d[1]), "=d" (__d[2])
+   : "a" (__leaf), "b" (__d[0]), "c" (__d[1]), "d" (__d[2])
+   : "cc");
+  return __result;
+}
+
+static __inline unsigned int __DEFAULT_FN_ATTRS
+_enclv_u32(unsigned int __leaf, __SIZE_TYPE__ __d[])
+{
+  unsigned int __result;
+  __asm__ ("enclv"
+   : "=a" (__result), "=b" (__d[0]), "=c" (__d[1]), "

[PATCH] D46431: [x86] Introduce the pconfig intrinsic

2018-05-07 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rC331740: [x86] Introduce the pconfig intrinsic (authored by 
GBuella, committed by ).

Repository:
  rC Clang

https://reviews.llvm.org/D46431

Files:
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/pconfigintrin.h
  lib/Headers/x86intrin.h
  test/Driver/x86-target-features.c
  test/Headers/pconfigintin.c
  test/Preprocessor/predefined-arch-macros.c

Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -105,4 +105,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__PCONFIG__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/cpuid.h
===
--- lib/Headers/cpuid.h
+++ lib/Headers/cpuid.h
@@ -194,6 +194,7 @@
 /* Features in %edx for leaf 7 sub-leaf 0 */
 #define bit_AVX5124VNNIW  0x0004
 #define bit_AVX5124FMAPS  0x0008
+#define bit_PCONFIG   0x0004
 #define bit_IBT   0x0010
 
 /* Features in %eax for leaf 13 sub-leaf 1 */
Index: lib/Headers/pconfigintrin.h
===
--- lib/Headers/pconfigintrin.h
+++ lib/Headers/pconfigintrin.h
@@ -0,0 +1,50 @@
+/*=== pconfigintrin.h - X86 platform configuration -===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef __PCONFIGINTRIN_H
+#define __PCONFIGINTRIN_H
+
+#define __PCONFIG_KEY_PROGRAM 0x0001
+
+/* Define the default attributes for the functions in this file. */
+#define __DEFAULT_FN_ATTRS \
+  __attribute__((__always_inline__, __nodebug__,  __target__("pconfig")))
+
+static __inline unsigned int __DEFAULT_FN_ATTRS
+_pconfig_u32(unsigned int __leaf, __SIZE_TYPE__ __d[])
+{
+  unsigned int __result;
+  __asm__ ("pconfig"
+   : "=a" (__result), "=b" (__d[0]), "=c" (__d[1]), "=d" (__d[2])
+   : "a" (__leaf), "b" (__d[0]), "c" (__d[1]), "d" (__d[2])
+   : "cc");
+  return __result;
+}
+
+#undef __DEFAULT_FN_ATTRS
+
+#endif
Index: lib/Headers/CMakeLists.txt
===
--- lib/Headers/CMakeLists.txt
+++ lib/Headers/CMakeLists.txt
@@ -73,6 +73,7 @@
   opencl-c.h
   pkuintrin.h
   pmmintrin.h
+  pconfigintrin.h
   popcntintrin.h
   prfchwintrin.h
   rdseedintrin.h
Index: lib/Headers/module.modulemap
===
--- lib/Headers/module.modulemap
+++ lib/Headers/module.modulemap
@@ -67,6 +67,7 @@
 textual header "cldemoteintrin.h"
 textual header "waitpkgintrin.h"
 textual header "movdirintrin.h"
+textual header "pconfigintrin.h"
 
 explicit module mm_malloc {
   requires !freestanding
Index: lib/Basic/Targets/X86.cpp
===
--- lib/Basic/Targets/X86.cpp
+++ lib/Basic/Targets/X86.cpp
@@ -154,6 +154,7 @@
 break;
 
   case CK_IcelakeServer:
+setFeatureEnabledImpl(Features, "pconfig", true);
 setFeatureEnabledImpl(Features, "wbnoinvd", true);
 LLVM_FALLTHROUGH;
   case CK_IcelakeClient:
@@ -827,6 +828,8 @@
   HasMOVDIRI = true;
 } else if (Feature == "+movdir64b") {
   HasMOVDIR64B = true;
+} else if (Feature == "+pconfig") {
+  HasPCONFIG = true;
 }
 
 X86SSEEnum Level = llvm::StringSwitch(Feature)
@@ -1187,6 +1190,8 @@
 Builder.defineMacro("__MOVDIRI__");
   if (HasMOVDIR64B)
 Builder.defineMacro("__MOVDIR64B__");
+  if (HasPCONFIG)
+Builder.defineMac

[PATCH] D46431: [x86] Introduce the pconfig intrinsic

2018-05-07 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL331740: [x86] Introduce the pconfig intrinsic (authored by 
GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D46431?vs=145207&id=145634#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D46431

Files:
  cfe/trunk/include/clang/Driver/Options.td
  cfe/trunk/lib/Basic/Targets/X86.cpp
  cfe/trunk/lib/Basic/Targets/X86.h
  cfe/trunk/lib/Headers/CMakeLists.txt
  cfe/trunk/lib/Headers/cpuid.h
  cfe/trunk/lib/Headers/module.modulemap
  cfe/trunk/lib/Headers/pconfigintrin.h
  cfe/trunk/lib/Headers/x86intrin.h
  cfe/trunk/test/Driver/x86-target-features.c
  cfe/trunk/test/Headers/pconfigintin.c
  cfe/trunk/test/Preprocessor/predefined-arch-macros.c

Index: cfe/trunk/lib/Headers/pconfigintrin.h
===
--- cfe/trunk/lib/Headers/pconfigintrin.h
+++ cfe/trunk/lib/Headers/pconfigintrin.h
@@ -0,0 +1,50 @@
+/*=== pconfigintrin.h - X86 platform configuration -===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef __PCONFIGINTRIN_H
+#define __PCONFIGINTRIN_H
+
+#define __PCONFIG_KEY_PROGRAM 0x0001
+
+/* Define the default attributes for the functions in this file. */
+#define __DEFAULT_FN_ATTRS \
+  __attribute__((__always_inline__, __nodebug__,  __target__("pconfig")))
+
+static __inline unsigned int __DEFAULT_FN_ATTRS
+_pconfig_u32(unsigned int __leaf, __SIZE_TYPE__ __d[])
+{
+  unsigned int __result;
+  __asm__ ("pconfig"
+   : "=a" (__result), "=b" (__d[0]), "=c" (__d[1]), "=d" (__d[2])
+   : "a" (__leaf), "b" (__d[0]), "c" (__d[1]), "d" (__d[2])
+   : "cc");
+  return __result;
+}
+
+#undef __DEFAULT_FN_ATTRS
+
+#endif
Index: cfe/trunk/lib/Headers/CMakeLists.txt
===
--- cfe/trunk/lib/Headers/CMakeLists.txt
+++ cfe/trunk/lib/Headers/CMakeLists.txt
@@ -73,6 +73,7 @@
   opencl-c.h
   pkuintrin.h
   pmmintrin.h
+  pconfigintrin.h
   popcntintrin.h
   prfchwintrin.h
   rdseedintrin.h
Index: cfe/trunk/lib/Headers/module.modulemap
===
--- cfe/trunk/lib/Headers/module.modulemap
+++ cfe/trunk/lib/Headers/module.modulemap
@@ -67,6 +67,7 @@
 textual header "cldemoteintrin.h"
 textual header "waitpkgintrin.h"
 textual header "movdirintrin.h"
+textual header "pconfigintrin.h"
 
 explicit module mm_malloc {
   requires !freestanding
Index: cfe/trunk/lib/Headers/x86intrin.h
===
--- cfe/trunk/lib/Headers/x86intrin.h
+++ cfe/trunk/lib/Headers/x86intrin.h
@@ -105,4 +105,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__PCONFIG__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: cfe/trunk/lib/Headers/cpuid.h
===
--- cfe/trunk/lib/Headers/cpuid.h
+++ cfe/trunk/lib/Headers/cpuid.h
@@ -194,6 +194,7 @@
 /* Features in %edx for leaf 7 sub-leaf 0 */
 #define bit_AVX5124VNNIW  0x0004
 #define bit_AVX5124FMAPS  0x0008
+#define bit_PCONFIG   0x0004
 #define bit_IBT   0x0010
 
 /* Features in %eax for leaf 13 sub-leaf 1 */
Index: cfe/trunk/lib/Basic/Targets/X86.h
===
--- cfe/trunk/lib/Basic/Targets/X86.h
+++ cfe/trunk/lib/Basic/Targets/X86.h
@@ -92,6 +92,7 @@
   bool HasMWAITX = false;
   bool HasCLZERO = false;
   bool HasCLDEMOTE = false;
+  bool HasPCONFIG = false;
   bool HasPKU = false;
   bool HasCLFLUSHOPT = false;
   bool HasCLWB = false;
Index: cfe/trunk

[PATCH] D46541: [CodeGen] Improve diagnostics related to target attributes

2018-05-07 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added reviewers: craig.topper, echristo, dblaikie.
Herald added a subscriber: cfe-commits.

When requirement imposed by __target__ attributes on functions
are not satisfied, prefer printing those requirements, which
are explicitly mentioned in the attributes.

This makes such messages more useful, e.g. printing avx512f instead of avx2
in the following scenario:

  $ cat foo.c
  static inline void __attribute__((__always_inline__, __target__("avx512f")))
  x(void)
  {
  }
  
  int main(void)
  {
x();
  }
  $ clang foo.c
  foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', 
but would be inlined into function 'main' that is compiled without support for 
'avx2'
  x();
^
  1 error generated.

bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338


Repository:
  rC Clang

https://reviews.llvm.org/D46541

Files:
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenModule.cpp
  lib/CodeGen/CodeGenModule.h
  test/CodeGen/target-features-error-2.c
  test/CodeGen/target-features-error.c

Index: test/CodeGen/target-features-error.c
===
--- test/CodeGen/target-features-error.c
+++ test/CodeGen/target-features-error.c
@@ -3,6 +3,5 @@
   return a + 4;
 }
 int bar() {
-  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'sse4.2', but would be inlined into function 'bar' that is compiled without support for 'sse4.2'}}
+  return foo(4); // expected-error {{always_inline function 'foo' requires target feature 'avx', but would be inlined into function 'bar' that is compiled without support for 'avx'}}
 }
-
Index: test/CodeGen/target-features-error-2.c
===
--- test/CodeGen/target-features-error-2.c
+++ test/CodeGen/target-features-error-2.c
@@ -3,13 +3,14 @@
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_2
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_3
 // RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX_4
+// RUN: %clang_cc1 %s -triple=x86_64-linux-gnu -S -verify -o - -D NEED_AVX512f
 
 #define __MM_MALLOC_H
 #include 
 
 #if NEED_SSE42
 int baz(__m256i a) {
-  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'sse4.2', but would be inlined into function 'baz' that is compiled without support for 'sse4.2'}}
+  return _mm256_extract_epi32(a, 3); // expected-error {{always_inline function '_mm256_extract_epi32' requires target feature 'avx', but would be inlined into function 'baz' that is compiled without support for 'avx'}}
 }
 #endif
 
@@ -36,3 +37,9 @@
   return _mm_cmp_sd(a, b, 0); // expected-error {{'__builtin_ia32_cmpsd' needs target feature avx}}
 }
 #endif
+
+#if NEED_AVX512f
+unsigned short need_avx512f(unsigned short a, unsigned short b) {
+  return __builtin_ia32_korhi(a, b); // expected-error {{'__builtin_ia32_korhi' needs target feature avx512f}}
+}
+#endif
Index: lib/CodeGen/CodeGenModule.h
===
--- lib/CodeGen/CodeGenModule.h
+++ lib/CodeGen/CodeGenModule.h
@@ -1082,6 +1082,8 @@
   /// It's up to you to ensure that this is safe.
   void AddDefaultFnAttrs(llvm::Function &F);
 
+  TargetAttr::ParsedTargetAttr getFunctionTargetAttrs(const FunctionDecl *FD);
+
   // Fills in the supplied string map with the set of target features for the
   // passed in function.
   void getFunctionFeatureMap(llvm::StringMap &FeatureMap,
Index: lib/CodeGen/CodeGenModule.cpp
===
--- lib/CodeGen/CodeGenModule.cpp
+++ lib/CodeGen/CodeGenModule.cpp
@@ -4995,11 +4995,8 @@
   }
 }
 
-// Fills in the supplied string map with the set of target features for the
-// passed in function.
-void CodeGenModule::getFunctionFeatureMap(llvm::StringMap &FeatureMap,
-  const FunctionDecl *FD) {
-  StringRef TargetCPU = Target.getTargetOpts().CPU;
+TargetAttr::ParsedTargetAttr CodeGenModule::getFunctionTargetAttrs(const FunctionDecl *FD)
+{
   if (const auto *TD = FD->getAttr()) {
 // If we have a TargetAttr build up the feature map based on that.
 TargetAttr::ParsedTargetAttr ParsedAttr = TD->parse();
@@ -5011,7 +5008,20 @@
   StringRef{Feat}.substr(1));
 }),
 ParsedAttr.Features.end());
+return ParsedAttr;
+  } else {
+return TargetAttr::ParsedTargetAttr();
+  }
+}
+
 
+// Fills in the supplied string map with the set of target features for the
+// passed in function.
+void CodeGenModule::getFunctionFeatureMap(llvm::StringMap &FeatureMap,
+  const FunctionDecl *FD) {
+  StringRef TargetCPU = Target.getTargetOpts().CPU;
+  TargetAttr::ParsedTargetAttr ParsedA

[PATCH] D46540: [X86] ptwrite intrinsic

2018-05-07 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added a reviewer: craig.topper.
Herald added subscribers: cfe-commits, mgorny.

Repository:
  rC Clang

https://reviews.llvm.org/D46540

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Basic/BuiltinsX86_64.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/ptwriteintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/ptwrite.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1400,6 +1400,7 @@
 // CHECK_GLMP_M32: #define __PCLMUL__ 1
 // CHECK_GLMP_M32: #define __POPCNT__ 1
 // CHECK_GLMP_M32: #define __PRFCHW__ 1
+// CHECK_GLMP_M32: #define __PTWRITE__ 1
 // CHECK_GLMP_M32: #define __RDPID__ 1
 // CHECK_GLMP_M32: #define __RDRND__ 1
 // CHECK_GLMP_M32: #define __RDSEED__ 1
@@ -1435,6 +1436,7 @@
 // CHECK_GLMP_M64: #define __PCLMUL__ 1
 // CHECK_GLMP_M64: #define __POPCNT__ 1
 // CHECK_GLMP_M64: #define __PRFCHW__ 1
+// CHECK_GLMP_M64: #define __PTWRITE__ 1
 // CHECK_GLMP_M64: #define __RDPID__ 1
 // CHECK_GLMP_M64: #define __RDRND__ 1
 // CHECK_GLMP_M64: #define __RDSEED__ 1
@@ -1472,6 +1474,7 @@
 // CHECK_TRM_M32: #define __PCLMUL__ 1
 // CHECK_TRM_M32: #define __POPCNT__ 1
 // CHECK_TRM_M32: #define __PRFCHW__ 1
+// CHECK_TRM_M32: #define __PTWRITE__ 1
 // CHECK_TRM_M32: #define __RDPID__ 1
 // CHECK_TRM_M32: #define __RDRND__ 1
 // CHECK_TRM_M32: #define __RDSEED__ 1
@@ -1512,6 +1515,7 @@
 // CHECK_TRM_M64: #define __PCLMUL__ 1
 // CHECK_TRM_M64: #define __POPCNT__ 1
 // CHECK_TRM_M64: #define __PRFCHW__ 1
+// CHECK_TRM_M64: #define __PTWRITE__ 1
 // CHECK_TRM_M64: #define __RDPID__ 1
 // CHECK_TRM_M64: #define __RDRND__ 1
 // CHECK_TRM_M64: #define __RDSEED__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -159,3 +159,8 @@
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
 // MOVDIR64B: "-target-feature" "+movdir64b"
 // NO-MOVDIR64B: "-target-feature" "-movdir64b"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mptwrite %s -### -o %t.o 2>&1 | FileCheck -check-prefix=PTWRITE %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-ptwrite %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PTWRITE %s
+// PTWRITE: "-target-feature" "+ptwrite"
+// NO-PTWRITE: "-target-feature" "-ptwrite"
Index: test/CodeGen/ptwrite.c
===
--- /dev/null
+++ test/CodeGen/ptwrite.c
@@ -0,0 +1,22 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -target-feature +ptwrite -emit-llvm -o - -Wall -Werror -pedantic | FileCheck %s --check-prefix=X86 --check-prefix=X86_64
+// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown -target-feature +ptwrite -emit-llvm -o - -Wall -Werror -pedantic | FileCheck %s --check-prefix=X86
+
+#include 
+
+#include 
+
+void test_ptwrite32(uint32_t value) {
+  //X86-LABEL: @test_ptwrite32
+  //X86: call void @llvm.x86.ptwrite32(i32 %{{.*}})
+  _ptwrite32(value);
+}
+
+#ifdef __x86_64__
+
+void test_ptwrite64(uint64_t value) {
+  //X86_64-LABEL: @test_ptwrite64
+  //X86_64: call void @llvm.x86.ptwrite64(i64 %{{.*}})
+  _ptwrite64(value);
+}
+
+#endif /* __x86_64__ */
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -105,4 +105,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__PTWRITE__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/ptwriteintrin.h
===
--- /dev/null
+++ lib/Headers/ptwriteintrin.h
@@ -0,0 +1,51 @@
+/*=== ptwriteintrin.h - PTWRITE intrinsic ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE W

[PATCH] D46431: [x86] Introduce the pconfig intrinsic

2018-05-04 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 145207.

https://reviews.llvm.org/D46431

Files:
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/pconfigintrin.h
  lib/Headers/x86intrin.h
  test/Driver/x86-target-features.c
  test/Headers/pconfigintin.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1204,6 +1204,7 @@
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
 // CHECK_ICX_M32: #define __PCLMUL__ 1
+// CHECK_ICX_M32: #define __PCONFIG__ 1
 // CHECK_ICX_M32: #define __PKU__ 1
 // CHECK_ICX_M32: #define __POPCNT__ 1
 // CHECK_ICX_M32: #define __PRFCHW__ 1
@@ -1261,6 +1262,7 @@
 // CHECK_ICX_M64: #define __MMX__ 1
 // CHECK_ICX_M64: #define __MPX__ 1
 // CHECK_ICX_M64: #define __PCLMUL__ 1
+// CHECK_ICX_M64: #define __PCONFIG__ 1
 // CHECK_ICX_M64: #define __PKU__ 1
 // CHECK_ICX_M64: #define __POPCNT__ 1
 // CHECK_ICX_M64: #define __PRFCHW__ 1
Index: test/Headers/pconfigintin.c
===
--- /dev/null
+++ test/Headers/pconfigintin.c
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -target-feature +pconfig -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386 -target-feature +pconfig -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-32
+
+#include 
+#include 
+#include 
+
+uint32_t test_pconfig(uint32_t leaf, size_t data[3]) {
+// CHECK-64: call { i32, i64, i64, i64 } asm "pconfig", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i64 %{{.*}}, i64 %{{.*}}, i64 %{{.*}})
+// CHECK-32: call { i32, i32, i32, i32 } asm "pconfig", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _pconfig_u32(leaf, data);
+}
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -159,3 +159,8 @@
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
 // MOVDIR64B: "-target-feature" "+movdir64b"
 // NO-MOVDIR64B: "-target-feature" "-movdir64b"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mpconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=PCONFIG %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-pconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PCONFIG %s
+// PCONFIG: "-target-feature" "+pconfig"
+// NO-PCONFIG: "-target-feature" "-pconfig"
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -105,4 +105,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__PCONFIG__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/pconfigintrin.h
===
--- /dev/null
+++ lib/Headers/pconfigintrin.h
@@ -0,0 +1,50 @@
+/*=== pconfigintrin.h - X86 platform configuration -===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef __PCONFIGINTRIN_H
+#define __PCONFIGINTRIN_H
+
+#define __PCONFIG_KEY_PROGRAM 0x0001
+
+/* Define the default attributes for the functions in this file. */
+#define __DEFAULT_FN_ATTRS \

[PATCH] D46435: [x86] Introduce the encl[u|s|v] intrinsics

2018-05-04 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added reviewers: craig.topper, zvi.
Herald added subscribers: cfe-commits, mgorny.

Repository:
  rC Clang

https://reviews.llvm.org/D46435

Files:
  lib/Headers/CMakeLists.txt
  lib/Headers/module.modulemap
  lib/Headers/sgxintrin.h
  lib/Headers/x86intrin.h
  test/Headers/sgxintrin.c

Index: test/Headers/sgxintrin.c
===
--- /dev/null
+++ test/Headers/sgxintrin.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -target-feature +sgx -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386 -target-feature +sgx -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-32
+
+#include 
+#include 
+#include 
+
+uint32_t test_encls(uint32_t leaf, size_t data[3]) {
+// CHECK-64: call { i32, i64, i64, i64 } asm "encls", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i64 %{{.*}}, i64 %{{.*}}, i64 %{{.*}})
+// CHECK-32: call { i32, i32, i32, i32 } asm "encls", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+
+  return _encls_u32(leaf, data);
+}
+
+uint32_t test_enclu(uint32_t leaf, size_t data[3]) {
+// CHECK-64: call { i32, i64, i64, i64 } asm "enclu", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i64 %{{.*}}, i64 %{{.*}}, i64 %{{.*}})
+// CHECK-32: call { i32, i32, i32, i32 } asm "enclu", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+
+  return _enclu_u32(leaf, data);
+}
+
+uint32_t test_enclv(uint32_t leaf, size_t data[3]) {
+// CHECK-64: call { i32, i64, i64, i64 } asm "enclv", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i64 %{{.*}}, i64 %{{.*}}, i64 %{{.*}})
+// CHECK-32: call { i32, i32, i32, i32 } asm "enclv", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+
+  return _enclv_u32(leaf, data);
+}
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -105,4 +105,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__SGX__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/sgxintrin.h
===
--- /dev/null
+++ lib/Headers/sgxintrin.h
@@ -0,0 +1,70 @@
+/*=== sgxintrin.h - X86 SGX intrinsics configuration ---===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef __SGXINTRIN_H
+#define __SGXINTRIN_H
+
+/* Define the default attributes for the functions in this file. */
+#define __DEFAULT_FN_ATTRS \
+  __attribute__((__always_inline__, __nodebug__,  __target__("sgx")))
+
+static __inline unsigned int __DEFAULT_FN_ATTRS
+_enclu_u32(unsigned int __leaf, __SIZE_TYPE__ __d[])
+{
+  unsigned int __result;
+  __asm__ ("enclu"
+   : "=a" (__result), "=b" (__d[0]), "=c" (__d[1]), "=d" (__d[2])
+   : "a" (__leaf), "b" (__d[0]), "c" (__d[1]), "d" (__d[2])
+   : "cc");
+  return __result;
+}
+
+static __inline unsigned int __DEFAULT_FN_ATTRS
+_encls_u32(unsigned int __leaf, __SIZE_TYPE__ __d[])
+{
+  unsigned int __result;
+  __asm__ ("encls"
+   : "=a" (__result), "=b" (__d[0]), "=c" (__d[1]), "=d" (__d[2])
+   : "a" (__leaf), "b" (__d[0]), "c" (__d[1]), "d" (__d[2])
+   : "cc");
+  return __result;
+}
+
+static __inline unsigned int __DEFAULT_FN_ATTRS
+_enclv_u32(unsigned int

[PATCH] D44387: [x86] Introduce the pconfig/encl[u|s|v] intrinsics

2018-05-04 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

Here is a variation on this, using inline asm:
https://reviews.llvm.org/D46431


https://reviews.llvm.org/D44387



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46431: [x86] Introduce the pconfig intrinsic

2018-05-04 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added reviewers: craig.topper, zvi.
Herald added subscribers: cfe-commits, mgorny.

Repository:
  rC Clang

https://reviews.llvm.org/D46431

Files:
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/pconfigintrin.h
  lib/Headers/x86intrin.h
  test/Driver/x86-target-features.c
  test/Headers/pconfigintin.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1204,6 +1204,7 @@
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
 // CHECK_ICX_M32: #define __PCLMUL__ 1
+// CHECK_ICX_M32: #define __PCONFIG__ 1
 // CHECK_ICX_M32: #define __PKU__ 1
 // CHECK_ICX_M32: #define __POPCNT__ 1
 // CHECK_ICX_M32: #define __PRFCHW__ 1
@@ -1261,6 +1262,7 @@
 // CHECK_ICX_M64: #define __MMX__ 1
 // CHECK_ICX_M64: #define __MPX__ 1
 // CHECK_ICX_M64: #define __PCLMUL__ 1
+// CHECK_ICX_M64: #define __PCONFIG__ 1
 // CHECK_ICX_M64: #define __PKU__ 1
 // CHECK_ICX_M64: #define __POPCNT__ 1
 // CHECK_ICX_M64: #define __PRFCHW__ 1
Index: test/Headers/pconfigintin.c
===
--- /dev/null
+++ test/Headers/pconfigintin.c
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -target-feature +pconfig -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386 -target-feature +pconfig -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-32
+
+#include 
+#include 
+#include 
+
+uint32_t test_pconfig(uint32_t leaf, size_t data[3]) {
+// CHECK-64: call { i32, i64, i64, i64 } asm "pconfig", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i64 %{{.*}}, i64 %{{.*}}, i64 %{{.*}})
+// CHECK-32: call { i32, i32, i32, i32 } asm "pconfig", "={ax},={bx},={cx},={dx},{ax},{bx},{cx},{dx},~{cc},~{dirflag},~{fpsr},~{flags}"(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _pconfig_u32(leaf, data);
+}
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -159,3 +159,8 @@
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
 // MOVDIR64B: "-target-feature" "+movdir64b"
 // NO-MOVDIR64B: "-target-feature" "-movdir64b"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mpconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=PCONFIG %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-pconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PCONFIG %s
+// PCONFIG: "-target-feature" "+pconfig"
+// NO-PCONFIG: "-target-feature" "-pconfig"
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -105,4 +105,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__PCONFIG__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/pconfigintrin.h
===
--- /dev/null
+++ lib/Headers/pconfigintrin.h
@@ -0,0 +1,50 @@
+/*=== pconfigintrin.h - X86 platform configuration -===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef __PCONFIGINTRIN_H
+#define __PCONFIGINTRIN_H
+
+#define __MKTME_KEY_PROGRAM 0x0001
+
+

[PATCH] D45984: [X86] directstore and movdir64b intrinsics

2018-05-01 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL331249: [X86] directstore and movdir64b intrinsics (authored 
by GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D45984?vs=143718&id=144686#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D45984

Files:
  cfe/trunk/docs/ClangCommandLineReference.rst
  cfe/trunk/include/clang/Basic/BuiltinsX86.def
  cfe/trunk/include/clang/Basic/BuiltinsX86_64.def
  cfe/trunk/include/clang/Driver/Options.td
  cfe/trunk/lib/Basic/Targets/X86.cpp
  cfe/trunk/lib/Basic/Targets/X86.h
  cfe/trunk/lib/Headers/CMakeLists.txt
  cfe/trunk/lib/Headers/cpuid.h
  cfe/trunk/lib/Headers/module.modulemap
  cfe/trunk/lib/Headers/movdirintrin.h
  cfe/trunk/lib/Headers/x86intrin.h
  cfe/trunk/test/CodeGen/builtin-movdir.c
  cfe/trunk/test/Driver/x86-target-features.c
  cfe/trunk/test/Preprocessor/predefined-arch-macros.c

Index: cfe/trunk/test/CodeGen/builtin-movdir.c
===
--- cfe/trunk/test/CodeGen/builtin-movdir.c
+++ cfe/trunk/test/CodeGen/builtin-movdir.c
@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK
+
+#include 
+#include 
+
+void test_directstore32(void *dst, uint32_t value) {
+  // CHECK-LABEL: test_directstore32
+  // CHECK: call void @llvm.x86.directstore32
+  _directstoreu_u32(dst, value);
+}
+
+#ifdef __x86_64__
+
+void test_directstore64(void *dst, uint64_t value) {
+  // X86_64-LABEL: test_directstore64
+  // X86_64: call void @llvm.x86.directstore64
+  _directstoreu_u64(dst, value);
+}
+
+#endif
+
+void test_dir64b(void *dst, const void *src) {
+  // CHECK-LABEL: test_dir64b
+  // CHECK: call void @llvm.x86.movdir64b
+  _movdir64b(dst, src);
+}
+
+// CHECK: declare void @llvm.x86.directstore32(i8*, i32)
+// X86_64: declare void @llvm.x86.directstore64(i8*, i64)
+// CHECK: declare void @llvm.x86.movdir64b(i8*, i8*)
Index: cfe/trunk/test/Preprocessor/predefined-arch-macros.c
===
--- cfe/trunk/test/Preprocessor/predefined-arch-macros.c
+++ cfe/trunk/test/Preprocessor/predefined-arch-macros.c
@@ -1466,6 +1466,8 @@
 // CHECK_TRM_M32: #define __FXSR__ 1
 // CHECK_TRM_M32: #define __GFNI__ 1
 // CHECK_TRM_M32: #define __MMX__ 1
+// CHECK_TRM_M32: #define __MOVDIR64B__ 1
+// CHECK_TRM_M32: #define __MOVDIRI__ 1
 // CHECK_TRM_M32: #define __MPX__ 1
 // CHECK_TRM_M32: #define __PCLMUL__ 1
 // CHECK_TRM_M32: #define __POPCNT__ 1
@@ -1504,6 +1506,8 @@
 // CHECK_TRM_M64: #define __FXSR__ 1
 // CHECK_TRM_M64: #define __GFNI__ 1
 // CHECK_TRM_M64: #define __MMX__ 1
+// CHECK_TRM_M64: #define __MOVDIR64B__ 1
+// CHECK_TRM_M64: #define __MOVDIRI__ 1
 // CHECK_TRM_M64: #define __MPX__ 1
 // CHECK_TRM_M64: #define __PCLMUL__ 1
 // CHECK_TRM_M64: #define __POPCNT__ 1
Index: cfe/trunk/test/Driver/x86-target-features.c
===
--- cfe/trunk/test/Driver/x86-target-features.c
+++ cfe/trunk/test/Driver/x86-target-features.c
@@ -149,3 +149,13 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIRI %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIRI %s
+// MOVDIRI: "-target-feature" "+movdiri"
+// NO-MOVDIRI: "-target-feature" "-movdiri"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIR64B %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
+// MOVDIR64B: "-target-feature" "+movdir64b"
+// NO-MOVDIR64B: "-target-feature" "-movdir64b"
Index: cfe/trunk/lib/Headers/module.modulemap
===
--- cfe/trunk/lib/Headers/module.modulemap
+++ cfe/trunk/lib/Headers/module.modulemap
@@ -66,6 +66,7 @@
 textual header "wbnoinvdintrin.h"
 textual header "cldemoteintrin.h"
 textual header "waitpkgintrin.h"
+textual header "movdirintrin.h"
 
 explicit module mm_malloc {
   requires !freestanding
Index: cfe/trunk/lib/Headers/CMakeLists.txt
===
--- cfe/trunk/lib/Headers/CMakeLists.txt
+++ cfe/trunk/lib/Heade

[PATCH] D44387: [x86] Introduce the pconfig/encl[u|s|v] intrinsics

2018-04-26 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 144113.

https://reviews.llvm.org/D44387

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/CodeGen/CGBuiltin.cpp
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/pconfigintrin.h
  lib/Headers/sgxintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtins-x86.c
  test/CodeGen/pconfig.c
  test/CodeGen/sgx.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1204,6 +1204,7 @@
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
 // CHECK_ICX_M32: #define __PCLMUL__ 1
+// CHECK_ICX_M32: #define __PCONFIG__ 1
 // CHECK_ICX_M32: #define __PKU__ 1
 // CHECK_ICX_M32: #define __POPCNT__ 1
 // CHECK_ICX_M32: #define __PRFCHW__ 1
@@ -1261,6 +1262,7 @@
 // CHECK_ICX_M64: #define __MMX__ 1
 // CHECK_ICX_M64: #define __MPX__ 1
 // CHECK_ICX_M64: #define __PCLMUL__ 1
+// CHECK_ICX_M64: #define __PCONFIG__ 1
 // CHECK_ICX_M64: #define __PKU__ 1
 // CHECK_ICX_M64: #define __POPCNT__ 1
 // CHECK_ICX_M64: #define __PRFCHW__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,8 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mpconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=PCONFIG %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-pconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PCONFIG %s
+// PCONFIG: "-target-feature" "+pconfig"
+// NO-PCONFIG: "-target-feature" "-pconfig"
Index: test/CodeGen/sgx.c
===
--- /dev/null
+++ test/CodeGen/sgx.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -emit-llvm -target-feature +sgx -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386-unknown-unknown -emit-llvm -target-feature +sgx -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-32
+
+#include 
+
+#include 
+
+unsigned int test_encls(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_encls
+  //CHECK-64: @llvm.x86.encls.64
+  //CHECK-32: @llvm.x86.encls.32
+  return _encls_u32(leaf, arguments);
+}
+
+unsigned int test_enclu(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_enclu
+  //CHECK-64: @llvm.x86.enclu.64
+  //CHECK-32: @llvm.x86.enclu.32
+  return _enclu_u32(leaf, arguments);
+}
+
+unsigned int test_enclv(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_enclv
+  //CHECK-64: @llvm.x86.enclv.64
+  //CHECK-32: @llvm.x86.enclv.32
+  return _enclv_u32(leaf, arguments);
+}
Index: test/CodeGen/pconfig.c
===
--- /dev/null
+++ test/CodeGen/pconfig.c
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-apple-darwin -emit-llvm -target-feature +pconfig -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386 -emit-llvm -target-feature +pconfig -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-32
+
+#include 
+
+#include 
+
+unsigned int test_pconfig(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_pconfig
+  //CHECK-64: @llvm.x86.pconfig.64
+  //CHECK-32: @llvm.x86.pconfig.32
+  return _pconfig_u32(leaf, arguments);
+}
+
Index: test/CodeGen/builtins-x86.c
===
--- test/CodeGen/builtins-x86.c
+++ test/CodeGen/builtins-x86.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -target-feature +wbnoinvd -target-feature +cldemote -emit-llvm -o %t %s
-// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -target-feature +wbnoinvd -target-feature +cldemote -fsyntax-only -o %t %s
+// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-fea

[PATCH] D44387: [x86] Introduce the pconfig/encl[u|s|v] intrinsics

2018-04-26 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 144112.

https://reviews.llvm.org/D44387

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/CodeGen/CGBuiltin.cpp
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/pconfigintrin.h
  lib/Headers/sgxintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtins-x86.c
  test/CodeGen/pconfig.c
  test/CodeGen/sgx.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1204,6 +1204,7 @@
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
 // CHECK_ICX_M32: #define __PCLMUL__ 1
+// CHECK_ICX_M32: #define __PCONFIG__ 1
 // CHECK_ICX_M32: #define __PKU__ 1
 // CHECK_ICX_M32: #define __POPCNT__ 1
 // CHECK_ICX_M32: #define __PRFCHW__ 1
@@ -1261,6 +1262,7 @@
 // CHECK_ICX_M64: #define __MMX__ 1
 // CHECK_ICX_M64: #define __MPX__ 1
 // CHECK_ICX_M64: #define __PCLMUL__ 1
+// CHECK_ICX_M64: #define __PCONFIG__ 1
 // CHECK_ICX_M64: #define __PKU__ 1
 // CHECK_ICX_M64: #define __POPCNT__ 1
 // CHECK_ICX_M64: #define __PRFCHW__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,8 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mpconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=PCONFIG %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-pconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PCONFIG %s
+// PCONFIG: "-target-feature" "+pconfig"
+// NO-PCONFIG: "-target-feature" "-pconfig"
Index: test/CodeGen/sgx.c
===
--- /dev/null
+++ test/CodeGen/sgx.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -emit-llvm -target-feature +sgx -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386-unknown-unknown -emit-llvm -target-feature +sgx -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-32
+
+#include 
+
+#include 
+
+unsigned int test_encls(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_encls
+  //CHECK-64: @llvm.x86.encls.64
+  //CHECK-32: @llvm.x86.encls.32
+  return _encls_u32(leaf, arguments);
+}
+
+unsigned int test_enclu(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_enclu
+  //CHECK-64: @llvm.x86.enclu.64
+  //CHECK-32: @llvm.x86.enclu.32
+  return _enclu_u32(leaf, arguments);
+}
+
+unsigned int test_enclv(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_enclv
+  //CHECK-64: @llvm.x86.enclv.64
+  //CHECK-32: @llvm.x86.enclv.32
+  return _enclv_u32(leaf, arguments);
+}
Index: test/CodeGen/pconfig.c
===
--- /dev/null
+++ test/CodeGen/pconfig.c
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-apple-darwin -emit-llvm -target-feature +pconfig -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386 -emit-llvm -target-feature +pconfig -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-32
+
+#include 
+
+#include 
+
+unsigned int test_pconfig(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_pconfig
+  //CHECK-64: @llvm.x86.pconfig.64
+  //CHECK-32: @llvm.x86.pconfig.32
+  return _pconfig_u32(leaf, arguments);
+}
+
Index: test/CodeGen/builtins-x86.c
===
--- test/CodeGen/builtins-x86.c
+++ test/CodeGen/builtins-x86.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -target-feature +wbnoinvd -target-feature +cldemote -emit-llvm -o %t %s
-// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -target-feature +wbnoinvd -target-feature +cldemote -fsyntax-only -o %t %s
+// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-fea

[PATCH] D45984: [X86] directstore and movdir64b intrinsics

2018-04-24 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 143718.

https://reviews.llvm.org/D45984

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Basic/BuiltinsX86.def
  include/clang/Basic/BuiltinsX86_64.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/movdirintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtin-movdir.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1466,6 +1466,8 @@
 // CHECK_TRM_M32: #define __FXSR__ 1
 // CHECK_TRM_M32: #define __GFNI__ 1
 // CHECK_TRM_M32: #define __MMX__ 1
+// CHECK_TRM_M32: #define __MOVDIR64B__ 1
+// CHECK_TRM_M32: #define __MOVDIRI__ 1
 // CHECK_TRM_M32: #define __MPX__ 1
 // CHECK_TRM_M32: #define __PCLMUL__ 1
 // CHECK_TRM_M32: #define __POPCNT__ 1
@@ -1504,6 +1506,8 @@
 // CHECK_TRM_M64: #define __FXSR__ 1
 // CHECK_TRM_M64: #define __GFNI__ 1
 // CHECK_TRM_M64: #define __MMX__ 1
+// CHECK_TRM_M64: #define __MOVDIR64B__ 1
+// CHECK_TRM_M64: #define __MOVDIRI__ 1
 // CHECK_TRM_M64: #define __MPX__ 1
 // CHECK_TRM_M64: #define __PCLMUL__ 1
 // CHECK_TRM_M64: #define __POPCNT__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,13 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIRI %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIRI %s
+// MOVDIRI: "-target-feature" "+movdiri"
+// NO-MOVDIRI: "-target-feature" "-movdiri"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIR64B %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
+// MOVDIR64B: "-target-feature" "+movdir64b"
+// NO-MOVDIR64B: "-target-feature" "-movdir64b"
Index: test/CodeGen/builtin-movdir.c
===
--- /dev/null
+++ test/CodeGen/builtin-movdir.c
@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -Wall -pedantic -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK
+
+#include 
+#include 
+
+void test_directstore32(void *dst, uint32_t value) {
+  // CHECK-LABEL: test_directstore32
+  // CHECK: call void @llvm.x86.directstore32
+  _directstoreu_u32(dst, value);
+}
+
+#ifdef __x86_64__
+
+void test_directstore64(void *dst, uint64_t value) {
+  // X86_64-LABEL: test_directstore64
+  // X86_64: call void @llvm.x86.directstore64
+  _directstoreu_u64(dst, value);
+}
+
+#endif
+
+void test_dir64b(void *dst, const void *src) {
+  // CHECK-LABEL: test_dir64b
+  // CHECK: call void @llvm.x86.movdir64b
+  _movdir64b(dst, src);
+}
+
+// CHECK: declare void @llvm.x86.directstore32(i8*, i32)
+// X86_64: declare void @llvm.x86.directstore64(i8*, i64)
+// CHECK: declare void @llvm.x86.movdir64b(i8*, i8*)
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -100,4 +100,9 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || \
+  defined(__MOVDIRI__) || defined(__MOVDIR64B__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/movdirintrin.h
===
--- /dev/null
+++ lib/Headers/movdirintrin.h
@@ -0,0 +1,58 @@
+/*===- movdirintrin.h --===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions

[PATCH] D45984: [X86] directstore and movdir64b intrinsics

2018-04-23 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 143647.

https://reviews.llvm.org/D45984

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Basic/BuiltinsX86.def
  include/clang/Basic/BuiltinsX86_64.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/movdirintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtin-movdir.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1466,6 +1466,8 @@
 // CHECK_TRM_M32: #define __FXSR__ 1
 // CHECK_TRM_M32: #define __GFNI__ 1
 // CHECK_TRM_M32: #define __MMX__ 1
+// CHECK_TRM_M32: #define __MOVDIR64B__ 1
+// CHECK_TRM_M32: #define __MOVDIRI__ 1
 // CHECK_TRM_M32: #define __MPX__ 1
 // CHECK_TRM_M32: #define __PCLMUL__ 1
 // CHECK_TRM_M32: #define __POPCNT__ 1
@@ -1504,6 +1506,8 @@
 // CHECK_TRM_M64: #define __FXSR__ 1
 // CHECK_TRM_M64: #define __GFNI__ 1
 // CHECK_TRM_M64: #define __MMX__ 1
+// CHECK_TRM_M64: #define __MOVDIR64B__ 1
+// CHECK_TRM_M64: #define __MOVDIRI__ 1
 // CHECK_TRM_M64: #define __MPX__ 1
 // CHECK_TRM_M64: #define __PCLMUL__ 1
 // CHECK_TRM_M64: #define __POPCNT__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,13 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIRI %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIRI %s
+// MOVDIRI: "-target-feature" "+movdiri"
+// NO-MOVDIRI: "-target-feature" "-movdiri"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIR64B %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
+// MOVDIR64B: "-target-feature" "+movdir64b"
+// NO-MOVDIR64B: "-target-feature" "-movdir64b"
Index: test/CodeGen/builtin-movdir.c
===
--- /dev/null
+++ test/CodeGen/builtin-movdir.c
@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -ffreestanding -triple x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK
+
+#include 
+#include 
+
+void test_directstore32(void *dst, uint32_t value) {
+  // CHECK-LABEL: test_directstore32
+  // CHECK: call void @llvm.x86.directstore32
+  _directstoreu_u32(dst, value);
+}
+
+#ifdef __x86_64__
+
+void test_directstore64(void *dst, uint64_t value) {
+  // X86_64-LABEL: test_directstore64
+  // X86_64: call void @llvm.x86.directstore64
+  _directstoreu_u64(dst, value);
+}
+
+#endif
+
+void test_dir64b(void *dst, const void *src) {
+  // CHECK-LABEL: test_dir64b
+  // CHECK: call void @llvm.x86.movdir64b
+  _movdir64b(dst, src);
+}
+
+// CHECK: declare void @llvm.x86.directstore32(i8*, i32)
+// X86_64: declare void @llvm.x86.directstore64(i8*, i64)
+// CHECK: declare void @llvm.x86.movdir64b(i8*, i8*)
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -100,4 +100,9 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || \
+  defined(__MOVDIRI__) || defined(__MOVDIR64B__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/movdirintrin.h
===
--- /dev/null
+++ lib/Headers/movdirintrin.h
@@ -0,0 +1,54 @@
+/*===- movdirintrin.h --===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOF

[PATCH] D45984: [X86] directstore and movdir64b intrinsics

2018-04-23 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 143644.

https://reviews.llvm.org/D45984

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Basic/BuiltinsX86.def
  include/clang/Basic/BuiltinsX86_64.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/movdirintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtin-movdir.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1466,6 +1466,8 @@
 // CHECK_TRM_M32: #define __FXSR__ 1
 // CHECK_TRM_M32: #define __GFNI__ 1
 // CHECK_TRM_M32: #define __MMX__ 1
+// CHECK_TRM_M32: #define __MOVDIR64B__ 1
+// CHECK_TRM_M32: #define __MOVDIRI__ 1
 // CHECK_TRM_M32: #define __MPX__ 1
 // CHECK_TRM_M32: #define __PCLMUL__ 1
 // CHECK_TRM_M32: #define __POPCNT__ 1
@@ -1504,6 +1506,8 @@
 // CHECK_TRM_M64: #define __FXSR__ 1
 // CHECK_TRM_M64: #define __GFNI__ 1
 // CHECK_TRM_M64: #define __MMX__ 1
+// CHECK_TRM_M64: #define __MOVDIR64B__ 1
+// CHECK_TRM_M64: #define __MOVDIRI__ 1
 // CHECK_TRM_M64: #define __MPX__ 1
 // CHECK_TRM_M64: #define __PCLMUL__ 1
 // CHECK_TRM_M64: #define __POPCNT__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,13 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIRI %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIRI %s
+// MOVDIRI: "-target-feature" "+movdiri"
+// NO-MOVDIRI: "-target-feature" "-movdiri"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIR64B %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
+// MOVDIR64B: "-target-feature" "+movdir64b"
+// NO-MOVDIR64B: "-target-feature" "-movdir64b"
Index: test/CodeGen/builtin-movdir.c
===
--- /dev/null
+++ test/CodeGen/builtin-movdir.c
@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -ffreestanding -triple x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK
+
+#include 
+#include 
+
+void test_directstore32(void *dst, uint32_t value) {
+  // CHECK-LABEL: test_directstore32
+  // CHECK: call void @llvm.x86.directstore32
+  _directstoreu_u32(dst, value);
+}
+
+#ifdef __x86_64__
+
+void test_directstore64(void *dst, uint64_t value) {
+  // X86_64-LABEL: test_directstore64
+  // X86_64: call void @llvm.x86.directstore64
+  _directstoreu_u64(dst, value);
+}
+
+#endif
+
+void test_dir64b(void *dst, const void *src) {
+  // CHECK-LABEL: test_dir64b
+  // CHECK: call void @llvm.x86.movdir64b
+  _movdir64b(dst, src);
+}
+
+// CHECK: declare void @llvm.x86.directstore32(i8*, i32)
+// X86_64: declare void @llvm.x86.directstore64(i8*, i64)
+// CHECK: declare void @llvm.x86.movdir64b(i8*, i8*)
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -100,4 +100,9 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || \
+  defined(__MOVDIRI__) || defined(__MOVDIR64B__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/movdirintrin.h
===
--- /dev/null
+++ lib/Headers/movdirintrin.h
@@ -0,0 +1,54 @@
+/*===- movdirintrin.h --===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOF

[PATCH] D45984: [X86] directstore and movdir64b intrinsics

2018-04-23 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 143642.

https://reviews.llvm.org/D45984

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Basic/BuiltinsX86.def
  include/clang/Basic/BuiltinsX86_64.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/movdirintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtin-movdir.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1466,6 +1466,8 @@
 // CHECK_TRM_M32: #define __FXSR__ 1
 // CHECK_TRM_M32: #define __GFNI__ 1
 // CHECK_TRM_M32: #define __MMX__ 1
+// CHECK_TRM_M32: #define __MOVDIR64B__ 1
+// CHECK_TRM_M32: #define __MOVDIRI__ 1
 // CHECK_TRM_M32: #define __MPX__ 1
 // CHECK_TRM_M32: #define __PCLMUL__ 1
 // CHECK_TRM_M32: #define __POPCNT__ 1
@@ -1504,6 +1506,8 @@
 // CHECK_TRM_M64: #define __FXSR__ 1
 // CHECK_TRM_M64: #define __GFNI__ 1
 // CHECK_TRM_M64: #define __MMX__ 1
+// CHECK_TRM_M64: #define __MOVDIR64B__ 1
+// CHECK_TRM_M64: #define __MOVDIRI__ 1
 // CHECK_TRM_M64: #define __MPX__ 1
 // CHECK_TRM_M64: #define __PCLMUL__ 1
 // CHECK_TRM_M64: #define __POPCNT__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,13 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIRI %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIRI %s
+// MOVDIRI: "-target-feature" "+movdiri"
+// NO-MOVDIRI: "-target-feature" "-movdiri"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIR64B %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
+// MOVDIR64B: "-target-feature" "+movdir64b"
+// NO-MOVDIR64B: "-target-feature" "-movdir64b"
Index: test/CodeGen/builtin-movdir.c
===
--- /dev/null
+++ test/CodeGen/builtin-movdir.c
@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -ffreestanding -triple x86_64-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -triple i386-unknown-unknown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK
+
+#include 
+#include 
+
+void test_directstore32(void *dst, uint32_t value) {
+  // CHECK-LABEL: test_directstore32
+  // CHECK: call void @llvm.x86.directstore32
+  _directstoreu_u32(dst, value);
+}
+
+#ifdef __x86_64__
+
+void test_directstore64(void *dst, uint64_t value) {
+  // X86_64-LABEL: test_directstore64
+  // X86_64: call void @llvm.x86.directstore64
+  _directstoreu_u64(dst, value);
+}
+
+#endif
+
+void test_dir64b(void *dst, const void *src) {
+  // CHECK-LABEL: test_dir64b
+  // CHECK: call void @llvm.x86.movdir64b
+  _movdir64b(dst, src);
+}
+
+// CHECK: declare void @llvm.x86.directstore32(i8*, i32)
+// X86_64: declare void @llvm.x86.directstore64(i8*, i64)
+// CHECK: declare void @llvm.x86.movdir64b(i8*, i8*)
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -100,4 +100,9 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || \
+  defined(__MOVDIRI__) || defined(__MOVDIR64B__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/movdirintrin.h
===
--- /dev/null
+++ lib/Headers/movdirintrin.h
@@ -0,0 +1,54 @@
+/*===- movdirintrin.h --===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOF

[PATCH] D44387: [x86] Introduce the pconfig/encl[u|s|v] intrinsics

2018-04-23 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 143640.

https://reviews.llvm.org/D44387

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/CodeGen/CGBuiltin.cpp
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/pconfigintrin.h
  lib/Headers/sgxintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtins-x86.c
  test/CodeGen/pconfig.c
  test/CodeGen/sgx.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1204,6 +1204,7 @@
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
 // CHECK_ICX_M32: #define __PCLMUL__ 1
+// CHECK_ICX_M32: #define __PCONFIG__ 1
 // CHECK_ICX_M32: #define __PKU__ 1
 // CHECK_ICX_M32: #define __POPCNT__ 1
 // CHECK_ICX_M32: #define __PRFCHW__ 1
@@ -1261,6 +1262,7 @@
 // CHECK_ICX_M64: #define __MMX__ 1
 // CHECK_ICX_M64: #define __MPX__ 1
 // CHECK_ICX_M64: #define __PCLMUL__ 1
+// CHECK_ICX_M64: #define __PCONFIG__ 1
 // CHECK_ICX_M64: #define __PKU__ 1
 // CHECK_ICX_M64: #define __POPCNT__ 1
 // CHECK_ICX_M64: #define __PRFCHW__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,8 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mpconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=PCONFIG %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-pconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PCONFIG %s
+// PCONFIG: "-target-feature" "+pconfig"
+// NO-PCONFIG: "-target-feature" "-pconfig"
Index: test/CodeGen/sgx.c
===
--- /dev/null
+++ test/CodeGen/sgx.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -emit-llvm -target-feature +sgx -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386-unknown-unknown -emit-llvm -target-feature +sgx -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-32
+
+#include 
+
+#include 
+
+unsigned int test_encls(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_encls
+  //CHECK-64: @llvm.x86.encls.64
+  //CHECK-32: @llvm.x86.encls.32
+  return _encls_u32(leaf, arguments);
+}
+
+unsigned int test_enclu(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_enclu
+  //CHECK-64: @llvm.x86.enclu.64
+  //CHECK-32: @llvm.x86.enclu.32
+  return _enclu_u32(leaf, arguments);
+}
+
+unsigned int test_enclv(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_enclv
+  //CHECK-64: @llvm.x86.enclv.64
+  //CHECK-32: @llvm.x86.enclv.32
+  return _enclv_u32(leaf, arguments);
+}
Index: test/CodeGen/pconfig.c
===
--- /dev/null
+++ test/CodeGen/pconfig.c
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-apple-darwin -emit-llvm -target-feature +pconfig -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386 -emit-llvm -target-feature +pconfig -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-32
+
+#include 
+
+#include 
+
+unsigned int test_pconfig(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_pconfig
+  //CHECK-64: @llvm.x86.pconfig.64
+  //CHECK-32: @llvm.x86.pconfig.32
+  return _pconfig_u32(leaf, arguments);
+}
+
Index: test/CodeGen/builtins-x86.c
===
--- test/CodeGen/builtins-x86.c
+++ test/CodeGen/builtins-x86.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -target-feature +wbnoinvd -target-feature +cldemote -emit-llvm -o %t %s
-// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -target-feature +wbnoinvd -target-feature +cldemote -fsyntax-only -o %t %s
+// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-fea

[PATCH] D44387: [x86] Introduce the pconfig/encl[u|s|v] intrinsics

2018-04-23 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 143639.
GBuella added a comment.

Rebased the patch.
Added pconfig to Icelake Server.


https://reviews.llvm.org/D44387

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/CodeGen/CGBuiltin.cpp
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/module.modulemap
  lib/Headers/pconfigintrin.h
  lib/Headers/sgxintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtins-x86.c
  test/CodeGen/pconfig.c
  test/CodeGen/sgx.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1204,6 +1204,7 @@
 // CHECK_ICX_M32: #define __MMX__ 1
 // CHECK_ICX_M32: #define __MPX__ 1
 // CHECK_ICX_M32: #define __PCLMUL__ 1
+// CHECK_ICX_M32: #define __PCONFIG__ 1
 // CHECK_ICX_M32: #define __PKU__ 1
 // CHECK_ICX_M32: #define __POPCNT__ 1
 // CHECK_ICX_M32: #define __PRFCHW__ 1
@@ -1261,6 +1262,7 @@
 // CHECK_ICX_M64: #define __MMX__ 1
 // CHECK_ICX_M64: #define __MPX__ 1
 // CHECK_ICX_M64: #define __PCLMUL__ 1
+// CHECK_ICX_M64: #define __PCONFIG__ 1
 // CHECK_ICX_M64: #define __PKU__ 1
 // CHECK_ICX_M64: #define __POPCNT__ 1
 // CHECK_ICX_M64: #define __PRFCHW__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,8 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mpconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=PCONFIG %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-pconfig %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-PCONFIG %s
+// PCONFIG: "-target-feature" "+pconfig"
+// NO-PCONFIG: "-target-feature" "-pconfig"
Index: test/CodeGen/sgx.c
===
--- /dev/null
+++ test/CodeGen/sgx.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -emit-llvm -target-feature +sgx -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386-unknown-unknown -emit-llvm -target-feature +sgx -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-32
+
+#include 
+
+#include 
+
+unsigned int test_encls(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_encls
+  //CHECK-64: @llvm.x86.encls.64
+  //CHECK-32: @llvm.x86.encls.32
+  return _encls_u32(leaf, arguments);
+}
+
+unsigned int test_enclu(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_enclu
+  //CHECK-64: @llvm.x86.enclu.64
+  //CHECK-32: @llvm.x86.enclu.32
+  return _enclu_u32(leaf, arguments);
+}
+
+unsigned int test_enclv(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_enclv
+  //CHECK-64: @llvm.x86.enclv.64
+  //CHECK-32: @llvm.x86.enclv.32
+  return _enclv_u32(leaf, arguments);
+}
Index: test/CodeGen/pconfig.c
===
--- /dev/null
+++ test/CodeGen/pconfig.c
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-apple-darwin -emit-llvm -target-feature +pconfig -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-64
+// RUN: %clang_cc1 %s -ffreestanding -triple i386 -emit-llvm -target-feature +pconfig -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-32
+
+#include 
+
+#include 
+
+unsigned int test_pconfig(unsigned int leaf, size_t arguments[]) {
+  //CHECK-LABEL: @test_pconfig
+  //CHECK-64: @llvm.x86.pconfig.64
+  //CHECK-32: @llvm.x86.pconfig.32
+  return _pconfig_u32(leaf, arguments);
+}
+
Index: test/CodeGen/builtins-x86.c
===
--- test/CodeGen/builtins-x86.c
+++ test/CodeGen/builtins-x86.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -target-feature +wbnoinvd -target-feature +cldemote -emit-llvm -o %t %s
-// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -target-feature +wbnoinvd -target-feature +cldemote -fsyntax-only -o %t %s
+// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -ta

[PATCH] D45984: [X86] directstore and movdir64b intrinsics

2018-04-23 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added a reviewer: craig.topper.
Herald added subscribers: cfe-commits, mgorny.

Repository:
  rC Clang

https://reviews.llvm.org/D45984

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/movdirintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtin-movdir.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1466,6 +1466,8 @@
 // CHECK_TRM_M32: #define __FXSR__ 1
 // CHECK_TRM_M32: #define __GFNI__ 1
 // CHECK_TRM_M32: #define __MMX__ 1
+// CHECK_TRM_M32: #define __MOVDIR64B__ 1
+// CHECK_TRM_M32: #define __MOVDIRI__ 1
 // CHECK_TRM_M32: #define __MPX__ 1
 // CHECK_TRM_M32: #define __PCLMUL__ 1
 // CHECK_TRM_M32: #define __POPCNT__ 1
@@ -1504,6 +1506,8 @@
 // CHECK_TRM_M64: #define __FXSR__ 1
 // CHECK_TRM_M64: #define __GFNI__ 1
 // CHECK_TRM_M64: #define __MMX__ 1
+// CHECK_TRM_M64: #define __MOVDIR64B__ 1
+// CHECK_TRM_M64: #define __MOVDIRI__ 1
 // CHECK_TRM_M64: #define __MPX__ 1
 // CHECK_TRM_M64: #define __PCLMUL__ 1
 // CHECK_TRM_M64: #define __POPCNT__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -149,3 +149,13 @@
 // RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
 // WAITPKG: "-target-feature" "+waitpkg"
 // NO-WAITPKG: "-target-feature" "-waitpkg"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIRI %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdiri %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIRI %s
+// MOVDIRI: "-target-feature" "+movdiri"
+// NO-MOVDIRI: "-target-feature" "-movdiri"
+
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVDIR64B %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movdir64b %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVDIR64B %s
+// MOVDIR64B: "-target-feature" "+movdir64b"
+// NO-MOVDIR64B: "-target-feature" "-movdir64b"
Index: test/CodeGen/builtin-movdir.c
===
--- /dev/null
+++ test/CodeGen/builtin-movdir.c
@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -ffreestanding -triple x86_64-unkown-unkown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=X86_64 --check-prefix=CHECK
+// RUN: %clang_cc1 -ffreestanding -triple i386-unkown-unkown -target-feature +movdiri -target-feature +movdir64b %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK
+
+#include 
+#include 
+
+void test_directstore32(void *dst, uint32_t value) {
+  // CHECK-LABEL: test_directstore32
+  // CHECK: call void @llvm.x86.directstore32
+  _directstoreu_u32(dst, value);
+}
+
+#ifdef __x86_64__
+
+void test_directstore64(void *dst, uint64_t value) {
+  // X86_64-LABEL: test_directstore64
+  // X86_64: call void @llvm.x86.directstore64
+  _directstoreu_u64(dst, value);
+}
+
+#endif
+
+void test_dir64b(void *dst, const void *src) {
+  // CHECK-LABEL: test_dir64b
+  // CHECK: call void @llvm.x86.movdir64b
+  _movdir64b(dst, src);
+}
+
+// CHECK: declare void @llvm.x86.directstore32(i8*, i32)
+// X86_64: declare void @llvm.x86.directstore64(i8*, i64)
+// CHECK: declare void @llvm.x86.movdir64b(i8*, i8*)
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -100,4 +100,9 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || \
+  defined(__MOVDIRI__) || defined(__MOVDIR64B__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/movdirintrin.h
===
--- /dev/null
+++ lib/Headers/movdirintrin.h
@@ -0,0 +1,66 @@
+/*===- movdirintrin.h --===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or su

[PATCH] D45254: [X86] WaitPKG intrinsics

2018-04-20 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL330463: [X86] WaitPKG intrinsics (authored by GBuella, 
committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D45254?vs=143249&id=143355#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D45254

Files:
  cfe/trunk/include/clang/Basic/BuiltinsX86.def
  cfe/trunk/include/clang/Driver/Options.td
  cfe/trunk/lib/Basic/Targets/X86.cpp
  cfe/trunk/lib/Basic/Targets/X86.h
  cfe/trunk/lib/Headers/CMakeLists.txt
  cfe/trunk/lib/Headers/cpuid.h
  cfe/trunk/lib/Headers/waitpkgintrin.h
  cfe/trunk/lib/Headers/x86intrin.h
  cfe/trunk/test/CodeGen/waitpkg.c
  cfe/trunk/test/Driver/x86-target-features.c
  cfe/trunk/test/Preprocessor/predefined-arch-macros.c

Index: cfe/trunk/test/CodeGen/waitpkg.c
===
--- cfe/trunk/test/CodeGen/waitpkg.c
+++ cfe/trunk/test/CodeGen/waitpkg.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -emit-llvm -target-feature +waitpkg -Wall -pedantic -o - | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple i386-unknown-unknown -emit-llvm -target-feature +waitpkg -Wall -pedantic -o - | FileCheck %s
+
+#include 
+
+#include 
+#include 
+
+void test_umonitor(void *address) {
+  //CHECK-LABEL: @test_umonitor
+  //CHECK: call void @llvm.x86.umonitor(i8* %{{.*}})
+  return _umonitor(address);
+}
+
+uint8_t test_umwait(uint32_t control, uint64_t counter) {
+  //CHECK-LABEL: @test_umwait
+  //CHECK: call i8 @llvm.x86.umwait(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _umwait(control, counter);
+}
+
+uint8_t test_tpause(uint32_t control, uint64_t counter) {
+  //CHECK-LABEL: @test_tpause
+  //CHECK: call i8 @llvm.x86.tpause(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _tpause(control, counter);
+}
Index: cfe/trunk/test/Preprocessor/predefined-arch-macros.c
===
--- cfe/trunk/test/Preprocessor/predefined-arch-macros.c
+++ cfe/trunk/test/Preprocessor/predefined-arch-macros.c
@@ -1482,6 +1482,7 @@
 // CHECK_TRM_M32: #define __SSE_MATH__ 1
 // CHECK_TRM_M32: #define __SSE__ 1
 // CHECK_TRM_M32: #define __SSSE3__ 1
+// CHECK_TRM_M32: #define __WAITPKG__ 1
 // CHECK_TRM_M32: #define __XSAVEC__ 1
 // CHECK_TRM_M32: #define __XSAVEOPT__ 1
 // CHECK_TRM_M32: #define __XSAVES__ 1
@@ -1518,6 +1519,7 @@
 // CHECK_TRM_M64: #define __SSE4_2__ 1
 // CHECK_TRM_M64: #define __SSE__ 1
 // CHECK_TRM_M64: #define __SSSE3__ 1
+// CHECK_TRM_M64: #define __WAITPKG__ 1
 // CHECK_TRM_M64: #define __XSAVEC__ 1
 // CHECK_TRM_M64: #define __XSAVEOPT__ 1
 // CHECK_TRM_M64: #define __XSAVES__ 1
Index: cfe/trunk/test/Driver/x86-target-features.c
===
--- cfe/trunk/test/Driver/x86-target-features.c
+++ cfe/trunk/test/Driver/x86-target-features.c
@@ -144,3 +144,8 @@
 // RUN: %clang -target i386-linux-gnu -mretpoline -mno-retpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-RETPOLINE-EXTERNAL-THUNK %s
 // RETPOLINE-EXTERNAL-THUNK: "-target-feature" "+retpoline-external-thunk"
 // NO-RETPOLINE-EXTERNAL-THUNK: "-target-feature" "-retpoline-external-thunk"
+
+// RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WAITPKG %s
+// RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
+// WAITPKG: "-target-feature" "+waitpkg"
+// NO-WAITPKG: "-target-feature" "-waitpkg"
Index: cfe/trunk/lib/Headers/CMakeLists.txt
===
--- cfe/trunk/lib/Headers/CMakeLists.txt
+++ cfe/trunk/lib/Headers/CMakeLists.txt
@@ -96,6 +96,7 @@
   varargs.h
   vecintrin.h
   vpclmulqdqintrin.h
+  waitpkgintrin.h
   wbnoinvdintrin.h
   wmmintrin.h
   __wmmintrin_aes.h
Index: cfe/trunk/lib/Headers/x86intrin.h
===
--- cfe/trunk/lib/Headers/x86intrin.h
+++ cfe/trunk/lib/Headers/x86intrin.h
@@ -96,4 +96,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__WAITPKG__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: cfe/trunk/lib/Headers/cpuid.h
===
--- cfe/trunk/lib/Headers/cpuid.h
+++ cfe/trunk/lib/Headers/cpuid.h
@@ -177,6 +177,7 @@
 #define bit_AVX512VBMI   0x0002
 #define bit_PKU  0x0004
 #define bit_OSPKE0x0010
+#define bit_WAITPKG  0x0020
 #define bit_AVX512VBMI2  0x0040
 #define bit_SHSTK0x0080
 #define bit_GFNI 0x0100
Index: cfe/trunk/lib/Headers/waitpkgintrin.h
===
--- cfe/trunk/lib/Headers/waitpkgintrin.h
+++ cfe/trunk/lib/Headers/waitpkgintrin.h
@@ -0,0 +1,56 @@
+/*===

[PATCH] D45254: [X86] WaitPKG intrinsics

2018-04-20 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 143249.
GBuella retitled this revision from "[X86][WAITPKG] WaitPKG intrinsics" to 
"[X86] WaitPKG intrinsics".

https://reviews.llvm.org/D45254

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/waitpkgintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/waitpkg.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1482,6 +1482,7 @@
 // CHECK_TRM_M32: #define __SSE_MATH__ 1
 // CHECK_TRM_M32: #define __SSE__ 1
 // CHECK_TRM_M32: #define __SSSE3__ 1
+// CHECK_TRM_M32: #define __WAITPKG__ 1
 // CHECK_TRM_M32: #define __XSAVEC__ 1
 // CHECK_TRM_M32: #define __XSAVEOPT__ 1
 // CHECK_TRM_M32: #define __XSAVES__ 1
@@ -1518,6 +1519,7 @@
 // CHECK_TRM_M64: #define __SSE4_2__ 1
 // CHECK_TRM_M64: #define __SSE__ 1
 // CHECK_TRM_M64: #define __SSSE3__ 1
+// CHECK_TRM_M64: #define __WAITPKG__ 1
 // CHECK_TRM_M64: #define __XSAVEC__ 1
 // CHECK_TRM_M64: #define __XSAVEOPT__ 1
 // CHECK_TRM_M64: #define __XSAVES__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -144,3 +144,8 @@
 // RUN: %clang -target i386-linux-gnu -mretpoline -mno-retpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-RETPOLINE-EXTERNAL-THUNK %s
 // RETPOLINE-EXTERNAL-THUNK: "-target-feature" "+retpoline-external-thunk"
 // NO-RETPOLINE-EXTERNAL-THUNK: "-target-feature" "-retpoline-external-thunk"
+
+// RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WAITPKG %s
+// RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
+// WAITPKG: "-target-feature" "+waitpkg"
+// NO-WAITPKG: "-target-feature" "-waitpkg"
Index: test/CodeGen/waitpkg.c
===
--- /dev/null
+++ test/CodeGen/waitpkg.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -emit-llvm -target-feature +waitpkg -Wall -pedantic -o - | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple i386-unknown-unknown -emit-llvm -target-feature +waitpkg -Wall -pedantic -o - | FileCheck %s
+
+#include 
+
+#include 
+#include 
+
+void test_umonitor(void *address) {
+  //CHECK-LABEL: @test_umonitor
+  //CHECK: call void @llvm.x86.umonitor(i8* %{{.*}})
+  return _umonitor(address);
+}
+
+uint8_t test_umwait(uint32_t control, uint64_t counter) {
+  //CHECK-LABEL: @test_umwait
+  //CHECK: call i8 @llvm.x86.umwait(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _umwait(control, counter);
+}
+
+uint8_t test_tpause(uint32_t control, uint64_t counter) {
+  //CHECK-LABEL: @test_tpause
+  //CHECK: call i8 @llvm.x86.tpause(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _tpause(control, counter);
+}
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -96,4 +96,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__WAITPKG__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/waitpkgintrin.h
===
--- /dev/null
+++ lib/Headers/waitpkgintrin.h
@@ -0,0 +1,56 @@
+/*===--- waitpkgintrin.h - WAITPKG ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+#ifndef __X86INTRIN_H
+#error

[PATCH] D45254: [X86][WAITPKG] WaitPKG intrinsics

2018-04-18 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 142913.

https://reviews.llvm.org/D45254

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/waitpkgintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/waitpkg.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1482,6 +1482,7 @@
 // CHECK_TRM_M32: #define __SSE_MATH__ 1
 // CHECK_TRM_M32: #define __SSE__ 1
 // CHECK_TRM_M32: #define __SSSE3__ 1
+// CHECK_TRM_M32: #define __WAITPKG__ 1
 // CHECK_TRM_M32: #define __XSAVEC__ 1
 // CHECK_TRM_M32: #define __XSAVEOPT__ 1
 // CHECK_TRM_M32: #define __XSAVES__ 1
@@ -1518,6 +1519,7 @@
 // CHECK_TRM_M64: #define __SSE4_2__ 1
 // CHECK_TRM_M64: #define __SSE__ 1
 // CHECK_TRM_M64: #define __SSSE3__ 1
+// CHECK_TRM_M64: #define __WAITPKG__ 1
 // CHECK_TRM_M64: #define __XSAVEC__ 1
 // CHECK_TRM_M64: #define __XSAVEOPT__ 1
 // CHECK_TRM_M64: #define __XSAVES__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -144,3 +144,8 @@
 // RUN: %clang -target i386-linux-gnu -mretpoline -mno-retpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-RETPOLINE-EXTERNAL-THUNK %s
 // RETPOLINE-EXTERNAL-THUNK: "-target-feature" "+retpoline-external-thunk"
 // NO-RETPOLINE-EXTERNAL-THUNK: "-target-feature" "-retpoline-external-thunk"
+
+// RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WAITPKG %s
+// RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
+// WAITPKG: "-target-feature" "+waitpkg"
+// NO-WAITPKG: "-target-feature" "-waitpkg"
Index: test/CodeGen/waitpkg.c
===
--- /dev/null
+++ test/CodeGen/waitpkg.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -emit-llvm -target-feature +waitpkg -o - | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple i386-unknown-unknown -emit-llvm -target-feature +waitpkg -o - | FileCheck %s
+
+#include 
+
+#include 
+#include 
+
+void test_umonitor(void *address) {
+  //CHECK-LABEL: @test_umonitor
+  //CHECK: call void @llvm.x86.umonitor(i8* %{{.*}})
+  return _umonitor(address);
+}
+
+uint8_t test_umwait(uint32_t control, uint64_t counter) {
+  //CHECK-LABEL: @test_umwait
+  //CHECK: call i8 @llvm.x86.umwait(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _umwait(control, counter);
+}
+
+uint8_t test_tpause(uint32_t control, uint64_t counter) {
+  //CHECK-LABEL: @test_tpause
+  //CHECK: call i8 @llvm.x86.tpause(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _tpause(control, counter);
+}
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -96,4 +96,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__WAITPKG__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/waitpkgintrin.h
===
--- /dev/null
+++ lib/Headers/waitpkgintrin.h
@@ -0,0 +1,56 @@
+/*===--- waitpkgintrin.h - WAITPKG ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef _WAITPKGINTRIN_H
+#define _WAITPKGINTRIN_H
+
+/* Define the default attribu

[PATCH] D45254: [X86][WAITPKG] WaitPKG intrinsics

2018-04-18 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 142910.
GBuella added a comment.

Modified the intrinsic.


https://reviews.llvm.org/D45254

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/waitpkgintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/waitpkg.c
  test/Driver/x86-target-features.c

Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -144,3 +144,8 @@
 // RUN: %clang -target i386-linux-gnu -mretpoline -mno-retpoline-external-thunk %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-RETPOLINE-EXTERNAL-THUNK %s
 // RETPOLINE-EXTERNAL-THUNK: "-target-feature" "+retpoline-external-thunk"
 // NO-RETPOLINE-EXTERNAL-THUNK: "-target-feature" "-retpoline-external-thunk"
+
+// RUN: %clang -target i386-linux-gnu -mwaitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WAITPKG %s
+// RUN: %clang -target i386-linux-gnu -mno-waitpkg %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WAITPKG %s
+// WAITPKG: "-target-feature" "+waitpkg"
+// NO-WAITPKG: "-target-feature" "-waitpkg"
Index: test/CodeGen/waitpkg.c
===
--- /dev/null
+++ test/CodeGen/waitpkg.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple x86_64-unknown-unknown -emit-llvm -target-feature +waitpkg -o - | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple i386-unknown-unknown -emit-llvm -target-feature +waitpkg -o - | FileCheck %s
+
+#include 
+
+#include 
+#include 
+
+void test_umonitor(void *address) {
+  //CHECK-LABEL: @test_umonitor
+  //CHECK: call void @llvm.x86.umonitor(i8* %{{.*}})
+  return _umonitor(address);
+}
+
+uint8_t test_umwait(uint32_t control, uint64_t counter) {
+  //CHECK-LABEL: @test_umwait
+  //CHECK: call i8 @llvm.x86.umwait(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _umwait(control, counter);
+}
+
+uint8_t test_tpause(uint32_t control, uint64_t counter) {
+  //CHECK-LABEL: @test_tpause
+  //CHECK: call i8 @llvm.x86.tpause(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}})
+  return _tpause(control, counter);
+}
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -96,4 +96,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__WAITPKG__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/waitpkgintrin.h
===
--- /dev/null
+++ lib/Headers/waitpkgintrin.h
@@ -0,0 +1,56 @@
+/*===--- waitpkgintrin.h - WAITPKG ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef _WAITPKGINTRIN_H
+#define _WAITPKGINTRIN_H
+
+/* Define the default attributes for the functions in this file. */
+#define __DEFAULT_FN_ATTRS \
+  __attribute__((__always_inline__, __nodebug__,  __target__("waitpkg")))
+
+static __inline__ void __DEFAULT_FN_ATTRS
+_umonitor (void * __ADDRESS)
+{
+  __builtin_ia32_umonitor (__ADDRESS);
+}
+
+static __inline__ __UINT8_TYPE__ __DEFAULT_FN_ATTRS
+_umwait (__UINT32_TYPE__ __CONTROL, __UINT64_TYPE__ __COUNTER)
+{
+  return __builtin_ia32_umwait (__CONTROL,
+(unsigned int)(__COUNTER >> 32), (unsigned int)__COUNTER);
+}
+
+static __inline__ __UINT8_TYPE__ __DEFAULT_FN_ATTRS
+_tpause (__UINT32_TYPE__ __CONTROL, __UINT64_TYPE__ __COUNTER)
+{
+  return __builtin_ia32_tpause (__CONTROL,
+(unsigned int)(__COUNTER >> 32), (unsigned int)__COUNTER);
+}
+
+#undef __DEFAULT_FN_ATTRS
+
+#endif /* _WAITPKGINTRIN_H */
Index: lib/Header

[PATCH] D45613: [X86] Introduce archs: goldmont-plus & tremont

2018-04-16 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL330110: [X86] Introduce archs: goldmont-plus & tremont 
(authored by GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Repository:
  rL LLVM

https://reviews.llvm.org/D45613

Files:
  cfe/trunk/include/clang/Basic/X86Target.def
  cfe/trunk/lib/Basic/Targets/X86.cpp
  cfe/trunk/test/Driver/x86-march.c
  cfe/trunk/test/Misc/target-invalid-cpu-note.c
  cfe/trunk/test/Preprocessor/predefined-arch-macros.c

Index: cfe/trunk/lib/Basic/Targets/X86.cpp
===
--- cfe/trunk/lib/Basic/Targets/X86.cpp
+++ cfe/trunk/lib/Basic/Targets/X86.cpp
@@ -244,6 +244,14 @@
 setFeatureEnabledImpl(Features, "fxsr", true);
 break;
 
+  case CK_Tremont:
+setFeatureEnabledImpl(Features, "cldemote", true);
+setFeatureEnabledImpl(Features, "gfni", true);
+LLVM_FALLTHROUGH;
+  case CK_GoldmontPlus:
+setFeatureEnabledImpl(Features, "rdpid", true);
+setFeatureEnabledImpl(Features, "sgx", true);
+LLVM_FALLTHROUGH;
   case CK_Goldmont:
 setFeatureEnabledImpl(Features, "sha", true);
 setFeatureEnabledImpl(Features, "rdseed", true);
@@ -930,6 +938,12 @@
   case CK_Goldmont:
 defineCPUMacros(Builder, "goldmont");
 break;
+  case CK_GoldmontPlus:
+defineCPUMacros(Builder, "goldmont_plus");
+break;
+  case CK_Tremont:
+defineCPUMacros(Builder, "tremont");
+break;
   case CK_Nehalem:
   case CK_Westmere:
   case CK_SandyBridge:
Index: cfe/trunk/include/clang/Basic/X86Target.def
===
--- cfe/trunk/include/clang/Basic/X86Target.def
+++ cfe/trunk/include/clang/Basic/X86Target.def
@@ -104,6 +104,9 @@
 PROC_ALIAS(Silvermont, "slm")
 
 PROC(Goldmont, "goldmont", PROC_64_BIT)
+PROC(GoldmontPlus, "goldmont-plus", PROC_64_BIT)
+
+PROC(Tremont, "tremont", PROC_64_BIT)
 //@}
 
 /// \name Nehalem
Index: cfe/trunk/test/Misc/target-invalid-cpu-note.c
===
--- cfe/trunk/test/Misc/target-invalid-cpu-note.c
+++ cfe/trunk/test/Misc/target-invalid-cpu-note.c
@@ -13,7 +13,7 @@
 // X86: note: valid target CPU values are: i386, i486, winchip-c6, winchip2, c3,
 // X86-SAME: i586, pentium, pentium-mmx, pentiumpro, i686, pentium2, pentium3,
 // X86-SAME: pentium3m, pentium-m, c3-2, yonah, pentium4, pentium4m, prescott,
-// X86-SAME: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont,
+// X86-SAME: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont,
 // X86-SAME: nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge,
 // X86-SAME: core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512,
 // X86-SAME: skx, cannonlake, icelake-client, icelake-server, knl, knm, lakemont, k6, k6-2, k6-3,
@@ -25,7 +25,7 @@
 // RUN: not %clang_cc1 -triple x86_64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix X86_64
 // X86_64: error: unknown target CPU 'not-a-cpu'
 // X86_64: note: valid target CPU values are: nocona, core2, penryn, bonnell,
-// X86_64-SAME: atom, silvermont, slm, goldmont, nehalem, corei7, westmere,
+// X86_64-SAME: atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere,
 // X86_64-SAME: sandybridge, corei7-avx, ivybridge, core-avx-i, haswell,
 // X86_64-SAME: core-avx2, broadwell, skylake, skylake-avx512, skx, cannonlake,
 // X86_64-SAME: icelake-client, icelake-server, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3,
Index: cfe/trunk/test/Preprocessor/predefined-arch-macros.c
===
--- cfe/trunk/test/Preprocessor/predefined-arch-macros.c
+++ cfe/trunk/test/Preprocessor/predefined-arch-macros.c
@@ -1371,6 +1371,7 @@
 // CHECK_GLM_M64: #define __PRFCHW__ 1
 // CHECK_GLM_M64: #define __RDRND__ 1
 // CHECK_GLM_M64: #define __RDSEED__ 1
+// CHECK_GLM_M64: #define __SHA__ 1
 // CHECK_GLM_M64: #define __SSE2__ 1
 // CHECK_GLM_M64: #define __SSE3__ 1
 // CHECK_GLM_M64: #define __SSE4_1__ 1
@@ -1387,6 +1388,146 @@
 // CHECK_GLM_M64: #define __x86_64 1
 // CHECK_GLM_M64: #define __x86_64__ 1
 
+// RUN: %clang -march=goldmont-plus -m32 -E -dM %s -o - 2>&1 \
+// RUN: -target i386-unknown-linux \
+// RUN:   | FileCheck %s -check-prefix=CHECK_GLMP_M32
+// CHECK_GLMP_M32: #define __AES__ 1
+// CHECK_GLMP_M32: #define __CLFLUSHOPT__ 1
+// CHECK_GLMP_M32: #define __FSGSBASE__ 1
+// CHECK_GLMP_M32: #define __FXSR__ 1
+// CHECK_GLMP_M32: #define __MMX__ 1
+// CHECK_GLMP_M32: #define __MPX__ 1
+// CHECK_GLMP_M32: #define __PCLMUL__ 1
+// CHECK_GLMP_M32: #define __POPCNT__ 1
+// CHECK_GLMP_M32: #define __PRFCHW__ 1
+// CHECK_GLMP_M32: #define __RDPID__ 1
+// CHECK_GLMP_M32: #define __RDRND__ 1
+// CHECK_GLMP_M32: #define __RDSEED__ 1
+// CHECK_GLMP_M32: #define __SGX__ 1
+// CHECK_GLMP_M32: #define __SHA__ 1
+// CHECK_GLMP_M32: #define __SSE2__ 1
+

[PATCH] D45613: [X86] Introduce archs: goldmont-plus & tremont

2018-04-13 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

I was not sure about what predefined macros to add for goldmont-plus & tremont.
I found this conversation: https://reviews.llvm.org/D38824 which seems to 
suggest no new arch specifix macros.
But apparently atom archs have their own macros.


Repository:
  rC Clang

https://reviews.llvm.org/D45613



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-04-13 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added a reviewer: craig.topper.
Herald added a subscriber: cfe-commits.

The fcmp opcode has no defined behavior with NaN operands
in the comparisions handled in this patch. Thus, these
intrinsics can only be safe lowered to fcmp opcodes
when fast-math is enabled.

This also partially reverses the commit
259aee973f1df2cba62cc3d32847dced739e9202 :
"[X86] Use native IR for immediate values 0-7 of packed fp cmp builtins."

That commit lowered 4 SSE cmp builtins to fcmp opcodes, but with
this patch, that lowering is also dependent on the -ffast-math flag.

Affected AVX512 builtins (were not lowered to fcmp before):

__builtin_ia32_cmpps128_mask
__builtin_ia32_cmpps256_mask
__builtin_ia32_cmpps512_mask
__builtin_ia32_cmppd128_mask
__builtin_ia32_cmppd256_mask
__builtin_ia32_cmppd512_mask

Affected SSE builtins (were unconditionally lowered before):

__builtin_ia32_cmpps
__builtin_ia32_cmpps256
__builtin_ia32_cmppd
__builtin_ia32_cmppd256

At the same time, recognize predicates that result in
constants with all of the above intrinsics, that was only
handled in the case of the __builtin_ia32_cmpp[s|d]256 intrinsics.


Repository:
  rC Clang

https://reviews.llvm.org/D45616

Files:
  lib/CodeGen/CGBuiltin.cpp
  test/CodeGen/avx-builtins.c
  test/CodeGen/avx2-builtins-fast-math.c
  test/CodeGen/avx2-builtins.c
  test/CodeGen/avx512f-builtins-fast-math.c
  test/CodeGen/avx512f-builtins.c
  test/CodeGen/avx512vl-builtins-fast-math.c
  test/CodeGen/avx512vl-builtins.c

Index: test/CodeGen/avx512vl-builtins.c
===
--- test/CodeGen/avx512vl-builtins.c
+++ test/CodeGen/avx512vl-builtins.c
@@ -1077,6 +1077,34 @@
   return (__mmask8)_mm256_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_ps_mask_true_uq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: store i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_true_us(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: store i8 -1
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_oq(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: store i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_ps_mask_false_os(__m256 __A, __m256 __B) {
+  // CHECK-LABEL: @test_mm256_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: store i8 0
+  return (__mmask8)_mm256_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm256_mask_cmp_ps_mask(__mmask8 m, __m256 __A, __m256 __B) {
   // CHECK-LABEL: @test_mm256_mask_cmp_ps_mask
   // CHECK: [[CMP:%.*]] = call <8 x i1> @llvm.x86.avx512.mask.cmp.ps.256
@@ -1090,6 +1118,34 @@
   return (__mmask8)_mm_cmp_ps_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm_cmp_ps_mask_true_uq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: store i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_true_us(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: store i8 -1
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_oq(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: store i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm_cmp_ps_mask_false_os(__m128 __A, __m128 __B) {
+  // CHECK-LABEL: @test_mm_cmp_ps_mask_false_os
+  // CHECK-NOT: call
+  // CHECK: store i8 0
+  return (__mmask8)_mm_cmp_ps_mask(__A, __B, _CMP_FALSE_OS);
+}
+
 __mmask8 test_mm_mask_cmp_ps_mask(__mmask8 m, __m128 __A, __m128 __B) {
   // CHECK-LABEL: @test_mm_mask_cmp_ps_mask
   // CHECK: [[CMP:%.*]] = call <4 x i1> @llvm.x86.avx512.mask.cmp.ps.128
@@ -1103,6 +1159,34 @@
   return (__mmask8)_mm256_cmp_pd_mask(__A, __B, 0);
 }
 
+__mmask8 test_mm256_cmp_pd_mask_true_uq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_uq
+  // CHECK-NOT: call
+  // CHECK: store i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_UQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_true_us(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_true_us
+  // CHECK-NOT: call
+  // CHECK: store i8 -1
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_TRUE_US);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_oq(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask_false_oq
+  // CHECK-NOT: call
+  // CHECK: store i8 0
+  return (__mmask8)_mm256_cmp_pd_mask(__A, __B, _CMP_FALSE_OQ);
+}
+
+__mmask8 test_mm256_cmp_pd_mask_false_os(__m256d __A, __m256d __B) {
+  // CHECK-LABEL: @test_mm256_cmp_pd_mask

[PATCH] D45613: [X86] Introduce archs: goldmont-plus & tremont

2018-04-13 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added a reviewer: craig.topper.
Herald added a subscriber: cfe-commits.

Repository:
  rC Clang

https://reviews.llvm.org/D45613

Files:
  include/clang/Basic/X86Target.def
  lib/Basic/Targets/X86.cpp
  test/Driver/x86-march.c
  test/Misc/target-invalid-cpu-note.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1371,6 +1371,7 @@
 // CHECK_GLM_M64: #define __PRFCHW__ 1
 // CHECK_GLM_M64: #define __RDRND__ 1
 // CHECK_GLM_M64: #define __RDSEED__ 1
+// CHECK_GLM_M64: #define __SHA__ 1
 // CHECK_GLM_M64: #define __SSE2__ 1
 // CHECK_GLM_M64: #define __SSE3__ 1
 // CHECK_GLM_M64: #define __SSE4_1__ 1
@@ -1387,6 +1388,146 @@
 // CHECK_GLM_M64: #define __x86_64 1
 // CHECK_GLM_M64: #define __x86_64__ 1
 
+// RUN: %clang -march=goldmont-plus -m32 -E -dM %s -o - 2>&1 \
+// RUN: -target i386-unknown-linux \
+// RUN:   | FileCheck %s -check-prefix=CHECK_GLMP_M32
+// CHECK_GLMP_M32: #define __AES__ 1
+// CHECK_GLMP_M32: #define __CLFLUSHOPT__ 1
+// CHECK_GLMP_M32: #define __FSGSBASE__ 1
+// CHECK_GLMP_M32: #define __FXSR__ 1
+// CHECK_GLMP_M32: #define __MMX__ 1
+// CHECK_GLMP_M32: #define __MPX__ 1
+// CHECK_GLMP_M32: #define __PCLMUL__ 1
+// CHECK_GLMP_M32: #define __POPCNT__ 1
+// CHECK_GLMP_M32: #define __PRFCHW__ 1
+// CHECK_GLMP_M32: #define __RDPID__ 1
+// CHECK_GLMP_M32: #define __RDRND__ 1
+// CHECK_GLMP_M32: #define __RDSEED__ 1
+// CHECK_GLMP_M32: #define __SGX__ 1
+// CHECK_GLMP_M32: #define __SHA__ 1
+// CHECK_GLMP_M32: #define __SSE2__ 1
+// CHECK_GLMP_M32: #define __SSE3__ 1
+// CHECK_GLMP_M32: #define __SSE4_1__ 1
+// CHECK_GLMP_M32: #define __SSE4_2__ 1
+// CHECK_GLMP_M32: #define __SSE_MATH__ 1
+// CHECK_GLMP_M32: #define __SSE__ 1
+// CHECK_GLMP_M32: #define __SSSE3__ 1
+// CHECK_GLMP_M32: #define __XSAVEC__ 1
+// CHECK_GLMP_M32: #define __XSAVEOPT__ 1
+// CHECK_GLMP_M32: #define __XSAVES__ 1
+// CHECK_GLMP_M32: #define __XSAVE__ 1
+// CHECK_GLMP_M32: #define __goldmont_plus 1
+// CHECK_GLMP_M32: #define __goldmont_plus__ 1
+// CHECK_GLMP_M32: #define __i386 1
+// CHECK_GLMP_M32: #define __i386__ 1
+// CHECK_GLMP_M32: #define __tune_goldmont_plus__ 1
+// CHECK_GLMP_M32: #define i386 1
+
+// RUN: %clang -march=goldmont-plus -m64 -E -dM %s -o - 2>&1 \
+// RUN: -target i386-unknown-linux \
+// RUN:   | FileCheck %s -check-prefix=CHECK_GLMP_M64
+// CHECK_GLMP_M64: #define __AES__ 1
+// CHECK_GLMP_M64: #define __CLFLUSHOPT__ 1
+// CHECK_GLMP_M64: #define __FSGSBASE__ 1
+// CHECK_GLMP_M64: #define __FXSR__ 1
+// CHECK_GLMP_M64: #define __MMX__ 1
+// CHECK_GLMP_M64: #define __MPX__ 1
+// CHECK_GLMP_M64: #define __PCLMUL__ 1
+// CHECK_GLMP_M64: #define __POPCNT__ 1
+// CHECK_GLMP_M64: #define __PRFCHW__ 1
+// CHECK_GLMP_M64: #define __RDPID__ 1
+// CHECK_GLMP_M64: #define __RDRND__ 1
+// CHECK_GLMP_M64: #define __RDSEED__ 1
+// CHECK_GLMP_M64: #define __SGX__ 1
+// CHECK_GLMP_M64: #define __SHA__ 1
+// CHECK_GLMP_M64: #define __SSE2__ 1
+// CHECK_GLMP_M64: #define __SSE3__ 1
+// CHECK_GLMP_M64: #define __SSE4_1__ 1
+// CHECK_GLMP_M64: #define __SSE4_2__ 1
+// CHECK_GLMP_M64: #define __SSE__ 1
+// CHECK_GLMP_M64: #define __SSSE3__ 1
+// CHECK_GLMP_M64: #define __XSAVEC__ 1
+// CHECK_GLMP_M64: #define __XSAVEOPT__ 1
+// CHECK_GLMP_M64: #define __XSAVES__ 1
+// CHECK_GLMP_M64: #define __XSAVE__ 1
+// CHECK_GLMP_M64: #define __goldmont_plus 1
+// CHECK_GLMP_M64: #define __goldmont_plus__ 1
+// CHECK_GLMP_M64: #define __tune_goldmont_plus__ 1
+// CHECK_GLMP_M64: #define __x86_64 1
+// CHECK_GLMP_M64: #define __x86_64__ 1
+
+// RUN: %clang -march=tremont -m32 -E -dM %s -o - 2>&1 \
+// RUN: -target i386-unknown-linux \
+// RUN:   | FileCheck %s -check-prefix=CHECK_TRM_M32
+// CHECK_TRM_M32: #define __AES__ 1
+// CHECK_TRM_M32: #define __CLDEMOTE__ 1
+// CHECK_TRM_M32: #define __CLFLUSHOPT__ 1
+// CHECK_TRM_M32: #define __FSGSBASE__ 1
+// CHECK_TRM_M32: #define __FXSR__ 1
+// CHECK_TRM_M32: #define __GFNI__ 1
+// CHECK_TRM_M32: #define __MMX__ 1
+// CHECK_TRM_M32: #define __MPX__ 1
+// CHECK_TRM_M32: #define __PCLMUL__ 1
+// CHECK_TRM_M32: #define __POPCNT__ 1
+// CHECK_TRM_M32: #define __PRFCHW__ 1
+// CHECK_TRM_M32: #define __RDPID__ 1
+// CHECK_TRM_M32: #define __RDRND__ 1
+// CHECK_TRM_M32: #define __RDSEED__ 1
+// CHECK_TRM_M32: #define __SGX__ 1
+// CHECK_TRM_M32: #define __SHA__ 1
+// CHECK_TRM_M32: #define __SSE2__ 1
+// CHECK_TRM_M32: #define __SSE3__ 1
+// CHECK_TRM_M32: #define __SSE4_1__ 1
+// CHECK_TRM_M32: #define __SSE4_2__ 1
+// CHECK_TRM_M32: #define __SSE_MATH__ 1
+// CHECK_TRM_M32: #define __SSE__ 1
+// CHECK_TRM_M32: #define __SSSE3__ 1
+// CHECK_TRM_M32: #define __XSAVEC__ 1
+// CHECK_TRM_M32: #define __XSAVEOPT__ 1
+// CHECK_TRM_M32: #define __XSAVES__ 1
+// CHECK_TRM_M32: #define __XSAVE__ 1
+// CHECK_TRM_M32: #define __i386 1
+// C

[PATCH] D45257: [X86] Introduce cldemote intrinsic

2018-04-13 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL329993: [X86] Introduce cldemote intrinsic (authored by 
GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D45257?vs=142340&id=142345#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D45257

Files:
  cfe/trunk/include/clang/Basic/BuiltinsX86.def
  cfe/trunk/include/clang/Driver/Options.td
  cfe/trunk/lib/Basic/Targets/X86.cpp
  cfe/trunk/lib/Basic/Targets/X86.h
  cfe/trunk/lib/Headers/CMakeLists.txt
  cfe/trunk/lib/Headers/cldemoteintrin.h
  cfe/trunk/lib/Headers/cpuid.h
  cfe/trunk/lib/Headers/x86intrin.h
  cfe/trunk/test/CodeGen/builtins-x86.c
  cfe/trunk/test/CodeGen/cldemote.c

Index: cfe/trunk/lib/Basic/Targets/X86.h
===
--- cfe/trunk/lib/Basic/Targets/X86.h
+++ cfe/trunk/lib/Basic/Targets/X86.h
@@ -91,6 +91,7 @@
   bool HasXSAVES = false;
   bool HasMWAITX = false;
   bool HasCLZERO = false;
+  bool HasCLDEMOTE = false;
   bool HasPKU = false;
   bool HasCLFLUSHOPT = false;
   bool HasCLWB = false;
Index: cfe/trunk/lib/Basic/Targets/X86.cpp
===
--- cfe/trunk/lib/Basic/Targets/X86.cpp
+++ cfe/trunk/lib/Basic/Targets/X86.cpp
@@ -800,6 +800,8 @@
   HasPREFETCHWT1 = true;
 } else if (Feature == "+clzero") {
   HasCLZERO = true;
+} else if (Feature == "+cldemote") {
+  HasCLDEMOTE = true;
 } else if (Feature == "+rdpid") {
   HasRDPID = true;
 } else if (Feature == "+retpoline") {
@@ -1154,6 +1156,8 @@
 Builder.defineMacro("__CLZERO__");
   if (HasRDPID)
 Builder.defineMacro("__RDPID__");
+  if (HasCLDEMOTE)
+Builder.defineMacro("__CLDEMOTE__");
 
   // Each case falls through to the previous one here.
   switch (SSELevel) {
@@ -1263,6 +1267,7 @@
   .Case("avx512ifma", true)
   .Case("bmi", true)
   .Case("bmi2", true)
+  .Case("cldemote", true)
   .Case("clflushopt", true)
   .Case("clwb", true)
   .Case("clzero", true)
@@ -1334,6 +1339,7 @@
   .Case("avx512ifma", HasAVX512IFMA)
   .Case("bmi", HasBMI)
   .Case("bmi2", HasBMI2)
+  .Case("cldemote", HasCLDEMOTE)
   .Case("clflushopt", HasCLFLUSHOPT)
   .Case("clwb", HasCLWB)
   .Case("clzero", HasCLZERO)
Index: cfe/trunk/lib/Headers/cldemoteintrin.h
===
--- cfe/trunk/lib/Headers/cldemoteintrin.h
+++ cfe/trunk/lib/Headers/cldemoteintrin.h
@@ -0,0 +1,42 @@
+/*=== cldemoteintrin.h - CLDEMOTE intrinsic ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __X86INTRIN_H
+#error "Never use  directly; include  instead."
+#endif
+
+#ifndef __CLDEMOTEINTRIN_H
+#define __CLDEMOTEINTRIN_H
+
+/* Define the default attributes for the functions in this file. */
+#define __DEFAULT_FN_ATTRS \
+  __attribute__((__always_inline__, __nodebug__,  __target__("cldemote")))
+
+static __inline__ void __DEFAULT_FN_ATTRS
+_cldemote(const void * __P) {
+  __builtin_ia32_cldemote(__P);
+}
+
+#undef __DEFAULT_FN_ATTRS
+
+#endif
Index: cfe/trunk/lib/Headers/CMakeLists.txt
===
--- cfe/trunk/lib/Headers/CMakeLists.txt
+++ cfe/trunk/lib/Headers/CMakeLists.txt
@@ -40,6 +40,7 @@
   __clang_cuda_math_forward_declares.h
   __clang_cuda_runtime_wrapper.h
   cetintrin.h
+  cldemoteintrin.h
   clzerointrin.h
   cpuid.h
   clflushoptintrin.h
Index: cfe/trunk/lib/Headers/x86intrin.h
===
--- cfe/trunk/lib/Headers/x86intrin.h
+++ cfe/trunk/lib/Headers/x86intrin.h
@@ -92,4 +92,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__CLDEMOT

[PATCH] D45257: [X86] Introduce cldemote intrinsic

2018-04-13 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 142340.
GBuella added a comment.

Rebase.


https://reviews.llvm.org/D45257

Files:
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cldemoteintrin.h
  lib/Headers/cpuid.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtins-x86.c
  test/CodeGen/cldemote.c

Index: test/CodeGen/cldemote.c
===
--- /dev/null
+++ test/CodeGen/cldemote.c
@@ -0,0 +1,10 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -target-feature +cldemote -emit-llvm -o - -Wall -Werror | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown -target-feature +cldemote -emit-llvm -o - -Wall -Werror | FileCheck %s
+
+#include 
+
+void test_cldemote(const void *p) {
+  //CHECK-LABEL: @test_cldemote
+  //CHECK: call void @llvm.x86.cldemote(i8* %{{.*}})
+  _cldemote(p);
+}
Index: test/CodeGen/builtins-x86.c
===
--- test/CodeGen/builtins-x86.c
+++ test/CodeGen/builtins-x86.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -target-feature +wbnoinvd -emit-llvm -o %t %s
-// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -target-feature +wbnoinvd -fsyntax-only -o %t %s
+// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -target-feature +wbnoinvd -target-feature +cldemote -emit-llvm -o %t %s
+// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -target-feature +wbnoinvd -target-feature +cldemote -fsyntax-only -o %t %s
 
 #ifdef USE_ALL
 #define USE_3DNOW
@@ -295,6 +295,7 @@
   (void) __builtin_ia32_monitorx(tmp_vp, tmp_Ui, tmp_Ui);
   (void) __builtin_ia32_mwaitx(tmp_Ui, tmp_Ui, tmp_Ui);
   (void) __builtin_ia32_clzero(tmp_vp);
+  (void) __builtin_ia32_cldemote(tmp_vp);
 
   tmp_V4f = __builtin_ia32_cvtpi2ps(tmp_V4f, tmp_V2i);
   tmp_V2i = __builtin_ia32_cvtps2pi(tmp_V4f);
Index: lib/Headers/x86intrin.h
===
--- lib/Headers/x86intrin.h
+++ lib/Headers/x86intrin.h
@@ -92,4 +92,8 @@
 #include 
 #endif
 
+#if !defined(_MSC_VER) || __has_feature(modules) || defined(__CLDEMOTE__)
+#include 
+#endif
+
 #endif /* __X86INTRIN_H */
Index: lib/Headers/cpuid.h
===
--- lib/Headers/cpuid.h
+++ lib/Headers/cpuid.h
@@ -186,6 +186,7 @@
 #define bit_AVX512BITALG 0x1000
 #define bit_AVX512VPOPCNTDQ  0x4000
 #define bit_RDPID0x0040
+#define bit_CLDEMOTE 0x0200
 
 /* Features in %edx for leaf 7 sub-leaf 0 */
 #define bit_AVX5124VNNIW  0x0004
Index: lib/Headers/cldemoteintrin.h
===
--- /dev/null
+++ lib/Headers/cldemoteintrin.h
@@ -0,0 +1,42 @@
+/*=== cldemoteintrin.h - CLDEMOTE intrinsic ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---

[PATCH] D45311: [X86] Introduce wbinvd intrinsic

2018-04-12 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL329937: [X86] Introduce wbinvd intrinsic (authored by 
GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D45311?vs=142142&id=142228#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D45311

Files:
  cfe/trunk/include/clang/Basic/BuiltinsX86.def
  cfe/trunk/lib/Headers/ia32intrin.h
  cfe/trunk/test/CodeGen/builtin-wbinvd.c


Index: cfe/trunk/lib/Headers/ia32intrin.h
===
--- cfe/trunk/lib/Headers/ia32intrin.h
+++ cfe/trunk/lib/Headers/ia32intrin.h
@@ -70,4 +70,9 @@
 
 #define _rdpmc(A) __rdpmc(A)
 
+static __inline__ void __attribute__((__always_inline__, __nodebug__))
+_wbinvd(void) {
+  return __builtin_ia32_wbinvd();
+}
+
 #endif /* __IA32INTRIN_H */
Index: cfe/trunk/include/clang/Basic/BuiltinsX86.def
===
--- cfe/trunk/include/clang/Basic/BuiltinsX86.def
+++ cfe/trunk/include/clang/Basic/BuiltinsX86.def
@@ -679,7 +679,8 @@
 //CLWB
 TARGET_BUILTIN(__builtin_ia32_clwb, "vvC*", "", "clwb")
 
-//WBNOINVD
+//WB[NO]INVD
+TARGET_BUILTIN(__builtin_ia32_wbinvd, "v", "", "")
 TARGET_BUILTIN(__builtin_ia32_wbnoinvd, "v", "", "wbnoinvd")
 
 // ADX
Index: cfe/trunk/test/CodeGen/builtin-wbinvd.c
===
--- cfe/trunk/test/CodeGen/builtin-wbinvd.c
+++ cfe/trunk/test/CodeGen/builtin-wbinvd.c
@@ -0,0 +1,10 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -emit-llvm 
-o - -Wall -Werror | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown -emit-llvm 
-o - -Wall -Werror | FileCheck %s
+
+#include 
+
+void test_wbinvd(void) {
+  //CHECK-LABEL: @test_wbinvd
+  //CHECK: call void @llvm.x86.wbinvd()
+  _wbinvd();
+}


Index: cfe/trunk/lib/Headers/ia32intrin.h
===
--- cfe/trunk/lib/Headers/ia32intrin.h
+++ cfe/trunk/lib/Headers/ia32intrin.h
@@ -70,4 +70,9 @@
 
 #define _rdpmc(A) __rdpmc(A)
 
+static __inline__ void __attribute__((__always_inline__, __nodebug__))
+_wbinvd(void) {
+  return __builtin_ia32_wbinvd();
+}
+
 #endif /* __IA32INTRIN_H */
Index: cfe/trunk/include/clang/Basic/BuiltinsX86.def
===
--- cfe/trunk/include/clang/Basic/BuiltinsX86.def
+++ cfe/trunk/include/clang/Basic/BuiltinsX86.def
@@ -679,7 +679,8 @@
 //CLWB
 TARGET_BUILTIN(__builtin_ia32_clwb, "vvC*", "", "clwb")
 
-//WBNOINVD
+//WB[NO]INVD
+TARGET_BUILTIN(__builtin_ia32_wbinvd, "v", "", "")
 TARGET_BUILTIN(__builtin_ia32_wbnoinvd, "v", "", "wbnoinvd")
 
 // ADX
Index: cfe/trunk/test/CodeGen/builtin-wbinvd.c
===
--- cfe/trunk/test/CodeGen/builtin-wbinvd.c
+++ cfe/trunk/test/CodeGen/builtin-wbinvd.c
@@ -0,0 +1,10 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -emit-llvm -o - -Wall -Werror | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown -emit-llvm -o - -Wall -Werror | FileCheck %s
+
+#include 
+
+void test_wbinvd(void) {
+  //CHECK-LABEL: @test_wbinvd
+  //CHECK: call void @llvm.x86.wbinvd()
+  _wbinvd();
+}
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45561: NFC - Indentation fixes in predefined-arch-macros.c

2018-04-12 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rC329932: NFC - Indentation fixes in predefined-arch-macros.c 
(authored by GBuella, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D45561?vs=142138&id=142223#toc

Repository:
  rC Clang

https://reviews.llvm.org/D45561

Files:
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1,5 +1,5 @@
 // Begin X86/GCC/Linux tests 
-//
+
 // RUN: %clang -march=i386 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I386_M32
@@ -11,7 +11,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I386_M64
 // CHECK_I386_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=i486 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I486_M32
@@ -25,7 +25,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I486_M64
 // CHECK_I486_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=i586 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I586_M32
@@ -42,7 +42,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I586_M64
 // CHECK_I586_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentium -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM_M32
@@ -59,7 +59,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM_M64
 // CHECK_PENTIUM_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentium-mmx -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM_MMX_M32
@@ -79,7 +79,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM_MMX_M64
 // CHECK_PENTIUM_MMX_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=winchip-c6 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_WINCHIP_C6_M32
@@ -94,7 +94,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_WINCHIP_C6_M64
 // CHECK_WINCHIP_C6_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=winchip2 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_WINCHIP2_M32
@@ -110,7 +110,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_WINCHIP2_M64
 // CHECK_WINCHIP2_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=c3 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_C3_M32
@@ -126,7 +126,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_C3_M64
 // CHECK_C3_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=c3-2 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_C3_2_M32
@@ -146,7 +146,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_C3_2_M64
 // CHECK_C3_2_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=i686 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I686_M32
@@ -163,7 +163,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I686_M64
 // CHECK_I686_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentiumpro -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUMPRO_M32
@@ -180,7 +180,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUMPRO_M64
 // CHECK_PENTIUMPRO_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentium2 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM2_M32
@@ -199,7 +199,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM2_M64
 // CHECK_PENTIUM2_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentium3 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknow

[PATCH] D45311: [X86] Introduce wbinvd intrinsic

2018-04-12 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 142142.
GBuella retitled this revision from "Introduce wbinvd intrinsic" to "[X86] 
Introduce wbinvd intrinsic".

https://reviews.llvm.org/D45311

Files:
  include/clang/Basic/BuiltinsX86.def
  lib/Headers/ia32intrin.h
  test/CodeGen/builtin-wbinvd.c


Index: test/CodeGen/builtin-wbinvd.c
===
--- /dev/null
+++ test/CodeGen/builtin-wbinvd.c
@@ -0,0 +1,10 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -emit-llvm 
-o - -Wall -Werror | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown -emit-llvm 
-o - -Wall -Werror | FileCheck %s
+
+#include 
+
+void test_wbinvd(void) {
+  //CHECK-LABEL: @test_wbinvd
+  //CHECK: call void @llvm.x86.wbinvd()
+  _wbinvd();
+}
Index: lib/Headers/ia32intrin.h
===
--- lib/Headers/ia32intrin.h
+++ lib/Headers/ia32intrin.h
@@ -70,4 +70,9 @@
 
 #define _rdpmc(A) __rdpmc(A)
 
+static __inline__ void __attribute__((__always_inline__, __nodebug__))
+_wbinvd(void) {
+  return __builtin_ia32_wbinvd();
+}
+
 #endif /* __IA32INTRIN_H */
Index: include/clang/Basic/BuiltinsX86.def
===
--- include/clang/Basic/BuiltinsX86.def
+++ include/clang/Basic/BuiltinsX86.def
@@ -679,7 +679,8 @@
 //CLWB
 TARGET_BUILTIN(__builtin_ia32_clwb, "vvC*", "", "clwb")
 
-//WBNOINVD
+//WB[NO]INVD
+TARGET_BUILTIN(__builtin_ia32_wbinvd, "v", "", "")
 TARGET_BUILTIN(__builtin_ia32_wbnoinvd, "v", "", "wbnoinvd")
 
 // ADX


Index: test/CodeGen/builtin-wbinvd.c
===
--- /dev/null
+++ test/CodeGen/builtin-wbinvd.c
@@ -0,0 +1,10 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -emit-llvm -o - -Wall -Werror | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown -emit-llvm -o - -Wall -Werror | FileCheck %s
+
+#include 
+
+void test_wbinvd(void) {
+  //CHECK-LABEL: @test_wbinvd
+  //CHECK: call void @llvm.x86.wbinvd()
+  _wbinvd();
+}
Index: lib/Headers/ia32intrin.h
===
--- lib/Headers/ia32intrin.h
+++ lib/Headers/ia32intrin.h
@@ -70,4 +70,9 @@
 
 #define _rdpmc(A) __rdpmc(A)
 
+static __inline__ void __attribute__((__always_inline__, __nodebug__))
+_wbinvd(void) {
+  return __builtin_ia32_wbinvd();
+}
+
 #endif /* __IA32INTRIN_H */
Index: include/clang/Basic/BuiltinsX86.def
===
--- include/clang/Basic/BuiltinsX86.def
+++ include/clang/Basic/BuiltinsX86.def
@@ -679,7 +679,8 @@
 //CLWB
 TARGET_BUILTIN(__builtin_ia32_clwb, "vvC*", "", "clwb")
 
-//WBNOINVD
+//WB[NO]INVD
+TARGET_BUILTIN(__builtin_ia32_wbinvd, "v", "", "")
 TARGET_BUILTIN(__builtin_ia32_wbnoinvd, "v", "", "wbnoinvd")
 
 // ADX
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45561: NFC - Indentation fixes in predefined-arch-macros.c

2018-04-12 Thread Gabor Buella via Phabricator via cfe-commits

GBuella created this revision.
GBuella added a reviewer: craig.topper.
Herald added subscribers: cfe-commits, fedor.sergeev.

Consistently separating tests with empty lines.
Helps while navigating this file.


Repository:
  rC Clang

https://reviews.llvm.org/D45561

Files:
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1,5 +1,5 @@
 // Begin X86/GCC/Linux tests 
-//
+
 // RUN: %clang -march=i386 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I386_M32
@@ -11,7 +11,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I386_M64
 // CHECK_I386_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=i486 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I486_M32
@@ -25,7 +25,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I486_M64
 // CHECK_I486_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=i586 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I586_M32
@@ -42,7 +42,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I586_M64
 // CHECK_I586_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentium -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM_M32
@@ -59,7 +59,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM_M64
 // CHECK_PENTIUM_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentium-mmx -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM_MMX_M32
@@ -79,7 +79,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM_MMX_M64
 // CHECK_PENTIUM_MMX_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=winchip-c6 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_WINCHIP_C6_M32
@@ -94,7 +94,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_WINCHIP_C6_M64
 // CHECK_WINCHIP_C6_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=winchip2 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_WINCHIP2_M32
@@ -110,7 +110,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_WINCHIP2_M64
 // CHECK_WINCHIP2_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=c3 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_C3_M32
@@ -126,7 +126,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_C3_M64
 // CHECK_C3_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=c3-2 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_C3_2_M32
@@ -146,7 +146,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_C3_2_M64
 // CHECK_C3_2_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=i686 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I686_M32
@@ -163,7 +163,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_I686_M64
 // CHECK_I686_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentiumpro -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUMPRO_M32
@@ -180,7 +180,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUMPRO_M64
 // CHECK_PENTIUMPRO_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentium2 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM2_M32
@@ -199,7 +199,7 @@
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_PENTIUM2_M64
 // CHECK_PENTIUM2_M64: error: {{.*}}
-//
+
 // RUN: %clang -march=pentium3 -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix

[PATCH] D43817: [x86] wbnoinvd intrinsic

2018-04-11 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rC329848: [x86] wbnoinvd intrinsic (authored by GBuella, 
committed by ).

Repository:
  rC Clang

https://reviews.llvm.org/D43817

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/wbnoinvdintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtin-wbnoinvd.c
  test/CodeGen/builtins-x86.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: docs/ClangCommandLineReference.rst
===
--- docs/ClangCommandLineReference.rst
+++ docs/ClangCommandLineReference.rst
@@ -2484,6 +2484,8 @@
 
 .. option:: -mvpclmulqdq, -mno-vpclmulqdq
 
+.. option:: -mwbnoinvd, -mno-wbnoinvd
+
 .. option:: -mx87, -m80387, -mno-x87
 
 .. option:: -mxop, -mno-xop
Index: include/clang/Basic/BuiltinsX86.def
===
--- include/clang/Basic/BuiltinsX86.def
+++ include/clang/Basic/BuiltinsX86.def
@@ -679,6 +679,9 @@
 //CLWB
 TARGET_BUILTIN(__builtin_ia32_clwb, "vvC*", "", "clwb")
 
+//WBNOINVD
+TARGET_BUILTIN(__builtin_ia32_wbnoinvd, "v", "", "wbnoinvd")
+
 // ADX
 TARGET_BUILTIN(__builtin_ia32_addcarryx_u32, "UcUcUiUiUi*", "", "adx")
 TARGET_BUILTIN(__builtin_ia32_addcarry_u32, "UcUcUiUiUi*", "", "")
Index: include/clang/Driver/Options.td
===
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -2598,6 +2598,8 @@
 def mno_clflushopt : Flag<["-"], "mno-clflushopt">, Group;
 def mclwb : Flag<["-"], "mclwb">, Group;
 def mno_clwb : Flag<["-"], "mno-clwb">, Group;
+def mwbnoinvd : Flag<["-"], "mwbnoinvd">, Group;
+def mno_wbnoinvd : Flag<["-"], "mno-wbnoinvd">, Group;
 def mclzero : Flag<["-"], "mclzero">, Group;
 def mno_clzero : Flag<["-"], "mno-clzero">, Group;
 def mcx16 : Flag<["-"], "mcx16">, Group;
Index: test/CodeGen/builtin-wbnoinvd.c
===
--- test/CodeGen/builtin-wbnoinvd.c
+++ test/CodeGen/builtin-wbnoinvd.c
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -target-feature +wbnoinvd -emit-llvm -o - -Wall -Werror | FileCheck %s
+
+#include 
+
+void test_wbnoinvd(void) {
+  //CHECK-LABEL: @test_wbnoinvd
+  //CHECK: call void @llvm.x86.wbnoinvd()
+  _wbnoinvd();
+}
Index: test/CodeGen/builtins-x86.c
===
--- test/CodeGen/builtins-x86.c
+++ test/CodeGen/builtins-x86.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -emit-llvm -o %t %s
-// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -fsyntax-only -o %t %s
+// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -target-feature +wbnoinvd -emit-llvm -o %t %s
+// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -target-feature +wbnoinvd -fsyntax-only -o %t %s
 
 #ifdef USE_ALL
 #define USE_3DNOW
@@ -305,6 +305,7 @@
   tmp_i = __rdtsc();
   tmp_i = __builtin_ia32_rdtscp(&tmp_Ui);
   tmp_LLi = __builtin_ia32_rdpmc(tmp_i);
+  __builtin_ia32_wbnoinvd();
 #ifdef USE_64
   tmp_LLi = __builtin_ia32_cvtss2si64(tmp_V4f);
   tmp_LLi = __builtin_ia32_cvttss2si64(tmp_V4f);
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -60,6 +60,11 @@
 // CLWB: "-target-feature" "+clwb"
 // NO-CLWB: "-target-feature" "-clwb"
 
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mwbnoinvd %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WBNOINVD %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-wbnoinvd %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WBNOINVD %s
+// WBNOINVD: "-target-feature" "+wbnoinvd"
+// NO-WBNOINVD: "-target-feature" "-wbnoinvd"
+
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovbe %

[PATCH] D43817: [x86] wbnoinvd intrinsic

2018-04-11 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added inline comments.



Comment at: lib/Basic/Targets/X86.cpp:188
 setFeatureEnabledImpl(Features, "mpx", true);
 if (Kind != CK_SkylakeServer) // SKX inherits all SKL features, except SGX
   setFeatureEnabledImpl(Features, "sgx", true);

craig.topper wrote:
> Not related to this patch, but does IcelakeServer have SGX? The adding 
> icelake-server patch and removing sgx from skylakeserver intersected here.
As far as I know both icelake-client & icelake-server has it.
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
says:
"ENCLV - Ice Lake Server and later ; Future Tremont and later"
enclv is an SGX leaf.


https://reviews.llvm.org/D43817



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D43817: [x86] wbnoinvd intrinsic

2018-04-10 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 141904.
GBuella added a comment.

Rebased the patch.


https://reviews.llvm.org/D43817

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Basic/BuiltinsX86.def
  include/clang/Driver/Options.td
  lib/Basic/Targets/X86.cpp
  lib/Basic/Targets/X86.h
  lib/Headers/CMakeLists.txt
  lib/Headers/cpuid.h
  lib/Headers/wbnoinvdintrin.h
  lib/Headers/x86intrin.h
  test/CodeGen/builtin-wbnoinvd.c
  test/CodeGen/builtins-x86.c
  test/Driver/x86-target-features.c
  test/Preprocessor/predefined-arch-macros.c

Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -1100,6 +1100,7 @@
 // CHECK_ICL_M32: #define __SSSE3__ 1
 // CHECK_ICL_M32: #define __VAES__ 1
 // CHECK_ICL_M32: #define __VPCLMULQDQ__ 1
+// CHECK_ICL_M32-NOT: #define __WBNOINVD__ 1
 // CHECK_ICL_M32: #define __XSAVEC__ 1
 // CHECK_ICL_M32: #define __XSAVEOPT__ 1
 // CHECK_ICL_M32: #define __XSAVES__ 1
@@ -1156,6 +1157,7 @@
 // CHECK_ICL_M64: #define __SSSE3__ 1
 // CHECK_ICL_M64: #define __VAES__ 1
 // CHECK_ICL_M64: #define __VPCLMULQDQ__ 1
+// CHECK_ICL_M64-NOT: #define __WBNOINVD__ 1
 // CHECK_ICL_M64: #define __XSAVEC__ 1
 // CHECK_ICL_M64: #define __XSAVEOPT__ 1
 // CHECK_ICL_M64: #define __XSAVES__ 1
@@ -1213,6 +1215,7 @@
 // CHECK_ICX_M32: #define __SSSE3__ 1
 // CHECK_ICX_M32: #define __VAES__ 1
 // CHECK_ICX_M32: #define __VPCLMULQDQ__ 1
+// CHECK_ICX_M32: #define __WBNOINVD__ 1
 // CHECK_ICX_M32: #define __XSAVEC__ 1
 // CHECK_ICX_M32: #define __XSAVEOPT__ 1
 // CHECK_ICX_M32: #define __XSAVES__ 1
@@ -1269,6 +1272,7 @@
 // CHECK_ICX_M64: #define __SSSE3__ 1
 // CHECK_ICX_M64: #define __VAES__ 1
 // CHECK_ICX_M64: #define __VPCLMULQDQ__ 1
+// CHECK_ICX_M64: #define __WBNOINVD__ 1
 // CHECK_ICX_M64: #define __XSAVEC__ 1
 // CHECK_ICX_M64: #define __XSAVEOPT__ 1
 // CHECK_ICX_M64: #define __XSAVES__ 1
Index: test/Driver/x86-target-features.c
===
--- test/Driver/x86-target-features.c
+++ test/Driver/x86-target-features.c
@@ -60,6 +60,11 @@
 // CLWB: "-target-feature" "+clwb"
 // NO-CLWB: "-target-feature" "-clwb"
 
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mwbnoinvd %s -### -o %t.o 2>&1 | FileCheck -check-prefix=WBNOINVD %s
+// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-wbnoinvd %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-WBNOINVD %s
+// WBNOINVD: "-target-feature" "+wbnoinvd"
+// NO-WBNOINVD: "-target-feature" "-wbnoinvd"
+
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mmovbe %s -### -o %t.o 2>&1 | FileCheck -check-prefix=MOVBE %s
 // RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-movbe %s -### -o %t.o 2>&1 | FileCheck -check-prefix=NO-MOVBE %s
 // MOVBE: "-target-feature" "+movbe"
Index: test/CodeGen/builtins-x86.c
===
--- test/CodeGen/builtins-x86.c
+++ test/CodeGen/builtins-x86.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -emit-llvm -o %t %s
-// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -fsyntax-only -o %t %s
+// RUN: %clang_cc1 -DUSE_64 -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +clzero -target-feature +ibt -target-feature +shstk -target-feature +wbnoinvd -emit-llvm -o %t %s
+// RUN: %clang_cc1 -DUSE_ALL -triple x86_64-unknown-unknown -target-feature +fxsr -target-feature +avx -target-feature +xsaveopt -target-feature +xsaves -target-feature +xsavec -target-feature +mwaitx -target-feature +ibt -target-feature +shstk -target-feature +clzero -target-feature +wbnoinvd -fsyntax-only -o %t %s
 
 #ifdef USE_ALL
 #define USE_3DNOW
@@ -305,6 +305,7 @@
   tmp_i = __rdtsc();
   tmp_i = __builtin_ia32_rdtscp(&tmp_Ui);
   tmp_LLi = __builtin_ia32_rdpmc(tmp_i);
+  __builtin_ia32_wbnoinvd();
 #ifdef USE_64
   tmp_LLi = __builtin_ia32_cvtss2si64(tmp_V4f);
   tmp_LLi = __builtin_ia32_cvttss2si64(tmp_V4f);
Index: test/CodeGen/builtin-wbnoinvd.c
===
--- /dev/null
+++ test/CodeGen/builtin-wbnoinvd.c
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -target-feature +wbnoinvd -emit-llvm -o - -Wall -Werror | FileCheck %s
+
+#include 
+
+void test_wbnoinvd(void) {
+  //CHECK

[PATCH] D45056: [X86] Split up -march=icelake to -client & -server

2018-04-10 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL329741: [X86] Split up -march=icelake to -client & 
-server (authored by GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D45056?vs=140911&id=141888#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D45056

Files:
  cfe/trunk/include/clang/Basic/X86Target.def
  cfe/trunk/lib/Basic/Targets/X86.cpp
  cfe/trunk/test/Driver/x86-march.c
  cfe/trunk/test/Frontend/x86-target-cpu.c
  cfe/trunk/test/Misc/target-invalid-cpu-note.c
  cfe/trunk/test/Preprocessor/predefined-arch-macros.c

Index: cfe/trunk/test/Driver/x86-march.c
===
--- cfe/trunk/test/Driver/x86-march.c
+++ cfe/trunk/test/Driver/x86-march.c
@@ -60,9 +60,13 @@
 // RUN:   | FileCheck %s -check-prefix=cannonlake
 // cannonlake: "-target-cpu" "cannonlake"
 //
-// RUN: %clang -target x86_64-unknown-unknown -c -### %s -march=icelake 2>&1 \
-// RUN:   | FileCheck %s -check-prefix=icelake
-// icelake: "-target-cpu" "icelake"
+// RUN: %clang -target x86_64-unknown-unknown -c -### %s -march=icelake-client 2>&1 \
+// RUN:   | FileCheck %s -check-prefix=icelake-client
+// icelake-client: "-target-cpu" "icelake-client"
+//
+// RUN: %clang -target x86_64-unknown-unknown -c -### %s -march=icelake-server 2>&1 \
+// RUN:   | FileCheck %s -check-prefix=icelake-server
+// icelake-server: "-target-cpu" "icelake-server"
 //
 // RUN: %clang -target x86_64-unknown-unknown -c -### %s -march=lakemont 2>&1 \
 // RUN:   | FileCheck %s -check-prefix=lakemont
Index: cfe/trunk/test/Frontend/x86-target-cpu.c
===
--- cfe/trunk/test/Frontend/x86-target-cpu.c
+++ cfe/trunk/test/Frontend/x86-target-cpu.c
@@ -13,7 +13,8 @@
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu skylake-avx512 -verify %s
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu skx -verify %s
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu cannonlake -verify %s
-// RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu icelake -verify %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu icelake-client -verify %s
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu icelake-server -verify %s
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu knl -verify %s
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu knm -verify %s
 // RUN: %clang_cc1 -triple x86_64-unknown-unknown -target-cpu bonnell -verify %s
Index: cfe/trunk/test/Misc/target-invalid-cpu-note.c
===
--- cfe/trunk/test/Misc/target-invalid-cpu-note.c
+++ cfe/trunk/test/Misc/target-invalid-cpu-note.c
@@ -16,7 +16,7 @@
 // X86-SAME: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont,
 // X86-SAME: nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge,
 // X86-SAME: core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512,
-// X86-SAME: skx, cannonlake, icelake, knl, knm, lakemont, k6, k6-2, k6-3,
+// X86-SAME: skx, cannonlake, icelake-client, icelake-server, knl, knm, lakemont, k6, k6-2, k6-3,
 // X86-SAME: athlon, athlon-tbird, athlon-xp, athlon-mp, athlon-4, k8, athlon64,
 // X86-SAME: athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10,
 // X86-SAME: barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1,
@@ -28,7 +28,7 @@
 // X86_64-SAME: atom, silvermont, slm, goldmont, nehalem, corei7, westmere,
 // X86_64-SAME: sandybridge, corei7-avx, ivybridge, core-avx-i, haswell,
 // X86_64-SAME: core-avx2, broadwell, skylake, skylake-avx512, skx, cannonlake,
-// X86_64-SAME: icelake, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3,
+// X86_64-SAME: icelake-client, icelake-server, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3,
 // X86_64-SAME: athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1,
 // X86_64-SAME: btver2, bdver1, bdver2, bdver3, bdver4, znver1, x86-64
 
Index: cfe/trunk/test/Preprocessor/predefined-arch-macros.c
===
--- cfe/trunk/test/Preprocessor/predefined-arch-macros.c
+++ cfe/trunk/test/Preprocessor/predefined-arch-macros.c
@@ -1055,7 +1055,7 @@
 // CHECK_CNL_M64: #define __x86_64 1
 // CHECK_CNL_M64: #define __x86_64__ 1
 
-// RUN: %clang -march=icelake -m32 -E -dM %s -o - 2>&1 \
+// RUN: %clang -march=icelake-client -m32 -E -dM %s -o - 2>&1 \
 // RUN: -target i386-unknown-linux \
 // RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_ICL_M32
 // CHECK_ICL_M32: #define __AES__ 1
@@ -1110,8 +1110,8 @@
 // CHECK_ICL_M32: #define __i386__ 1
 // CHECK_ICL_M32: #define __tune_corei7__ 1
 // CHECK_ICL_M32: #define i386 1
-//
-// RUN: %clang -march=icelake -m64 -E -dM %s -o - 2>&1 \
+
+// RUN: %clang -march=icelake-client -m64 -E -dM %s -o - 2>&1 \
 // RUN: -target i38

[PATCH] D45488: [X86] Disable SGX for Skylake Server - CPP test

2018-04-10 Thread Gabor Buella via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rL329710: [X86] Disable SGX for Skylake Server - CPP test 
(authored by GBuella, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D45488?vs=141853&id=141855#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D45488

Files:
  cfe/trunk/test/Preprocessor/predefined-arch-macros.c


Index: cfe/trunk/test/Preprocessor/predefined-arch-macros.c
===
--- cfe/trunk/test/Preprocessor/predefined-arch-macros.c
+++ cfe/trunk/test/Preprocessor/predefined-arch-macros.c
@@ -892,7 +892,7 @@
 // CHECK_SKX_M32: #define __RDRND__ 1
 // CHECK_SKX_M32: #define __RDSEED__ 1
 // CHECK_SKX_M32: #define __RTM__ 1
-// CHECK_SKX_M32: #define __SGX__ 1
+// CHECK_SKX_M32-NOT: #define __SGX__ 1
 // CHECK_SKX_M32: #define __SSE2__ 1
 // CHECK_SKX_M32: #define __SSE3__ 1
 // CHECK_SKX_M32: #define __SSE4_1__ 1
@@ -937,7 +937,7 @@
 // CHECK_SKX_M64: #define __RDRND__ 1
 // CHECK_SKX_M64: #define __RDSEED__ 1
 // CHECK_SKX_M64: #define __RTM__ 1
-// CHECK_SKX_M64: #define __SGX__ 1
+// CHECK_SKX_M64-NOT: #define __SGX__ 1
 // CHECK_SKX_M64: #define __SSE2_MATH__ 1
 // CHECK_SKX_M64: #define __SSE2__ 1
 // CHECK_SKX_M64: #define __SSE3__ 1


Index: cfe/trunk/test/Preprocessor/predefined-arch-macros.c
===
--- cfe/trunk/test/Preprocessor/predefined-arch-macros.c
+++ cfe/trunk/test/Preprocessor/predefined-arch-macros.c
@@ -892,7 +892,7 @@
 // CHECK_SKX_M32: #define __RDRND__ 1
 // CHECK_SKX_M32: #define __RDSEED__ 1
 // CHECK_SKX_M32: #define __RTM__ 1
-// CHECK_SKX_M32: #define __SGX__ 1
+// CHECK_SKX_M32-NOT: #define __SGX__ 1
 // CHECK_SKX_M32: #define __SSE2__ 1
 // CHECK_SKX_M32: #define __SSE3__ 1
 // CHECK_SKX_M32: #define __SSE4_1__ 1
@@ -937,7 +937,7 @@
 // CHECK_SKX_M64: #define __RDRND__ 1
 // CHECK_SKX_M64: #define __RDSEED__ 1
 // CHECK_SKX_M64: #define __RTM__ 1
-// CHECK_SKX_M64: #define __SGX__ 1
+// CHECK_SKX_M64-NOT: #define __SGX__ 1
 // CHECK_SKX_M64: #define __SSE2_MATH__ 1
 // CHECK_SKX_M64: #define __SSE2__ 1
 // CHECK_SKX_M64: #define __SSE3__ 1
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45488: [X86] Disable SGX for Skylake Server - CPP test

2018-04-10 Thread Gabor Buella via Phabricator via cfe-commits

GBuella added a comment.

In https://reviews.llvm.org/D45488#1063067, @davezarzycki wrote:

> I don't think this change is correct. To the best of my knowledge, SKL does 
> support SGX but SKX does not.

llvm-lit test/Preprocessor/predefined-arch-macros.c passes now.
I didn't know tests are not run automatically precommit around here, now I know 
I must always check them manually... sorry

https://reviews.llvm.org/D45488

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D45488: [X86] Disable SGX for Skylake Server - CPP test

2018-04-10 Thread Gabor Buella via Phabricator via cfe-commits

GBuella updated this revision to Diff 141853.

https://reviews.llvm.org/D45488

Files:
  test/Preprocessor/predefined-arch-macros.c


Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -892,7 +892,7 @@
 // CHECK_SKX_M32: #define __RDRND__ 1
 // CHECK_SKX_M32: #define __RDSEED__ 1
 // CHECK_SKX_M32: #define __RTM__ 1
-// CHECK_SKX_M32: #define __SGX__ 1
+// CHECK_SKX_M32-NOT: #define __SGX__ 1
 // CHECK_SKX_M32: #define __SSE2__ 1
 // CHECK_SKX_M32: #define __SSE3__ 1
 // CHECK_SKX_M32: #define __SSE4_1__ 1
@@ -937,7 +937,7 @@
 // CHECK_SKX_M64: #define __RDRND__ 1
 // CHECK_SKX_M64: #define __RDSEED__ 1
 // CHECK_SKX_M64: #define __RTM__ 1
-// CHECK_SKX_M64: #define __SGX__ 1
+// CHECK_SKX_M64-NOT: #define __SGX__ 1
 // CHECK_SKX_M64: #define __SSE2_MATH__ 1
 // CHECK_SKX_M64: #define __SSE2__ 1
 // CHECK_SKX_M64: #define __SSE3__ 1


Index: test/Preprocessor/predefined-arch-macros.c
===
--- test/Preprocessor/predefined-arch-macros.c
+++ test/Preprocessor/predefined-arch-macros.c
@@ -892,7 +892,7 @@
 // CHECK_SKX_M32: #define __RDRND__ 1
 // CHECK_SKX_M32: #define __RDSEED__ 1
 // CHECK_SKX_M32: #define __RTM__ 1
-// CHECK_SKX_M32: #define __SGX__ 1
+// CHECK_SKX_M32-NOT: #define __SGX__ 1
 // CHECK_SKX_M32: #define __SSE2__ 1
 // CHECK_SKX_M32: #define __SSE3__ 1
 // CHECK_SKX_M32: #define __SSE4_1__ 1
@@ -937,7 +937,7 @@
 // CHECK_SKX_M64: #define __RDRND__ 1
 // CHECK_SKX_M64: #define __RDSEED__ 1
 // CHECK_SKX_M64: #define __RTM__ 1
-// CHECK_SKX_M64: #define __SGX__ 1
+// CHECK_SKX_M64-NOT: #define __SGX__ 1
 // CHECK_SKX_M64: #define __SSE2_MATH__ 1
 // CHECK_SKX_M64: #define __SSE2__ 1
 // CHECK_SKX_M64: #define __SSE3__ 1
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

1 2 >

1 - 100 of 124 matches

Mail list logo