[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-14 Thread Simon Pilgrim via Phabricator via cfe-commits
RKSimon added inline comments.



Comment at: clang/test/CodeGen/X86/avxneconvert-builtins.c:2
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown 
-target-feature +avx2 -target-feature +avxneconvert \
+// RUN: -target-feature +avx512fp16 -emit-llvm -o - -Wall -Werror -pedantic 
-Wno-gnu-statement-expression | FileCheck %s
+

32-bit test coverage?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-17 Thread Freddy, Ye via Phabricator via cfe-commits
FreddyYe marked 5 inline comments as done.
FreddyYe added a comment.

THX for reviews!




Comment at: clang/lib/Headers/immintrin.h:257
 
+/* FIXME: Change these When _Float16 type is supported */
+#if defined(__AVXNECONVERT__) && defined(__AVX512FP16__)

pengfei wrote:
> craig.topper wrote:
> > Is this FIXME still relevant? Don't we support _Float16 with SSE2 now?
> _Float16 is supported with SSE2, but maybe we need to move `__m128h`, 
> `__m256h` out of avx512fp16intrin.h
Yes. This is a redundant FIXME.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-17 Thread Simon Pilgrim via Phabricator via cfe-commits
RKSimon added inline comments.



Comment at: llvm/test/MC/X86/avx-ne-convert-att.s:1
+// RUN: llvm-mc -triple i686-unknown-unknown --show-encoding %s | FileCheck %s
+

merge the att + intel test files and use --check-prefixes to test both 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-18 Thread Simon Pilgrim via Phabricator via cfe-commits
RKSimon added a comment.

merge att/intel test coverage files and rename the 32/64 bit files so that they 
are close together in the file lists


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-19 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added inline comments.



Comment at: clang/lib/Headers/immintrin.h:257
 
+/* FIXME: Change these When _Float16 type is supported */
+#if defined(__AVXNECONVERT__) && defined(__AVX512FP16__)

FreddyYe wrote:
> pengfei wrote:
> > craig.topper wrote:
> > > Is this FIXME still relevant? Don't we support _Float16 with SSE2 now?
> > _Float16 is supported with SSE2, but maybe we need to move `__m128h`, 
> > `__m256h` out of avx512fp16intrin.h
> Yes. This is a redundant FIXME.
I have moved FP16/BF16 vector types out of original header files. rGe0fb01e9
There should be no dependency to FP16 and BF16 feature now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-19 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added inline comments.



Comment at: clang/test/CodeGen/X86/avxneconvert-builtins.c:2
+// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown 
-target-feature +avx2 -target-feature +avxneconvert \
+// RUN: -target-feature +avx512fp16 -emit-llvm -o - -Wall -Werror -pedantic 
-Wno-gnu-statement-expression | FileCheck %s
+// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown  
-target-feature +avx2 -target-feature +avxneconvert \

This should be removed now.



Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+avx512fp16,+avxneconvert | FileCheck %s 
--check-prefixes=X64
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+avx512fp16,+avxneconvert | FileCheck %s 
--check-prefixes=X86

Do we have real dependency to FP16?



Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll:3
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+avx512fp16,+avxneconvert | FileCheck %s 
--check-prefixes=X64
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+avx512fp16,+avxneconvert | FileCheck %s 
--check-prefixes=X86
+

ditto.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-25 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added inline comments.



Comment at: clang/include/clang/Basic/BuiltinsX86.def:2106-2107
+TARGET_BUILTIN(__builtin_ia32_vcvtneoph2ps256, "V8fV16xC*", "nV:256:", 
"avxneconvert")
+TARGET_BUILTIN(__builtin_ia32_vcvtneps2bf16128, "V8sV4f", "nV:128:", 
"avxneconvert")
+TARGET_BUILTIN(__builtin_ia32_vcvtneps2bf16256, "V8sV8f", "nV:256:", 
"avxneconvert")
 TARGET_HEADER_BUILTIN(_InterlockedAnd64, "WiWiD*Wi", "nh", "intrin.h", 
ALL_MS_LANGUAGES, "")

These should be shared with AVX512-BF16.



Comment at: clang/lib/Headers/avxneconvertintrin.h:86-94
+static __inline__ __m128bh __DEFAULT_FN_ATTRS128
+_mm_cvtneps_avx_pbh(__m128 __A) {
+  return (__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)__A);
+}
+
+static __inline__ __m128bh __DEFAULT_FN_ATTRS256
+_mm256_cvtneps_avx_pbh(__m256 __A) {

Add unified intrinsics like AVXVNNI.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-26 Thread Simon Pilgrim via Phabricator via cfe-commits
RKSimon added a comment.

Possibly rename the x86-64-* test files to *-64 (and *-32 equivalent) so that 
the 32/64 bit files are closer together for tracking (and to help avoid bitrot).




Comment at: clang/lib/Headers/immintrin.h:257
 
+/* FIXME: Change these When _Float16 type is supported */
+#if defined(__AVXNECONVERT__) && defined(__AVX512FP16__)

pengfei wrote:
> FreddyYe wrote:
> > pengfei wrote:
> > > craig.topper wrote:
> > > > Is this FIXME still relevant? Don't we support _Float16 with SSE2 now?
> > > _Float16 is supported with SSE2, but maybe we need to move `__m128h`, 
> > > `__m256h` out of avx512fp16intrin.h
> > Yes. This is a redundant FIXME.
> I have moved FP16/BF16 vector types out of original header files. rGe0fb01e9
> There should be no dependency to FP16 and BF16 feature now.
Update to this?
```
#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
(defined(__AVXNECONVERT__) && defined(__AVX512FP16__))
```



Comment at: llvm/test/MC/X86/x86-64-avx-ne-convert-att.s:1
+// RUN: llvm-mc -triple x86_64-unknown-unknown --show-encoding %s | FileCheck 
%s
+

x86-64-avx-ne-convert-intel.s ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-28 Thread Freddy, Ye via Phabricator via cfe-commits
FreddyYe added inline comments.



Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll:4
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=X86
+
+define <4 x float> @test_int_x86_vbcstnebf162ps128(i8* %A) {

Need to add `+avx512bf16,+avx512vl` tests for shared builtin intrinsic. I just 
found it crashed for lacking new patterns for avx512bf16. I'll update ASAP.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-28 Thread Simon Pilgrim via Phabricator via cfe-commits
RKSimon added inline comments.



Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164
+#define _mm_cvtneps_pbh(A) \
+  ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
 

Is there no way for __attribute__ to allow different attribute permutations?

Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-28 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added inline comments.



Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164
+#define _mm_cvtneps_pbh(A) \
+  ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
 

RKSimon wrote:
> Is there no way for __attribute__ to allow different attribute permutations?
> 
> Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention?
> Is there no way for attribute to allow different attribute permutations?

We have discussed this problem with GCC folks. There are two problems here:
1. Unlike builtins, function attributes are more generic. It may introduce a 
lot of checks between callers and callees. I had a research to limit it to 
`__always_inline__` functions only. However, Clang handles inlining in 
middle-end, we don't have such information in the front-end. Besides, we don't 
know how to merge different permutations if they are inlining to the same 
function.
2. We don't know how to put the permutations into IR's function attributes. We 
need to preserve all permutations for inlining reference, but the backend needs 
a determine feature list rather than selective.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-29 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added inline comments.



Comment at: clang/include/clang/Driver/Options.td:4595-4596
 def mno_avxvnniint8 : Flag<["-"], "mno-avxvnniint8">, 
Group;
+def mavxneconvert : Flag<["-"], "mavxneconvert">, Group;
+def mno_avxneconvert : Flag<["-"], "mno-avxneconvert">, 
Group;
 def mavxvnni : Flag<["-"], "mavxvnni">, Group;

Need to move it before `mavxvnniint8 `.



Comment at: clang/lib/Basic/Targets/X86.cpp:1033
   .Case("avxvnniint8", HasAVXVNNIINT8)
+  .Case("avxneconvert", HasAVXNECONVERT)
   .Case("bmi", HasBMI)

Move it ahead.



Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164
+#define _mm_cvtneps_pbh(A) \
+  ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
 

pengfei wrote:
> RKSimon wrote:
> > Is there no way for __attribute__ to allow different attribute permutations?
> > 
> > Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention?
> > Is there no way for attribute to allow different attribute permutations?
> 
> We have discussed this problem with GCC folks. There are two problems here:
> 1. Unlike builtins, function attributes are more generic. It may introduce a 
> lot of checks between callers and callees. I had a research to limit it to 
> `__always_inline__` functions only. However, Clang handles inlining in 
> middle-end, we don't have such information in the front-end. Besides, we 
> don't know how to merge different permutations if they are inlining to the 
> same function.
> 2. We don't know how to put the permutations into IR's function attributes. 
> We need to preserve all permutations for inlining reference, but the backend 
> needs a determine feature list rather than selective.
It's better to use `__builtin_ia32_cvtneps2bf16_128`.



Comment at: clang/lib/Headers/avxneconvertintrin.h:106
+///
+/// This intrinsic corresponds to the \c VBCSTNEBF162PS instruction.
+///

VBCSTNESH2PS



Comment at: clang/lib/Headers/avxneconvertintrin.h:139
+///
+/// This intrinsic corresponds to the \c VBCSTNEBF162PS instruction.
+///

VBCSTNESH2PS



Comment at: clang/lib/Headers/avxneconvertintrin.h:207
+/// \param __A
+///A pointer to a 256-bit memory location containing 8 consecutive
+///BF16 (16-bit) floating-point values.

16



Comment at: clang/lib/Headers/avxneconvertintrin.h:273
+/// \param __A
+///A pointer to a 256-bit memory location containing 8 consecutive
+///half-precision (16-bit) floating-point values.

16



Comment at: clang/lib/Headers/avxneconvertintrin.h:339
+/// \param __A
+///A pointer to a 256-bit memory location containing 8 consecutive
+///BF16 (16-bit) floating-point values.

16



Comment at: clang/lib/Headers/avxneconvertintrin.h:405
+/// \param __A
+///A pointer to a 256-bit memory location containing 8 consecutive
+///half-precision (16-bit) floating-point values.

16



Comment at: clang/test/Preprocessor/x86_target_features.c:593-599
+// RUN: %clang -target i686-unknown-linux-gnu -march=atom -mavxneconvert -x c 
-E -dM -o - %s | FileCheck  -check-prefix=AVXNECONVERT %s
+// AVXNECONVERT: #define __AVXNECONVERT__ 1
+
+// RUN: %clang -target i686-unknown-linux-gnu -march=atom -mno-avxneconvert -x 
c -E -dM -o - %s | FileCheck  -check-prefix=NO-AVXNECONVERT %s
+// RUN: %clang -target i686-unknown-linux-gnu -march=atom -mavxneconvert 
-mno-avx2 -x c -E -dM -o - %s | FileCheck  -check-prefix=NO-AVXNECONVERT %s
+
+// NO-AVXNECONVERT-NOT: #define __AVXNECONVERT__ 1

Should we check `__AVX2__` like we did for AVXVNNI?



Comment at: llvm/lib/Support/Host.cpp:1818
 
+  Features["avxneconvert"] = HasLeaf7Subleaf1 && ((EDX >> 5) & 1) && 
HasAVXSave;
+

Move it ahead and remove the blank line.



Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:2186-2203
   if (!Subtarget.useSoftFloat() && Subtarget.hasBF16()) {
 addRegisterClass(MVT::v8bf16, &X86::VR128XRegClass);
 addRegisterClass(MVT::v16bf16, &X86::VR256XRegClass);
 addRegisterClass(MVT::v32bf16, &X86::VR512RegClass);
 // We set the type action of bf16 to TypeSoftPromoteHalf, but we don't
 // provide the method to promote BUILD_VECTOR. Set the operation action
 // Custom to do the customization later.

How about merge it here?



Comment at: llvm/lib/Target/X86/X86InstrSSE.td:8260-8261
+let Predicates = [HasAVXNECONVERT] in {
+  defm VBCSTNEBF162PS : AVX_NE_CONVERT_BASE<0xb1, "vbcstnebf162ps", i16mem,
+   i16mem>, T8XS;
+  defm VBCSTNESH2PS : AVX_NE_CONVERT_BASE<0xb1, "vbcstnesh2ps", f16mem, 
f16mem>,

This can be f16 mem now.



Comment at: llvm/lib/Target/X86/X86InstrSSE.td:8264-8265
+  

[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-31 Thread Freddy, Ye via Phabricator via cfe-commits
FreddyYe marked an inline comment as done.
FreddyYe added inline comments.



Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164
+#define _mm_cvtneps_pbh(A) \
+  ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
 

pengfei wrote:
> pengfei wrote:
> > RKSimon wrote:
> > > Is there no way for __attribute__ to allow different attribute 
> > > permutations?
> > > 
> > > Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention?
> > > Is there no way for attribute to allow different attribute permutations?
> > 
> > We have discussed this problem with GCC folks. There are two problems here:
> > 1. Unlike builtins, function attributes are more generic. It may introduce 
> > a lot of checks between callers and callees. I had a research to limit it 
> > to `__always_inline__` functions only. However, Clang handles inlining in 
> > middle-end, we don't have such information in the front-end. Besides, we 
> > don't know how to merge different permutations if they are inlining to the 
> > same function.
> > 2. We don't know how to put the permutations into IR's function attributes. 
> > We need to preserve all permutations for inlining reference, but the 
> > backend needs a determine feature list rather than selective.
> It's better to use `__builtin_ia32_cvtneps2bf16_128`.
I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name.

See __builtin_ia32_vfmaddsubph256, __builtin_ia32_minph256...

And I admit naming conventions of clang builtins as well as LLVM IR builtins 
are confusing right now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-31 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added inline comments.



Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164
+#define _mm_cvtneps_pbh(A) \
+  ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
 

FreddyYe wrote:
> pengfei wrote:
> > pengfei wrote:
> > > RKSimon wrote:
> > > > Is there no way for __attribute__ to allow different attribute 
> > > > permutations?
> > > > 
> > > > Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention?
> > > > Is there no way for attribute to allow different attribute permutations?
> > > 
> > > We have discussed this problem with GCC folks. There are two problems 
> > > here:
> > > 1. Unlike builtins, function attributes are more generic. It may 
> > > introduce a lot of checks between callers and callees. I had a research 
> > > to limit it to `__always_inline__` functions only. However, Clang handles 
> > > inlining in middle-end, we don't have such information in the front-end. 
> > > Besides, we don't know how to merge different permutations if they are 
> > > inlining to the same function.
> > > 2. We don't know how to put the permutations into IR's function 
> > > attributes. We need to preserve all permutations for inlining reference, 
> > > but the backend needs a determine feature list rather than selective.
> > It's better to use `__builtin_ia32_cvtneps2bf16_128`.
> I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name.
> 
> See __builtin_ia32_vfmaddsubph256, __builtin_ia32_minph256...
> 
> And I admit naming conventions of clang builtins as well as LLVM IR builtins 
> are confusing right now.
The problem here is `16128` is a bit confusing, a `_` breaks it into 2 number.
But I'm not insist on it :)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-31 Thread Freddy, Ye via Phabricator via cfe-commits
FreddyYe marked an inline comment as done.
FreddyYe added inline comments.



Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164
+#define _mm_cvtneps_pbh(A) \
+  ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
 

pengfei wrote:
> FreddyYe wrote:
> > pengfei wrote:
> > > pengfei wrote:
> > > > RKSimon wrote:
> > > > > Is there no way for __attribute__ to allow different attribute 
> > > > > permutations?
> > > > > 
> > > > > Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming 
> > > > > convention?
> > > > > Is there no way for attribute to allow different attribute 
> > > > > permutations?
> > > > 
> > > > We have discussed this problem with GCC folks. There are two problems 
> > > > here:
> > > > 1. Unlike builtins, function attributes are more generic. It may 
> > > > introduce a lot of checks between callers and callees. I had a research 
> > > > to limit it to `__always_inline__` functions only. However, Clang 
> > > > handles inlining in middle-end, we don't have such information in the 
> > > > front-end. Besides, we don't know how to merge different permutations 
> > > > if they are inlining to the same function.
> > > > 2. We don't know how to put the permutations into IR's function 
> > > > attributes. We need to preserve all permutations for inlining 
> > > > reference, but the backend needs a determine feature list rather than 
> > > > selective.
> > > It's better to use `__builtin_ia32_cvtneps2bf16_128`.
> > I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name.
> > 
> > See __builtin_ia32_vfmaddsubph256, __builtin_ia32_minph256...
> > 
> > And I admit naming conventions of clang builtins as well as LLVM IR 
> > builtins are confusing right now.
> The problem here is `16128` is a bit confusing, a `_` breaks it into 2 number.
> But I'm not insist on it :)
I did a try but found __builtin_ia32_cvtneps2bf16_256 existed for avx512bf16, 
and it's used for mask intrinsic lowering currently. What about not change this 
time? We can do a refine patch later for avx512bf16 builtins since they also 
have some redundant FE/codegen logics for 256/512 mask intrinsics.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-31 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added a comment.

LGTM.




Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164
+#define _mm_cvtneps_pbh(A) \
+  ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
 

FreddyYe wrote:
> pengfei wrote:
> > FreddyYe wrote:
> > > pengfei wrote:
> > > > pengfei wrote:
> > > > > RKSimon wrote:
> > > > > > Is there no way for __attribute__ to allow different attribute 
> > > > > > permutations?
> > > > > > 
> > > > > > Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming 
> > > > > > convention?
> > > > > > Is there no way for attribute to allow different attribute 
> > > > > > permutations?
> > > > > 
> > > > > We have discussed this problem with GCC folks. There are two problems 
> > > > > here:
> > > > > 1. Unlike builtins, function attributes are more generic. It may 
> > > > > introduce a lot of checks between callers and callees. I had a 
> > > > > research to limit it to `__always_inline__` functions only. However, 
> > > > > Clang handles inlining in middle-end, we don't have such information 
> > > > > in the front-end. Besides, we don't know how to merge different 
> > > > > permutations if they are inlining to the same function.
> > > > > 2. We don't know how to put the permutations into IR's function 
> > > > > attributes. We need to preserve all permutations for inlining 
> > > > > reference, but the backend needs a determine feature list rather than 
> > > > > selective.
> > > > It's better to use `__builtin_ia32_cvtneps2bf16_128`.
> > > I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name.
> > > 
> > > See __builtin_ia32_vfmaddsubph256, __builtin_ia32_minph256...
> > > 
> > > And I admit naming conventions of clang builtins as well as LLVM IR 
> > > builtins are confusing right now.
> > The problem here is `16128` is a bit confusing, a `_` breaks it into 2 
> > number.
> > But I'm not insist on it :)
> I did a try but found __builtin_ia32_cvtneps2bf16_256 existed for avx512bf16, 
> and it's used for mask intrinsic lowering currently. What about not change 
> this time? We can do a refine patch later for avx512bf16 builtins since they 
> also have some redundant FE/codegen logics for 256/512 mask intrinsics.
No problem.



Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics-shared.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+avxneconvert,+avx512bf16,+avx512vl | FileCheck %s 
--check-prefix=AVX512BF16-COMMON
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+avxneconvert,+avx512bf16,+avx512vl | FileCheck %s 
--check-prefix=AVX512BF16-COMMON

Remove `-O0`



Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown 
--show-mc-encoding -mattr=+avxneconvert | FileCheck %s 
--check-prefixes=CHECK,X64
+; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown 
--show-mc-encoding -mattr=+avxneconvert | FileCheck %s 
--check-prefixes=CHECK,X86

ditto.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-13 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments.



Comment at: clang/lib/Basic/Targets/X86.cpp:781
+Builder.defineMacro("__AVXNECONVERT__");
+  Builder.defineMacro("__AVXNECONVERT_SUPPORTED__");
   if (HasAVXVNNI)

Do we need it here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-13 Thread Craig Topper via Phabricator via cfe-commits
craig.topper added inline comments.



Comment at: clang/lib/Headers/immintrin.h:257
 
+/* FIXME: Change these When _Float16 type is supported */
+#if defined(__AVXNECONVERT__) && defined(__AVX512FP16__)

Is this FIXME still relevant? Don't we support _Float16 with SSE2 now?



Comment at: llvm/include/llvm/Support/X86TargetParser.def:204
 X86_FEATURE   (AVX512FP16,  "avx512fp16")
+X86_FEATURE   (AVXNECONVERT,  "avxneconvert")
 X86_FEATURE   (AVXVNNI, "avxvnni")

Extra space before "avxneconvert"


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-13 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added inline comments.



Comment at: clang/lib/Basic/Targets/X86.cpp:781
+Builder.defineMacro("__AVXNECONVERT__");
+  Builder.defineMacro("__AVXNECONVERT_SUPPORTED__");
   if (HasAVXVNNI)

LuoYuanke wrote:
> Do we need it here?
We don't need it.



Comment at: clang/lib/Headers/immintrin.h:257
 
+/* FIXME: Change these When _Float16 type is supported */
+#if defined(__AVXNECONVERT__) && defined(__AVX512FP16__)

craig.topper wrote:
> Is this FIXME still relevant? Don't we support _Float16 with SSE2 now?
_Float16 is supported with SSE2, but maybe we need to move `__m128h`, `__m256h` 
out of avx512fp16intrin.h


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-13 Thread Phoebe Wang via Phabricator via cfe-commits
pengfei added inline comments.



Comment at: clang/lib/Headers/avxneconvertintrin.h:47
+static __inline__ __m128 __DEFAULT_FN_ATTRS128
+_mm_cvtneebf16_ps(const __m128bh *__A) {
+  return (__m128)__builtin_ia32_vcvtneebf162ps128((const __v8hi *)__A);

I think the bf16 vector type may have the same problem with FP16. When need to 
move them out of avx512vlbf16intrin.h
Another issue is we want to switch them to `__bf16` vector. Hope D132329 can be 
landed first.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits