Re: [PATCH] D20468: [X86][AVX] Ensure zero-extension of _mm256_extract_epi8 and _mm256_extract_epi16

2016-05-21 Thread Simon Pilgrim via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL270330: [X86][AVX] Ensure zero-extension of 
_mm256_extract_epi8 and _mm256_extract_epi16 (authored by RKSimon).

Changed prior to commit:
  http://reviews.llvm.org/D20468?vs=57927=58045#toc

Repository:
  rL LLVM

http://reviews.llvm.org/D20468

Files:
  cfe/trunk/lib/Headers/avxintrin.h
  cfe/trunk/test/CodeGen/avx-builtins.c

Index: cfe/trunk/test/CodeGen/avx-builtins.c
===
--- cfe/trunk/test/CodeGen/avx-builtins.c
+++ cfe/trunk/test/CodeGen/avx-builtins.c
@@ -314,21 +314,19 @@
   return _mm256_dp_ps(A, B, 7);
 }
 
-// FIXME: ZEXT instead of SEXT
 int test_mm256_extract_epi8(__m256i A) {
   // CHECK-LABEL: test_mm256_extract_epi8
   // CHECK: and i32 %{{.*}}, 31
   // CHECK: extractelement <32 x i8> %{{.*}}, i32 %{{.*}}
-  // CHECK: ext i8 %{{.*}} to i32
+  // CHECK: zext i8 %{{.*}} to i32
   return _mm256_extract_epi8(A, 32);
 }
 
-// FIXME: ZEXT instead of SEXT
 int test_mm256_extract_epi16(__m256i A) {
   // CHECK-LABEL: test_mm256_extract_epi16
   // CHECK: and i32 %{{.*}}, 15
   // CHECK: extractelement <16 x i16> %{{.*}}, i32 %{{.*}}
-  // CHECK: ext i16 %{{.*}} to i32
+  // CHECK: zext i16 %{{.*}} to i32
   return _mm256_extract_epi16(A, 16);
 }
 
Index: cfe/trunk/lib/Headers/avxintrin.h
===
--- cfe/trunk/lib/Headers/avxintrin.h
+++ cfe/trunk/lib/Headers/avxintrin.h
@@ -1875,13 +1875,13 @@
 /// \param __imm
 ///An immediate integer operand with bits [3:0] determining which vector
 ///element is extracted and returned.
-/// \returns A 32-bit integer containing the extracted 16 bits of extended
+/// \returns A 32-bit integer containing the extracted 16 bits of zero extended
 ///packed data.
 static __inline int __DEFAULT_FN_ATTRS
 _mm256_extract_epi16(__m256i __a, const int __imm)
 {
   __v16hi __b = (__v16hi)__a;
-  return __b[__imm & 15];
+  return (unsigned short)__b[__imm & 15];
 }
 
 /// \brief Takes a [32 x i8] vector and returns the vector element value
@@ -1897,13 +1897,13 @@
 /// \param __imm
 ///An immediate integer operand with bits [4:0] determining which vector
 ///element is extracted and returned.
-/// \returns A 32-bit integer containing the extracted 8 bits of extended 
packed
-///data.
+/// \returns A 32-bit integer containing the extracted 8 bits of zero extended
+///packed data.
 static __inline int __DEFAULT_FN_ATTRS
 _mm256_extract_epi8(__m256i __a, const int __imm)
 {
   __v32qi __b = (__v32qi)__a;
-  return __b[__imm & 31];
+  return (unsigned char)__b[__imm & 31];
 }
 
 #ifdef __x86_64__


Index: cfe/trunk/test/CodeGen/avx-builtins.c
===
--- cfe/trunk/test/CodeGen/avx-builtins.c
+++ cfe/trunk/test/CodeGen/avx-builtins.c
@@ -314,21 +314,19 @@
   return _mm256_dp_ps(A, B, 7);
 }
 
-// FIXME: ZEXT instead of SEXT
 int test_mm256_extract_epi8(__m256i A) {
   // CHECK-LABEL: test_mm256_extract_epi8
   // CHECK: and i32 %{{.*}}, 31
   // CHECK: extractelement <32 x i8> %{{.*}}, i32 %{{.*}}
-  // CHECK: ext i8 %{{.*}} to i32
+  // CHECK: zext i8 %{{.*}} to i32
   return _mm256_extract_epi8(A, 32);
 }
 
-// FIXME: ZEXT instead of SEXT
 int test_mm256_extract_epi16(__m256i A) {
   // CHECK-LABEL: test_mm256_extract_epi16
   // CHECK: and i32 %{{.*}}, 15
   // CHECK: extractelement <16 x i16> %{{.*}}, i32 %{{.*}}
-  // CHECK: ext i16 %{{.*}} to i32
+  // CHECK: zext i16 %{{.*}} to i32
   return _mm256_extract_epi16(A, 16);
 }
 
Index: cfe/trunk/lib/Headers/avxintrin.h
===
--- cfe/trunk/lib/Headers/avxintrin.h
+++ cfe/trunk/lib/Headers/avxintrin.h
@@ -1875,13 +1875,13 @@
 /// \param __imm
 ///An immediate integer operand with bits [3:0] determining which vector
 ///element is extracted and returned.
-/// \returns A 32-bit integer containing the extracted 16 bits of extended
+/// \returns A 32-bit integer containing the extracted 16 bits of zero extended
 ///packed data.
 static __inline int __DEFAULT_FN_ATTRS
 _mm256_extract_epi16(__m256i __a, const int __imm)
 {
   __v16hi __b = (__v16hi)__a;
-  return __b[__imm & 15];
+  return (unsigned short)__b[__imm & 15];
 }
 
 /// \brief Takes a [32 x i8] vector and returns the vector element value
@@ -1897,13 +1897,13 @@
 /// \param __imm
 ///An immediate integer operand with bits [4:0] determining which vector
 ///element is extracted and returned.
-/// \returns A 32-bit integer containing the extracted 8 bits of extended packed
-///data.
+/// \returns A 32-bit integer containing the extracted 8 bits of zero extended
+///packed data.
 static __inline int __DEFAULT_FN_ATTRS
 _mm256_extract_epi8(__m256i __a, const int __imm)
 {
   __v32qi __b = (__v32qi)__a;
-  return __b[__imm & 31];
+  return (unsigned char)__b[__imm & 31];
 }
 
 #ifdef 

Re: [PATCH] D20468: [X86][AVX] Ensure zero-extension of _mm256_extract_epi8 and _mm256_extract_epi16

2016-05-20 Thread Michael Kuperstein via cfe-commits
mkuper accepted this revision.
mkuper added a comment.
This revision is now accepted and ready to land.

Thanks, Dave!

In that case, LGTM.


Repository:
  rL LLVM

http://reviews.llvm.org/D20468



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D20468: [X86][AVX] Ensure zero-extension of _mm256_extract_epi8 and _mm256_extract_epi16

2016-05-20 Thread David Kreitzer via cfe-commits
DavidKreitzer added a subscriber: DavidKreitzer.
DavidKreitzer added a comment.

Hi Michael,

I think the Intel Intrinsics reference and the Intel Compiler are in error and 
that this is the right fix for the LLVM headers. (I'll follow up to get the 
Intel Intrinsics reference & Intel Compiler fixed.)

The _mm256_extract_epiN "convenience" intrinsics were first introduced by gcc 
and used an "int" return type. They were added to the Intel Compiler about 2 
years ago, but for some reason were defined to use the smaller signed types. 
I'm double checking with the developer that added them, but I think it was just 
a mistake.

-Dave


Repository:
  rL LLVM

http://reviews.llvm.org/D20468



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D20468: [X86][AVX] Ensure zero-extension of _mm256_extract_epi8 and _mm256_extract_epi16

2016-05-20 Thread Michael Kuperstein via cfe-commits
mkuper added a comment.

You're right, the underlying instructions zext, and it seems like it's the 
right thing to do. I'm just wondering if this is something user code is 
supposed to rely on, given the way the intrinsics guide documents them right 
now.
H.J, could you please take a look?


Repository:
  rL LLVM

http://reviews.llvm.org/D20468



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D20468: [X86][AVX] Ensure zero-extension of _mm256_extract_epi8 and _mm256_extract_epi16

2016-05-20 Thread Simon Pilgrim via cfe-commits
RKSimon added a comment.

In http://reviews.llvm.org/D20468#435522, @mkuper wrote:

> Could you point me to where in the documentation it says they must be 
> zero-extended?
>  The Intel intrinsics guide actually has them with shorter return types:
>
>   __int8 _mm256_extract_epi8 (__m256i a, const int index)
>   __int16 _mm256_extract_epi16 (__m256i a, const int index)


And the gcc version has them wrapped to the _mm_extract_epi* intrinsics which 
map to the real 128-bit instructions which do zero-extend.

I'm open to changing the return types in the headers instead, but really I'd 
expect the mm256 versions to zero extend like the older mm versions.


Repository:
  rL LLVM

http://reviews.llvm.org/D20468



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D20468: [X86][AVX] Ensure zero-extension of _mm256_extract_epi8 and _mm256_extract_epi16

2016-05-20 Thread Michael Kuperstein via cfe-commits
mkuper added a comment.

Could you point me to where in the documentation it says they must be 
zero-extended?
The Intel intrinsics guide actually has them with shorter return types:

  __int8 _mm256_extract_epi8 (__m256i a, const int index)
  __int16 _mm256_extract_epi16 (__m256i a, const int index)


Repository:
  rL LLVM

http://reviews.llvm.org/D20468



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits