[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2021-06-08 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune created this revision.
aqjune added reviewers: nikic, efriedma, spatel, fhahn, lebedev.ri, RKSimon.
Herald added a reviewer: deadalnix.
Herald added subscribers: dexonsmith, kerbowa, pengfei, dmgreen, zzheng, 
kbarton, hiraditya, nhaehnle, jvesely, nemanjai.
aqjune requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

This is a patch that renames shufflevector's undef mask to poison.

By D93818 , shufflevector's undef mask isn't 
undef anymore; it returns poison instead.

  %v = shufflevector <2 x i8> %x, <2 x i8> %y, <2 x i8> 
  ; %v[0] = %x[0]
  ; %v[1] = poison

Since poison is more undefined than undef, this validates many existing 
transformations that we wanted to support.
Also, this allows more aggressive optimizations because poison is more 
propagative (e.g. poison & 0 = poison whereas undef & 0 != undef).

This patch updates shufflevector mask's printed string to be `poison` to match 
its new semantics.

This has changes in clang tests as well.
They are mainly about vector intrinsics being lowered into `shufflevector`.
The unused elements were filled with `undef` previously, but with this patch 
they are filled with `poison`.
Since they are unused elements anyway, I believe this isn't a functional change 
in fact.
But, I'm happy with this being double-checked by someone who works on these 
intrinsics as well.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D103874

Files:
  clang/test/CodeGen/X86/avx-builtins.c
  clang/test/CodeGen/X86/avx512f-builtins.c
  clang/test/CodeGen/aarch64-bf16-lane-intrinsics.c
  clang/test/CodeGen/builtins-ppc-p10vector.c
  clang/test/CodeGen/builtins-ppc-p9vector.c
  clang/test/CodeGen/builtinshufflevector2.c
  clang/test/CodeGen/ext-vector.c
  clang/test/CodeGenOpenCL/as_type.cl
  clang/test/CodeGenOpenCL/partial_initializer.cl
  clang/test/CodeGenOpenCL/preserve_vec3.cl
  clang/test/CodeGenOpenCL/vector_literals.cl
  clang/test/CodeGenOpenCL/vector_shufflevector.cl
  llvm/include/llvm-c/Core.h
  llvm/include/llvm/IR/Instructions.h
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/lib/Analysis/ValueTracking.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ConstantFold.cpp
  llvm/lib/IR/Core.cpp
  llvm/lib/IR/Instructions.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
  llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
  llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
  llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/lib/Transforms/Vectorize/VectorCombine.cpp
  llvm/test/Analysis/CostModel/AMDGPU/shufflevector.ll
  llvm/test/Analysis/CostModel/X86/reduction.ll
  llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector.ll
  llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector.ll
  llvm/test/Analysis/CostModel/X86/shuffle-single-src.ll
  llvm/test/CodeGen/AMDGPU/rewrite-out-arguments.ll
  llvm/test/CodeGen/Generic/expand-experimental-reductions.ll
  llvm/test/CodeGen/PowerPC/arg_promotion.ll
  llvm/test/Transforms/CodeGenPrepare/X86/x86-shuffle-sink-inseltpoison.ll
  llvm/test/Transforms/CodeGenPrepare/X86/x86-shuffle-sink.ll
  llvm/test/Transforms/DeadStoreElimination/masked-dead-store-inseltpoison.ll
  llvm/test/Transforms/DeadStoreElimination/masked-dead-store.ll
  
llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts-inseltpoison.ll
  llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll
  llvm/test/Transforms/InstCombine/X86/x86-addsub-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-addsub.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx2-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx2.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx512-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx512.ll
  llvm/test/Transforms/InstCombine/X86/x86-muldq-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-muldq.ll
  llvm/test/Transforms/InstCombine/X86/x86-pack-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-pack.ll
  llvm/test/Transforms/InstCombine/X86/x86-pshufb-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-pshufb.ll
  llvm/test/Transforms/InstCombine/X86/x86-sse4a-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-sse4a.ll
  llvm/test/Transforms/InstCombine/X86/x86-vpermil-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-vpermil.ll
  llvm/test/Transforms/InstCombine/broadcast-inseltpoison.ll
  llvm/test/Transforms/InstCombine/broadcast.ll
  llvm/test/Transforms/InstCombine/bswap-inseltpoison.ll
  llvm/test/Transforms/InstCombine/bswap.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll
  llvm/test/Transforms/InstCombine/extractelement-inseltpoison.ll
  llvm/test/Transforms/InstCombine/extractelem

[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2021-06-08 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

I noted the cases where it looks like the undef->poison change might actually 
impact code using compiler intrinsic functions that have external 
specifications.  The relevant specifications say the elements in question are 
"undefined", without really specifying what that means.

Currently, for the Intel intrinsics, we treat "undefined" as something more 
conservative than LLVM undef; see 
https://github.com/llvm/llvm-project/blob/d2012d965d60c3258b3a69d024491698f8aec386/clang/lib/CodeGen/CGBuiltin.cpp#L12483
 .  Maybe we should make the cast intrinsics more conservative to match.  And 
maybe we should do the same for OpenCL.  Would need to do some backend work to 
make sure we don't regress the generated code, though.

For __builtin_shufflevector, I think I'm fine with this changing the "-1" to 
mean poison; we don't have any external spec to conform to, and anyone 
explicitly passing -1 should know what they're doing.  But maybe worth noting 
in the clang documentation.




Comment at: clang/test/CodeGen/X86/avx-builtins.c:182
   // CHECK-LABEL: test_mm256_castsi128_si256
-  // CHECK: shufflevector <2 x i64> %{{.*}}, <2 x i64> %{{.*}}, <4 x i32> 
+  // CHECK: shufflevector <2 x i64> %{{.*}}, <2 x i64> %{{.*}}, <4 x i32> 
   return _mm256_castsi128_si256(A);

This change might be visible to user code.



Comment at: clang/test/CodeGen/builtinshufflevector2.c:41
 // CHECK: store <4 x float> [[V]], <4 x float>* {{%.*}}
   *A = __builtin_shufflevector( x, y, 0, 4, -1, 5 );
 }

This might be visible to user code.



Comment at: clang/test/CodeGenOpenCL/preserve_vec3.cl:21
   // CHECK: %[[LOAD_A:.*]] = load <3 x float>, <3 x float> addrspace(1)* %a, 
align 16
-  // CHECK: %[[ASTYPE:.*]] = shufflevector <3 x float> %[[LOAD_A]], <3 x 
float> poison, <4 x i32> 
+  // CHECK: %[[ASTYPE:.*]] = shufflevector <3 x float> %[[LOAD_A]], <3 x 
float> poison, <4 x i32> 
   // CHECK: store <4 x float> %[[ASTYPE:.*]], <4 x float> addrspace(1)* %b, 
align 16

This change might be visible to user code.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2021-06-08 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune added a comment.

In D103874#2806519 , @efriedma wrote:

> I noted the cases where it looks like the undef->poison change might actually 
> impact code using compiler intrinsic functions that have external 
> specifications.  The relevant specifications say the elements in question are 
> "undefined", without really specifying what that means.
>
> Currently, for the Intel intrinsics, we treat "undefined" as something more 
> conservative than LLVM undef; see 
> https://github.com/llvm/llvm-project/blob/d2012d965d60c3258b3a69d024491698f8aec386/clang/lib/CodeGen/CGBuiltin.cpp#L12483
>  .  Maybe we should make the cast intrinsics more conservative to match.  And 
> maybe we should do the same for OpenCL.  Would need to do some backend work 
> to make sure we don't regress the generated code, though.

Makes sense, the PR (https://llvm.org/PR32176) that is left at the comment says 
it should be something like `freeze poison` as well. (BTW, this means the 
current shufflevector lowering is already incorrect as well..)

Then, _mm256_castsi128_si256 should be lowered into something like this:

  %fr = freeze <2 x i64> poison
  shufflevector <2 x i64> %x, <2 x i64> %fr, <4 x i32> 

BTW, the Intel intrinsic guide for _mm256_castsi128_si256 ( 
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_castsi128_si256&expand=628
 ) says:

  This intrinsic is only used for compilation and does not generate any 
instructions, thus it has zero latency.

We should teach the backend to understand the shufflevector+freeze form and 
lower it into an efficient assembly.

But, in practice, the frozen element won't be used in most of the cases; the 
middle-end's demanded elements analysis will trigger instcombine to almost 
always remove the freeze.
What do you think? If people agree, I'll create a separate patch that lowers 
this to the new freeze+shufflevector format (since it is already incorrect).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2021-06-08 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

> In practice, the frozen element won't be used in most of the cases; the 
> middle-end's demanded elements analysis will trigger instcombine to almost 
> always remove the freeze.

Well, in the cases it gets removed, it doesn't really matter what we use.  It's 
likely if someone is reaching for these more obscure constructs, they're seeing 
that the compiler isn't doing what they want with more normal code, though.

> What do you think? If people agree with the shufflevector+freeze lowering, 
> I'll create a separate patch that lowers this to the new freeze+shufflevector 
> format (since it is already incorrect).

Using "freeze poison" seems reasonable.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2022-06-19 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune added a comment.
Herald added subscribers: jsji, kosarev.
Herald added a project: All.

It seems llvm/lib/Target/X86/X86ISelLowering.cpp's LowerAVXCONCAT_VECTORS is 
relevant to efficient lowering of `shufflevector %x, freeze(poison), mask`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2022-06-27 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune added a comment.

In D103874#3594617 , @aqjune wrote:

> It seems llvm/lib/Target/X86/X86ISelLowering.cpp's LowerAVXCONCAT_VECTORS is 
> relevant to efficient lowering of `shufflevector %x, freeze(poison), mask`.

After patching `LowerAVXCONCAT_VECTORS`, lowering 
https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/X86/avx-intrinsics-fast-isel.ll#L257
 generates:

  vblendps  $15, %ymm0, %ymm0, %ymm0

To make it fully no-op, tablegen files must be edited. I gave it a try, `.td` 
compiled successfully, but weirdly - perhaps due to either incorrect use of 
tablegen's pattern matcher or some hidden rule that I didn't address - 
`vblendps` is still there. The written patch is as follows:
https://github.com/aqjune/llvm-project/commit/b4393e36b33ca08ce77ae662479ceaf9a76eab8b

One of relevant, edited parts:

  // llvm/lib/Target/X86/X86InstrVecCompiler.td
def : Pat<(VT (insert_subvector undef, subRC:$src, (iPTR 0))),
  (VT (INSERT_SUBREG (IMPLICIT_DEF), subRC:$src, subIdx))>;
  +
  +  def : Pat<(VT (insert_subvector (freeze undef), subRC:$src, (iPTR 0))),
  +(VT (INSERT_SUBREG (IMPLICIT_DEF), subRC:$src, subIdx))>;

I spent some time but couldn't figure out why it does not work.
Can someone tell me whether the pattern matching is being correctly used? Any 
help is appreciated.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2022-06-27 Thread Nikita Popov via Phabricator via cfe-commits
nikic added a comment.

Which intrinsic are you working on here? If this is about the mm_undefined 
intrinsics, why do we need to change those from the current status quo of using 
a zero value instead of undef?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2022-06-27 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune added a comment.

In D103874#3611483 , @nikic wrote:

> Which intrinsic are you working on here? If this is about the mm_undefined 
> intrinsics, why do we need to change those from the current status quo of 
> using a zero value instead of undef?

It is about the `mm256_castpd128_pd256` intrinsic and its friends 
(clang/test/CodeGen/X86/avx-builtins.c, line 146).
It was previously using `shufflevector` with undef masks - since the results 
are poison, an alternative pattern as below is necessary to represent the 
intrinsic:

  %a1 = freeze <2 x double> poison
  %res = shufflevector <2 x double> %a0, <2 x double> %a1, <4 x i32> 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2022-06-27 Thread Nikita Popov via Phabricator via cfe-commits
nikic added a comment.

In D103874#3611507 , @aqjune wrote:

> In D103874#3611483 , @nikic wrote:
>
>> Which intrinsic are you working on here? If this is about the mm_undefined 
>> intrinsics, why do we need to change those from the current status quo of 
>> using a zero value instead of undef?
>
> It is about the `mm256_castpd128_pd256` intrinsic and its friends 
> (clang/test/CodeGen/X86/avx-builtins.c, line 146).
> It was previously using `shufflevector` with undef masks - since the results 
> are poison, an alternative pattern as below is necessary to represent the 
> intrinsic:
>
>   %a1 = freeze <2 x double> poison
>   %res = shufflevector <2 x double> %a0, <2 x double> %a1, <4 x i32>  i32 1, i32 2, i32 3>

How sure are we that we cannot simply use `poison` elements here? I checked 
what the Intel compiler guide has to say on the topic, and it uses the 
following wording. For "undefined" style intrinsics:

> This intrinsic returns a vector of eight single precision floating point 
> elements. The content of the vector is not specified.

For "cast" style intrinsics:

> The lower 128-bits of the 256-bit resulting vector contains the source vector 
> values; the upper 128-bits of the resulting vector are undefined. This 
> intrinsic does not introduce extra moves to the generated code

It's not really clear what "undefined" is supposed to mean here (and how it 
differs from "not specified").

Unless we're aware of a specific problems in this area, I think it's okay to 
start out with just doing the undef -> poison replacement, and possibly 
backtrack if there are real-world assumptions about the specific meaning of 
"undefined" in this context.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2022-06-27 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune added a comment.

PR31524 (https://github.com/llvm/llvm-project/issues/31524) discusses about the 
lowering of such intrinsics.
According to the PR, it seems users consider undefined intrinsics as returning 
unknown but consistent bits.

If it works, my updates in the draft 
(https://github.com/aqjune/llvm-project/commit/b4393e36b33ca08ce77ae662479ceaf9a76eab8b)
 will improve backends dealing with freeze(poison), which will help lowering 
`_mm256_cast*` with zero copy.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2022-08-10 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune added inline comments.



Comment at: clang/test/CodeGen/X86/avx-builtins.c:182
   // CHECK-LABEL: test_mm256_castsi128_si256
-  // CHECK: shufflevector <2 x i64> %{{.*}}, <2 x i64> %{{.*}}, <4 x i32> 
+  // CHECK: shufflevector <2 x i64> %{{.*}}, <2 x i64> %{{.*}}, <4 x i32> 
   return _mm256_castsi128_si256(A);

aqjune wrote:
> RKSimon wrote:
> > efriedma wrote:
> > > This change might be visible to user code.
> > Yes the length changing casts are worrying me as well - we could update the 
> > header to insert zero into the upper elements I suppose, in many cases 
> > these would be folded away by AVX ops implicitly zeroing the 128-bits. But 
> > we'd definitely have the potential for regressions.
> I quickly skimmed through the headers in clang/lib/Headers and listed the 
> functions calling `__builtin_shufflevector` with at least one -1 mask operand.
> It seems there aren't very many, which is good news; I found 17 functions 
> only ({F19257445}).
> 
> But, correctly fixing these headers seems to require a lot of work.
> Since using the zero vector can cause performance regressions, we need to use 
> a frozen poison (undef) vector to encode a vector having unspecified bits.
> A few months ago, I created D104790 to start using freeze(vector poison) for 
> `mm*_undefined*` intrinsics. However, teaching the existing codebase to 
> successfully deal with the frozen poison vector was a pretty tough job.
> When it comes to fixing the headers, there is even no C intrinsic function 
> that represents a frozen poison vector AFAIK.
> 
> I'll appreciate any idea or help in addressing this issue. :/
Okay, D130339 has finally been merged.

I will make a patch that updates the `mm256_castsi128_si256` and its family 
functions to emit shufflevector with freeze(poison) operand.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2021-09-26 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune updated this revision to Diff 375094.
aqjune added a comment.
Herald added a subscriber: arphaman.

Rebase


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

Files:
  clang/test/CodeGen/X86/avx-builtins.c
  clang/test/CodeGen/X86/avx512f-builtins.c
  clang/test/CodeGen/builtins-ppc-p10vector.c
  clang/test/CodeGen/builtins-ppc-p9vector.c
  clang/test/CodeGen/builtinshufflevector2.c
  clang/test/CodeGen/ext-vector.c
  clang/test/CodeGenOpenCL/as_type.cl
  clang/test/CodeGenOpenCL/partial_initializer.cl
  clang/test/CodeGenOpenCL/preserve_vec3.cl
  clang/test/CodeGenOpenCL/vector_literals.cl
  clang/test/CodeGenOpenCL/vector_shufflevector.cl
  llvm/include/llvm-c/Core.h
  llvm/include/llvm/IR/Instructions.h
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/lib/Analysis/ValueTracking.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ConstantFold.cpp
  llvm/lib/IR/Core.cpp
  llvm/lib/IR/Instructions.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
  llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
  llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
  llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/lib/Transforms/Vectorize/VectorCombine.cpp
  llvm/test/Analysis/CostModel/AMDGPU/shufflevector.ll
  llvm/test/Analysis/CostModel/X86/reduction.ll
  llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector.ll
  llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector.ll
  llvm/test/Analysis/CostModel/X86/shuffle-single-src.ll
  llvm/test/CodeGen/AMDGPU/rewrite-out-arguments.ll
  llvm/test/CodeGen/Generic/expand-experimental-reductions.ll
  llvm/test/CodeGen/PowerPC/arg_promotion.ll
  llvm/test/Transforms/CodeGenPrepare/X86/x86-shuffle-sink-inseltpoison.ll
  llvm/test/Transforms/CodeGenPrepare/X86/x86-shuffle-sink.ll
  llvm/test/Transforms/DeadStoreElimination/masked-dead-store-inseltpoison.ll
  llvm/test/Transforms/DeadStoreElimination/masked-dead-store.ll
  
llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts-inseltpoison.ll
  llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll
  llvm/test/Transforms/InstCombine/X86/x86-addsub-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-addsub.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx2-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx2.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx512-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx512.ll
  llvm/test/Transforms/InstCombine/X86/x86-muldq-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-muldq.ll
  llvm/test/Transforms/InstCombine/X86/x86-pack-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-pack.ll
  llvm/test/Transforms/InstCombine/X86/x86-pshufb-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-pshufb.ll
  llvm/test/Transforms/InstCombine/X86/x86-sse4a-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-sse4a.ll
  llvm/test/Transforms/InstCombine/X86/x86-vpermil-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-vpermil.ll
  llvm/test/Transforms/InstCombine/broadcast-inseltpoison.ll
  llvm/test/Transforms/InstCombine/broadcast.ll
  llvm/test/Transforms/InstCombine/bswap-inseltpoison.ll
  llvm/test/Transforms/InstCombine/bswap.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll
  llvm/test/Transforms/InstCombine/extractelement-inseltpoison.ll
  llvm/test/Transforms/InstCombine/extractelement.ll
  llvm/test/Transforms/InstCombine/insert-const-shuf.ll
  llvm/test/Transforms/InstCombine/insert-extract-shuffle-inseltpoison.ll
  llvm/test/Transforms/InstCombine/insert-extract-shuffle.ll
  llvm/test/Transforms/InstCombine/masked_intrinsics-inseltpoison.ll
  llvm/test/Transforms/InstCombine/masked_intrinsics.ll
  llvm/test/Transforms/InstCombine/nsw-inseltpoison.ll
  llvm/test/Transforms/InstCombine/nsw.ll
  llvm/test/Transforms/InstCombine/reduction-shufflevector.ll
  llvm/test/Transforms/InstCombine/select-extractelement-inseltpoison.ll
  llvm/test/Transforms/InstCombine/select-extractelement.ll
  llvm/test/Transforms/InstCombine/select-select.ll
  llvm/test/Transforms/InstCombine/shuffle-select-narrow-inseltpoison.ll
  llvm/test/Transforms/InstCombine/shuffle-select-narrow.ll
  llvm/test/Transforms/InstCombine/shuffle_select-inseltpoison.ll
  llvm/test/Transforms/InstCombine/shuffle_select.ll
  llvm/test/Transforms/InstCombine/shufflevec-bitcast-inseltpoison.ll
  llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll
  llvm/test/Transforms/InstCombine/shufflevector-div-rem-inseltpoison.ll
  llvm/test/Transforms/InstCombine/shufflevector-div-rem.ll
  llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll
  llvm/test/Transforms/InstCombine/sub-of-negatible.ll
  llvm/test/Transform

[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2021-09-26 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune updated this revision to Diff 375095.
aqjune added a comment.

Resurrect mistakenly removed test statements


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

Files:
  clang/test/CodeGen/X86/avx-builtins.c
  clang/test/CodeGen/X86/avx512f-builtins.c
  clang/test/CodeGen/builtins-ppc-p10vector.c
  clang/test/CodeGen/builtins-ppc-p9vector.c
  clang/test/CodeGen/builtinshufflevector2.c
  clang/test/CodeGen/ext-vector.c
  clang/test/CodeGenOpenCL/as_type.cl
  clang/test/CodeGenOpenCL/partial_initializer.cl
  clang/test/CodeGenOpenCL/preserve_vec3.cl
  clang/test/CodeGenOpenCL/vector_literals.cl
  clang/test/CodeGenOpenCL/vector_shufflevector.cl
  llvm/include/llvm-c/Core.h
  llvm/include/llvm/IR/Instructions.h
  llvm/lib/Analysis/InstructionSimplify.cpp
  llvm/lib/Analysis/ValueTracking.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ConstantFold.cpp
  llvm/lib/IR/Core.cpp
  llvm/lib/IR/Instructions.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
  llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
  llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
  llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/lib/Transforms/Vectorize/VectorCombine.cpp
  llvm/test/Analysis/CostModel/AMDGPU/shufflevector.ll
  llvm/test/Analysis/CostModel/X86/reduction.ll
  llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector.ll
  llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector.ll
  llvm/test/Analysis/CostModel/X86/shuffle-single-src.ll
  llvm/test/CodeGen/AMDGPU/rewrite-out-arguments.ll
  llvm/test/CodeGen/Generic/expand-experimental-reductions.ll
  llvm/test/CodeGen/PowerPC/arg_promotion.ll
  llvm/test/Transforms/CodeGenPrepare/X86/x86-shuffle-sink-inseltpoison.ll
  llvm/test/Transforms/CodeGenPrepare/X86/x86-shuffle-sink.ll
  llvm/test/Transforms/DeadStoreElimination/masked-dead-store-inseltpoison.ll
  llvm/test/Transforms/DeadStoreElimination/masked-dead-store.ll
  
llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts-inseltpoison.ll
  llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll
  llvm/test/Transforms/InstCombine/X86/x86-addsub-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-addsub.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx2-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx2.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx512-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-avx512.ll
  llvm/test/Transforms/InstCombine/X86/x86-muldq-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-muldq.ll
  llvm/test/Transforms/InstCombine/X86/x86-pack-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-pack.ll
  llvm/test/Transforms/InstCombine/X86/x86-pshufb-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-pshufb.ll
  llvm/test/Transforms/InstCombine/X86/x86-sse4a-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-sse4a.ll
  llvm/test/Transforms/InstCombine/X86/x86-vpermil-inseltpoison.ll
  llvm/test/Transforms/InstCombine/X86/x86-vpermil.ll
  llvm/test/Transforms/InstCombine/broadcast-inseltpoison.ll
  llvm/test/Transforms/InstCombine/broadcast.ll
  llvm/test/Transforms/InstCombine/bswap-inseltpoison.ll
  llvm/test/Transforms/InstCombine/bswap.ll
  llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll
  llvm/test/Transforms/InstCombine/extractelement-inseltpoison.ll
  llvm/test/Transforms/InstCombine/extractelement.ll
  llvm/test/Transforms/InstCombine/insert-const-shuf.ll
  llvm/test/Transforms/InstCombine/insert-extract-shuffle-inseltpoison.ll
  llvm/test/Transforms/InstCombine/insert-extract-shuffle.ll
  llvm/test/Transforms/InstCombine/masked_intrinsics-inseltpoison.ll
  llvm/test/Transforms/InstCombine/masked_intrinsics.ll
  llvm/test/Transforms/InstCombine/nsw-inseltpoison.ll
  llvm/test/Transforms/InstCombine/nsw.ll
  llvm/test/Transforms/InstCombine/reduction-shufflevector.ll
  llvm/test/Transforms/InstCombine/select-extractelement-inseltpoison.ll
  llvm/test/Transforms/InstCombine/select-extractelement.ll
  llvm/test/Transforms/InstCombine/select-select.ll
  llvm/test/Transforms/InstCombine/shuffle-select-narrow-inseltpoison.ll
  llvm/test/Transforms/InstCombine/shuffle-select-narrow.ll
  llvm/test/Transforms/InstCombine/shuffle_select-inseltpoison.ll
  llvm/test/Transforms/InstCombine/shuffle_select.ll
  llvm/test/Transforms/InstCombine/shufflevec-bitcast-inseltpoison.ll
  llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll
  llvm/test/Transforms/InstCombine/shufflevector-div-rem-inseltpoison.ll
  llvm/test/Transforms/InstCombine/shufflevector-div-rem.ll
  llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll
  llvm/test/Transforms/InstCombine/sub-of-negatible.ll
  llvm/test/Transfor

[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2021-09-26 Thread Juneyoung Lee via Phabricator via cfe-commits
aqjune marked an inline comment as done.
aqjune added inline comments.



Comment at: clang/test/CodeGen/X86/avx-builtins.c:182
   // CHECK-LABEL: test_mm256_castsi128_si256
-  // CHECK: shufflevector <2 x i64> %{{.*}}, <2 x i64> %{{.*}}, <4 x i32> 
+  // CHECK: shufflevector <2 x i64> %{{.*}}, <2 x i64> %{{.*}}, <4 x i32> 
   return _mm256_castsi128_si256(A);

RKSimon wrote:
> efriedma wrote:
> > This change might be visible to user code.
> Yes the length changing casts are worrying me as well - we could update the 
> header to insert zero into the upper elements I suppose, in many cases these 
> would be folded away by AVX ops implicitly zeroing the 128-bits. But we'd 
> definitely have the potential for regressions.
I quickly skimmed through the headers in clang/lib/Headers and listed the 
functions calling `__builtin_shufflevector` with at least one -1 mask operand.
It seems there aren't very many, which is good news; I found 17 functions only 
({F19257445}).

But, correctly fixing these headers seems to require a lot of work.
Since using the zero vector can cause performance regressions, we need to use a 
frozen poison (undef) vector to encode a vector having unspecified bits.
A few months ago, I created D104790 to start using freeze(vector poison) for 
`mm*_undefined*` intrinsics. However, teaching the existing codebase to 
successfully deal with the frozen poison vector was a pretty tough job.
When it comes to fixing the headers, there is even no C intrinsic function that 
represents a frozen poison vector AFAIK.

I'll appreciate any idea or help in addressing this issue. :/


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103874: [IR] Rename the shufflevector's undef mask to poison

2021-09-20 Thread Simon Pilgrim via Phabricator via cfe-commits
RKSimon added inline comments.



Comment at: clang/test/CodeGen/X86/avx-builtins.c:182
   // CHECK-LABEL: test_mm256_castsi128_si256
-  // CHECK: shufflevector <2 x i64> %{{.*}}, <2 x i64> %{{.*}}, <4 x i32> 
+  // CHECK: shufflevector <2 x i64> %{{.*}}, <2 x i64> %{{.*}}, <4 x i32> 
   return _mm256_castsi128_si256(A);

efriedma wrote:
> This change might be visible to user code.
Yes the length changing casts are worrying me as well - we could update the 
header to insert zero into the upper elements I suppose, in many cases these 
would be folded away by AVX ops implicitly zeroing the 128-bits. But we'd 
definitely have the potential for regressions.



Comment at: clang/test/CodeGen/X86/avx-builtins.c:1239
   // CHECK: shufflevector <4 x float> %{{.*}}, <4 x float> poison, <8 x i32> 

   // CHECK: shufflevector <8 x float> %{{.*}}, <8 x float> %{{.*}}, <8 x i32> 

   return _mm256_loadu2_m128(A, B);

These look out of date - D109497 changes the loadu2 codegen to be a single 
'concat' shuffle.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103874/new/

https://reviews.llvm.org/D103874

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits