[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-16 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

In D110257#3133934 , @JonChesterfield 
wrote:

> In D110257#3133895 , @hsmhsm wrote:
>
>> In D110257#3133879 , 
>> @JonChesterfield wrote:
>>
>>> In D110257#3133866 , @hsmhsm 
>>> wrote:
>>>
 This is not something specific to AMDGPU backend, but  AMDGPU backend at 
 present requires this canonical form.
>>>
>>> Undocumented and not checked by the IR verifier. Canonical form seems to be 
>>> overstating it until at least one of those is addressed.
>>
>> We already discussed that this canonical form is not something that IR 
>> verifier can verify, but it is good enough for better code 
>> transformation/optimization. Please refer llvm-dev email discussion w.r.t it.
>
> If the new invariant is that all alloca must be adjacent to one another, 
> that's a trivial thing for the verifier to check. So I guess it's something 
> else? Please write down what this new invariant is intended to be, preferably 
> in the documentation, perhaps of the alloca instruction.

Please check with llvm-dev.

> What llvm-dev discussion do you refer to?

I do not remember,  please search for keywords, like static allocas, and figure 
out.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-16 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

In D110257#3133879 , @JonChesterfield 
wrote:

> In D110257#3133866 , @hsmhsm wrote:
>
>> This is not something specific to AMDGPU backend, but  AMDGPU backend at 
>> present requires this canonical form.
>
> Undocumented and not checked by the IR verifier. Canonical form seems to be 
> overstating it until at least one of those is addressed.

We already discussed that this canonical form is not something that IR verifier 
can verify, but it is good enough for better code transformation/optimization. 
Please refer llvm-dev email discussion w.r.t it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-16 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

In D110257#3133838 , @JonChesterfield 
wrote:

> Please change the commit message to say why this change is necessary / an 
> improvement on what we have now.
>
> My recollection is that the amdgpu backend crashes on some IR and this 
> decreases the probability of that IR pattern occuring, which still sounds 
> like fixing the wrong place to me. Was this the one where hoisting static 
> size alloca into the entry block near the backend would the problem?
>
> I think this patch is missing a documentation update adding the new 
> constraint that allocas must be contiguous in IR. That would help to answer 
> questions about which alloca must be contiguous and which can occur separated 
> by instructions, as currently none of them need to be adjacent. Also, is this 
> only intended to constrain the entry basic block?

The current commit message reflect semantics of the patch - there is nothing 
required to change here. The goal here is to make sure that FE keeps all the 
static allocas as one cluster at the start of the entry block, which is a good 
canonical form from the perspective of better code transformation/optimization.

This is not something specific to AMDGPU backend, but  AMDGPU backend at 
present requires this canonical form.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-09 Thread Mahesha S via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG3b9a85d10ac7: [CFE][Codegen] Make sure to maintain the 
contiguity of all the static allocas (authored by hsmhsm).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,7 +12,9 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
 #pragma omp target
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,34 +1,33 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-09 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

Thanks @rnk and @yaxunl .


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-03 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

This patch is waiting for an action for a long time. I am expecting at-least 
anyone of the reviewers to either say "yes" or "no" or "further required 
improvements" on this patch so that I can take further action on this patch.  
If you say "no" to this patch with a strong valid reason, then I will abandon 
this patch and move-on instead of uncertainly waiting.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-03 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 384661.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,7 +12,9 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
 #pragma omp target
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,34 +1,33 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[STR_ASCAS

[PATCH] D112791: [IR] Merge createReplacementInstr into ConstantExpr::getAsInstruction

2021-10-29 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm accepted this revision.
hsmhsm added a comment.
This revision is now accepted and ready to land.

Thanks for this clean-up patch. Looks good to me. However, please wait for some 
time if in case other reviewers have any comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112791/new/

https://reviews.llvm.org/D112791

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-10-28 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

@rjmccall I assume, I have fixed all your review comments. In case, if I have 
missed something OR if you think, few more changes are required for the patch, 
please do let me know so that I will proceed as per the comments/suggestions. I 
would like to bring this patch to the closure.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-10-21 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-10-20 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 380871.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,7 +12,9 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
 #pragma omp target
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,34 +1,33 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[STR_ASCAS

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-10-18 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 380320.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,7 +12,9 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
 #pragma omp target
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,34 +1,33 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[STR_ASCAS

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-10-14 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 379918.
hsmhsm marked 2 inline comments as done.
hsmhsm added a comment.

Rebase + minor clean-up to patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,7 +12,9 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
 #pragma omp target
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,34 +1,33 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-10-14 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm marked 2 inline comments as done.
hsmhsm added inline comments.



Comment at: clang/lib/CodeGen/CGExpr.cpp:115
+if (AllocaInsertedAtAllocaInsertPt)
+  AllocaAddrSpaceInsertPt = dyn_cast(V)->getIterator();
   }

arichardson wrote:
> Shouldn't this use `cast` instead?
You are right. But, in the updated patch, this code does not exist anymore.



Comment at: clang/lib/CodeGen/CodeGenFunction.h:392
+  /// order immediately after all static allocas.
+  llvm::BasicBlock::iterator AllocaAddrSpaceInsertPt;
+

rjmccall wrote:
> Please call this something like `PostAllocaInsertPt`.  Instead of eagerly 
> creating it, please create it lazily: add an accessor like 
> `getPostAllocaInsertPoint()` which creates (and saves here) an instruction 
> that immediately follows `AllocaInsertPt` if this is currently null.  I think 
> you should make your own instruction so that we don't mess up any code that 
> might rely on temporarily creating and then removing dead instructions.
> 
> Please use an `llvm::AssertingVH`.  I think that can 
> handle holding a null value.
> 
> The comment here should be more general, like "a place in the prologue where 
> code can be inserted that will be dominated by all the static allocas."  You 
> don't need to talk about  `addrspacecast`s specifically; that's just one 
> possible use case for this.
> 
> Also, it's either "`addrspacecast`" (using the IR name of the operation) or 
> "address space cast" (writing out the operation in normal English words); 
> don't run together `addressspace` as if it were a keyword when it isn't.
Thanks. Fixed the above review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-10-14 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 379682.
hsmhsm added a comment.

Fix review comments by @rjmccall.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,7 +12,9 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
 #pragma omp target
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,34 +1,33 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// 

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-10-11 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378875.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,7 +12,9 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
 #pragma omp target
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,34 +1,33 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[STR_ASCAS

[PATCH] D111324: [CFE][Codegen] Remove CodeGenFunction::InitTempAlloca()

2021-10-11 Thread Mahesha S via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGdb9c2d775130: [CFE][Codegen] Remove 
CodeGenFunction::InitTempAlloca() (authored by hsmhsm).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111324/new/

https://reviews.llvm.org/D111324

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.h


Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -2545,15 +2545,6 @@
   Address CreateDefaultAlignTempAlloca(llvm::Type *Ty,
const Twine &Name = "tmp");
 
-  /// InitTempAlloca - Provide an initial value for the given alloca which
-  /// will be observable at all locations in the function.
-  ///
-  /// The address should be something that was returned from one of
-  /// the CreateTempAlloca or CreateMemTemp routines, and the
-  /// initializer must be valid in the entry block (i.e. it must
-  /// either be a constant or an argument value).
-  void InitTempAlloca(Address Alloca, llvm::Value *Value);
-
   /// CreateIRTemp - Create a temporary IR object of the given type, with
   /// appropriate alignment. This routine should only be used when an temporary
   /// value needs to be stored into an alloca (for example, to avoid explicit
Index: clang/lib/CodeGen/CGExpr.cpp
===
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -127,19 +127,6 @@
   return CreateTempAlloca(Ty, Align, Name);
 }
 
-void CodeGenFunction::InitTempAlloca(Address Var, llvm::Value *Init) {
-  auto *Alloca = Var.getPointer();
-  assert(isa(Alloca) ||
- (isa(Alloca) &&
-  isa(
-  cast(Alloca)->getPointerOperand(;
-
-  auto *Store = new llvm::StoreInst(Init, Alloca, /*volatile*/ false,
-Var.getAlignment().getAsAlign());
-  llvm::BasicBlock *Block = AllocaInsertPt->getParent();
-  Block->getInstList().insertAfter(AllocaInsertPt->getIterator(), Store);
-}
-
 Address CodeGenFunction::CreateIRTemp(QualType Ty, const Twine &Name) {
   CharUnits Align = getContext().getTypeAlignInChars(Ty);
   return CreateTempAlloca(ConvertType(Ty), Align, Name);


Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -2545,15 +2545,6 @@
   Address CreateDefaultAlignTempAlloca(llvm::Type *Ty,
const Twine &Name = "tmp");
 
-  /// InitTempAlloca - Provide an initial value for the given alloca which
-  /// will be observable at all locations in the function.
-  ///
-  /// The address should be something that was returned from one of
-  /// the CreateTempAlloca or CreateMemTemp routines, and the
-  /// initializer must be valid in the entry block (i.e. it must
-  /// either be a constant or an argument value).
-  void InitTempAlloca(Address Alloca, llvm::Value *Value);
-
   /// CreateIRTemp - Create a temporary IR object of the given type, with
   /// appropriate alignment. This routine should only be used when an temporary
   /// value needs to be stored into an alloca (for example, to avoid explicit
Index: clang/lib/CodeGen/CGExpr.cpp
===
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -127,19 +127,6 @@
   return CreateTempAlloca(Ty, Align, Name);
 }
 
-void CodeGenFunction::InitTempAlloca(Address Var, llvm::Value *Init) {
-  auto *Alloca = Var.getPointer();
-  assert(isa(Alloca) ||
- (isa(Alloca) &&
-  isa(
-  cast(Alloca)->getPointerOperand(;
-
-  auto *Store = new llvm::StoreInst(Init, Alloca, /*volatile*/ false,
-Var.getAlignment().getAsAlign());
-  llvm::BasicBlock *Block = AllocaInsertPt->getParent();
-  Block->getInstList().insertAfter(AllocaInsertPt->getIterator(), Store);
-}
-
 Address CodeGenFunction::CreateIRTemp(QualType Ty, const Twine &Name) {
   CharUnits Align = getContext().getTypeAlignInChars(Ty);
   return CreateTempAlloca(ConvertType(Ty), Align, Name);
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111316: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-11 Thread Mahesha S via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGf7de6962c85b: [CFE][Codegen][In-progress] Remove 
CodeGenFunction::InitTempAlloca() (authored by hsmhsm).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111316/new/

https://reviews.llvm.org/D111316

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-10-09 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378424.
hsmhsm added a comment.

Introduce a new instruction pointer which aid all the addressspace casts of 
static allocas
to appear in the source order immediately after all static allocas.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,7 +12,9 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
 #pragma omp target
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,34 +1,33 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(

[PATCH] D111324: [CFE][Codegen] Remove CodeGenFunction::InitTempAlloca()

2021-10-08 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378413.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111324/new/

https://reviews.llvm.org/D111324

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.h


Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -2545,15 +2545,6 @@
   Address CreateDefaultAlignTempAlloca(llvm::Type *Ty,
const Twine &Name = "tmp");
 
-  /// InitTempAlloca - Provide an initial value for the given alloca which
-  /// will be observable at all locations in the function.
-  ///
-  /// The address should be something that was returned from one of
-  /// the CreateTempAlloca or CreateMemTemp routines, and the
-  /// initializer must be valid in the entry block (i.e. it must
-  /// either be a constant or an argument value).
-  void InitTempAlloca(Address Alloca, llvm::Value *Value);
-
   /// CreateIRTemp - Create a temporary IR object of the given type, with
   /// appropriate alignment. This routine should only be used when an temporary
   /// value needs to be stored into an alloca (for example, to avoid explicit
Index: clang/lib/CodeGen/CGExpr.cpp
===
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -127,19 +127,6 @@
   return CreateTempAlloca(Ty, Align, Name);
 }
 
-void CodeGenFunction::InitTempAlloca(Address Var, llvm::Value *Init) {
-  auto *Alloca = Var.getPointer();
-  assert(isa(Alloca) ||
- (isa(Alloca) &&
-  isa(
-  cast(Alloca)->getPointerOperand(;
-
-  auto *Store = new llvm::StoreInst(Init, Alloca, /*volatile*/ false,
-Var.getAlignment().getAsAlign());
-  llvm::BasicBlock *Block = AllocaInsertPt->getParent();
-  Block->getInstList().insertAfter(AllocaInsertPt->getIterator(), Store);
-}
-
 Address CodeGenFunction::CreateIRTemp(QualType Ty, const Twine &Name) {
   CharUnits Align = getContext().getTypeAlignInChars(Ty);
   return CreateTempAlloca(ConvertType(Ty), Align, Name);


Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -2545,15 +2545,6 @@
   Address CreateDefaultAlignTempAlloca(llvm::Type *Ty,
const Twine &Name = "tmp");
 
-  /// InitTempAlloca - Provide an initial value for the given alloca which
-  /// will be observable at all locations in the function.
-  ///
-  /// The address should be something that was returned from one of
-  /// the CreateTempAlloca or CreateMemTemp routines, and the
-  /// initializer must be valid in the entry block (i.e. it must
-  /// either be a constant or an argument value).
-  void InitTempAlloca(Address Alloca, llvm::Value *Value);
-
   /// CreateIRTemp - Create a temporary IR object of the given type, with
   /// appropriate alignment. This routine should only be used when an temporary
   /// value needs to be stored into an alloca (for example, to avoid explicit
Index: clang/lib/CodeGen/CGExpr.cpp
===
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -127,19 +127,6 @@
   return CreateTempAlloca(Ty, Align, Name);
 }
 
-void CodeGenFunction::InitTempAlloca(Address Var, llvm::Value *Init) {
-  auto *Alloca = Var.getPointer();
-  assert(isa(Alloca) ||
- (isa(Alloca) &&
-  isa(
-  cast(Alloca)->getPointerOperand(;
-
-  auto *Store = new llvm::StoreInst(Init, Alloca, /*volatile*/ false,
-Var.getAlignment().getAsAlign());
-  llvm::BasicBlock *Block = AllocaInsertPt->getParent();
-  Block->getInstList().insertAfter(AllocaInsertPt->getIterator(), Store);
-}
-
 Address CodeGenFunction::CreateIRTemp(QualType Ty, const Twine &Name) {
   CharUnits Align = getContext().getTypeAlignInChars(Ty);
   return CreateTempAlloca(ConvertType(Ty), Align, Name);
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111316: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-08 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378412.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111316/new/

https://reviews.llvm.org/D111316

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111293: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-08 Thread Mahesha S via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG04816829968c: [CFE][Codegen][In-progress] Remove 
CodeGenFunction::InitTempAlloca() (authored by hsmhsm).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111293/new/

https://reviews.llvm.org/D111293

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp


Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -981,7 +981,8 @@
   // precise source location of the checked return statement.
   if (requiresReturnValueCheck()) {
 ReturnLocation = CreateDefaultAlignTempAlloca(Int8PtrTy, 
"return.sloc.ptr");
-InitTempAlloca(ReturnLocation, llvm::ConstantPointerNull::get(Int8PtrTy));
+Builder.CreateStore(llvm::ConstantPointerNull::get(Int8PtrTy),
+ReturnLocation);
   }
 
   // Emit subprogram debug descriptor.


Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -981,7 +981,8 @@
   // precise source location of the checked return statement.
   if (requiresReturnValueCheck()) {
 ReturnLocation = CreateDefaultAlignTempAlloca(Int8PtrTy, "return.sloc.ptr");
-InitTempAlloca(ReturnLocation, llvm::ConstantPointerNull::get(Int8PtrTy));
+Builder.CreateStore(llvm::ConstantPointerNull::get(Int8PtrTy),
+ReturnLocation);
   }
 
   // Emit subprogram debug descriptor.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111293: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-08 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378408.
hsmhsm added a comment.

Remove newly added test CodeGenObjC/static-alloc-init.m


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111293/new/

https://reviews.llvm.org/D111293

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp


Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -981,7 +981,8 @@
   // precise source location of the checked return statement.
   if (requiresReturnValueCheck()) {
 ReturnLocation = CreateDefaultAlignTempAlloca(Int8PtrTy, 
"return.sloc.ptr");
-InitTempAlloca(ReturnLocation, llvm::ConstantPointerNull::get(Int8PtrTy));
+Builder.CreateStore(llvm::ConstantPointerNull::get(Int8PtrTy),
+ReturnLocation);
   }
 
   // Emit subprogram debug descriptor.


Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -981,7 +981,8 @@
   // precise source location of the checked return statement.
   if (requiresReturnValueCheck()) {
 ReturnLocation = CreateDefaultAlignTempAlloca(Int8PtrTy, "return.sloc.ptr");
-InitTempAlloca(ReturnLocation, llvm::ConstantPointerNull::get(Int8PtrTy));
+Builder.CreateStore(llvm::ConstantPointerNull::get(Int8PtrTy),
+ReturnLocation);
   }
 
   // Emit subprogram debug descriptor.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111293: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-08 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378157.
hsmhsm added a comment.

Rebase to latest trunk.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111293/new/

https://reviews.llvm.org/D111293

Files:
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/test/CodeGenObjC/static-alloc-init.m


Index: clang/test/CodeGenObjC/static-alloc-init.m
===
--- /dev/null
+++ clang/test/CodeGenObjC/static-alloc-init.m
@@ -0,0 +1,45 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-freebsd -S -emit-llvm 
-fobjc-runtime=gnustep-2.0 -o - %s | FileCheck %s
+
+typedef struct {
+  int x[12];
+} Big;
+
+@protocol P
+- (Big) foo;
+@end
+
+// CHECK-LABEL: define{{.*}} void @test(
+// CHECK:   %p.addr = alloca i8*, align 8
+// CHECK-NEXT:   %big = alloca %struct.Big, align 4
+// CHECK-NEXT:   %tmp = alloca i8*, align 8
+// CHECK-NEXT:   store i8* %p, i8** %p.addr, align 8
+// CHECK-NEXT:   br label %for.cond
+//
+// CHECK-LABEL: for.cond:
+// CHECK:%0 = load i8*, i8** %p.addr, align 8
+// CHECK-NEXT:   %1 = icmp eq i8* %0, null
+// CHECK-NEXT:   br i1 %1, label %nilReceiverCleanup, label %msgSend
+//
+// CHECK-LABEL: msgSend:
+// CHECK:store i8* %0, i8** %tmp, align 8
+// CHECK-NEXT:   %2 = call { i8*, i8*, i8*, i32, i8* (i8*, i8*, ...)* }* 
@objc_msg_lookup_sender(i8** %tmp, i8* bitcast ({ i8*, i8* }* 
@".objc_selector_foo_{?=[12i]}16\010:8" to i8*), i8* null)
+// CHECK-NEXT:   %3 = getelementptr inbounds { i8*, i8*, i8*, i32, i8* (i8*, 
i8*, ...)* }, { i8*, i8*, i8*, i32, i8* (i8*, i8*, ...)* }* %2, i32 0, i32 4
+// CHECK-NEXT:   %4 = load i8* (i8*, i8*, ...)*, i8* (i8*, i8*, ...)** %3, 
align 8
+// CHECK-NEXT:   %5 = load volatile i8*, i8** %tmp, align 8
+// CHECK-NEXT:   %6 = bitcast i8* (i8*, i8*, ...)* %4 to void (%struct.Big*, 
i8*, i8*)*
+// CHECK-NEXT:   call void %6(%struct.Big* sret(%struct.Big) align 4 %big, i8* 
%5, i8* bitcast ({ i8*, i8* }* @".objc_selector_foo_{?=[12i]}16\010:8" to i8*))
+// CHECK-NEXT:   br label %continue
+//
+// CHECK-LABEL: nilReceiverCleanup:
+// CHECK:%7 = bitcast %struct.Big* %big to i8*
+// CHECK-NEXT:   call void @llvm.memset.p0i8.i64(i8* align 4 %7, i8 0, i64 48, 
i1 false)
+// CHECK-NEXT:   br label %continue
+//
+// CHECK-LABEL: continue:
+// CHECK:br label %for.cond
+
+void test(id p) {
+ for (;;) {
+   Big big = [p foo];
+ }
+}
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -981,7 +981,8 @@
   // precise source location of the checked return statement.
   if (requiresReturnValueCheck()) {
 ReturnLocation = CreateDefaultAlignTempAlloca(Int8PtrTy, 
"return.sloc.ptr");
-InitTempAlloca(ReturnLocation, llvm::ConstantPointerNull::get(Int8PtrTy));
+Builder.CreateStore(llvm::ConstantPointerNull::get(Int8PtrTy),
+ReturnLocation);
   }
 
   // Emit subprogram debug descriptor.


Index: clang/test/CodeGenObjC/static-alloc-init.m
===
--- /dev/null
+++ clang/test/CodeGenObjC/static-alloc-init.m
@@ -0,0 +1,45 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-freebsd -S -emit-llvm -fobjc-runtime=gnustep-2.0 -o - %s | FileCheck %s
+
+typedef struct {
+  int x[12];
+} Big;
+
+@protocol P
+- (Big) foo;
+@end
+
+// CHECK-LABEL: define{{.*}} void @test(
+// CHECK:   %p.addr = alloca i8*, align 8
+// CHECK-NEXT:   %big = alloca %struct.Big, align 4
+// CHECK-NEXT:   %tmp = alloca i8*, align 8
+// CHECK-NEXT:   store i8* %p, i8** %p.addr, align 8
+// CHECK-NEXT:   br label %for.cond
+//
+// CHECK-LABEL: for.cond:
+// CHECK:%0 = load i8*, i8** %p.addr, align 8
+// CHECK-NEXT:   %1 = icmp eq i8* %0, null
+// CHECK-NEXT:   br i1 %1, label %nilReceiverCleanup, label %msgSend
+//
+// CHECK-LABEL: msgSend:
+// CHECK:store i8* %0, i8** %tmp, align 8
+// CHECK-NEXT:   %2 = call { i8*, i8*, i8*, i32, i8* (i8*, i8*, ...)* }* @objc_msg_lookup_sender(i8** %tmp, i8* bitcast ({ i8*, i8* }* @".objc_selector_foo_{?=[12i]}16\010:8" to i8*), i8* null)
+// CHECK-NEXT:   %3 = getelementptr inbounds { i8*, i8*, i8*, i32, i8* (i8*, i8*, ...)* }, { i8*, i8*, i8*, i32, i8* (i8*, i8*, ...)* }* %2, i32 0, i32 4
+// CHECK-NEXT:   %4 = load i8* (i8*, i8*, ...)*, i8* (i8*, i8*, ...)** %3, align 8
+// CHECK-NEXT:   %5 = load volatile i8*, i8** %tmp, align 8
+// CHECK-NEXT:   %6 = bitcast i8* (i8*, i8*, ...)* %4 to void (%struct.Big*, i8*, i8*)*
+// CHECK-NEXT:   call void %6(%struct.Big* sret(%struct.Big) align 4 %big, i8* %5, i8* bitcast ({ i8*, i8* }* @".objc_selector_foo_{?=[12i]}16\010:8" to i8*))
+// CHECK-NEXT:   br label %continue
+//
+// CHECK-LABEL: nilReceiverCleanup:
+// CHECK:%7 = bitcast %struct.Big* %big to i8*
+// CHECK-NEXT:   call void @llvm.memset.p0i8.i64(i8* align 4 %7, i8 0, i64 48, i1 false)
+// CHECK-NEXT:   b

[PATCH] D111324: [CFE][Codegen] Remove CodeGenFunction::InitTempAlloca()

2021-10-07 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378094.
hsmhsm added a comment.

Move OpenMP GPU offloading codegen related changes from here to 
https://reviews.llvm.org/D111324.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111324/new/

https://reviews.llvm.org/D111324

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.h


Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -2545,15 +2545,6 @@
   Address CreateDefaultAlignTempAlloca(llvm::Type *Ty,
const Twine &Name = "tmp");
 
-  /// InitTempAlloca - Provide an initial value for the given alloca which
-  /// will be observable at all locations in the function.
-  ///
-  /// The address should be something that was returned from one of
-  /// the CreateTempAlloca or CreateMemTemp routines, and the
-  /// initializer must be valid in the entry block (i.e. it must
-  /// either be a constant or an argument value).
-  void InitTempAlloca(Address Alloca, llvm::Value *Value);
-
   /// CreateIRTemp - Create a temporary IR object of the given type, with
   /// appropriate alignment. This routine should only be used when an temporary
   /// value needs to be stored into an alloca (for example, to avoid explicit
Index: clang/lib/CodeGen/CGExpr.cpp
===
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -127,19 +127,6 @@
   return CreateTempAlloca(Ty, Align, Name);
 }
 
-void CodeGenFunction::InitTempAlloca(Address Var, llvm::Value *Init) {
-  auto *Alloca = Var.getPointer();
-  assert(isa(Alloca) ||
- (isa(Alloca) &&
-  isa(
-  cast(Alloca)->getPointerOperand(;
-
-  auto *Store = new llvm::StoreInst(Init, Alloca, /*volatile*/ false,
-Var.getAlignment().getAsAlign());
-  llvm::BasicBlock *Block = AllocaInsertPt->getParent();
-  Block->getInstList().insertAfter(AllocaInsertPt->getIterator(), Store);
-}
-
 Address CodeGenFunction::CreateIRTemp(QualType Ty, const Twine &Name) {
   CharUnits Align = getContext().getTypeAlignInChars(Ty);
   return CreateTempAlloca(ConvertType(Ty), Align, Name);


Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -2545,15 +2545,6 @@
   Address CreateDefaultAlignTempAlloca(llvm::Type *Ty,
const Twine &Name = "tmp");
 
-  /// InitTempAlloca - Provide an initial value for the given alloca which
-  /// will be observable at all locations in the function.
-  ///
-  /// The address should be something that was returned from one of
-  /// the CreateTempAlloca or CreateMemTemp routines, and the
-  /// initializer must be valid in the entry block (i.e. it must
-  /// either be a constant or an argument value).
-  void InitTempAlloca(Address Alloca, llvm::Value *Value);
-
   /// CreateIRTemp - Create a temporary IR object of the given type, with
   /// appropriate alignment. This routine should only be used when an temporary
   /// value needs to be stored into an alloca (for example, to avoid explicit
Index: clang/lib/CodeGen/CGExpr.cpp
===
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -127,19 +127,6 @@
   return CreateTempAlloca(Ty, Align, Name);
 }
 
-void CodeGenFunction::InitTempAlloca(Address Var, llvm::Value *Init) {
-  auto *Alloca = Var.getPointer();
-  assert(isa(Alloca) ||
- (isa(Alloca) &&
-  isa(
-  cast(Alloca)->getPointerOperand(;
-
-  auto *Store = new llvm::StoreInst(Init, Alloca, /*volatile*/ false,
-Var.getAlignment().getAsAlign());
-  llvm::BasicBlock *Block = AllocaInsertPt->getParent();
-  Block->getInstList().insertAfter(AllocaInsertPt->getIterator(), Store);
-}
-
 Address CodeGenFunction::CreateIRTemp(QualType Ty, const Twine &Name) {
   CharUnits Align = getContext().getTypeAlignInChars(Ty);
   return CreateTempAlloca(ConvertType(Ty), Align, Name);
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111316: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-07 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378093.
hsmhsm added a comment.

Move OpenMP GPU offloading codegen related changes from 
https://reviews.llvm.org/D111324 to here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111316/new/

https://reviews.llvm.org/D111316

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111293: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-07 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 378092.
hsmhsm added a comment.

Add lit test - CodeGenObjC/static-alloc-init.m


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111293/new/

https://reviews.llvm.org/D111293

Files:
  clang/lib/CodeGen/CGObjCGNU.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/test/CodeGenObjC/static-alloc-init.m


Index: clang/test/CodeGenObjC/static-alloc-init.m
===
--- /dev/null
+++ clang/test/CodeGenObjC/static-alloc-init.m
@@ -0,0 +1,43 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-freebsd -S -emit-llvm 
-fobjc-runtime=gnustep-2.0 -o - %s | FileCheck %s
+
+typedef struct {
+  int x[12];
+} Big;
+
+@protocol P
+- (Big) foo;
+@end
+
+// CHECK-LABEL:define{{.*}} void @test(
+// CHECK:   %p.addr = alloca i8*, align 8
+// CHECK-NEXT:   %big = alloca %struct.Big, align 4
+// CHECK-NEXT:   %tmp = alloca i8*, align 8
+// CHECK-NEXT:   %null = alloca %struct.Big, align 4
+// CHECK-NEXT:   store i8* %p, i8** %p.addr, align 8
+// CHECK-NEXT:   br label %for.cond
+//
+// CHECK-LABEL: for.cond:
+// CHECK-NEXT:   %0 = load i8*, i8** %p.addr, align 8
+// CHECK-NEXT:   %1 = icmp eq i8* %0, null
+// CHECK-NEXT:   br i1 %1, label %continue, label %msgSend
+//
+// CHECK-LABEL: msgSend:
+// CHECK-NEXT:   store i8* %0, i8** %tmp, align 8
+// CHECK-NEXT:   %2 = call { i8*, i8*, i8*, i32, i8* (i8*, i8*, ...)* }* 
@objc_msg_lookup_sender(i8** %tmp, i8* bitcast ({ i8*, i8* }* 
@".objc_selector_foo_{?=[12i]}16\010:8" to i8*), i8* null)
+// CHECK-NEXT:   %3 = getelementptr inbounds { i8*, i8*, i8*, i32, i8* (i8*, 
i8*, ...)* }, { i8*, i8*, i8*, i32, i8* (i8*, i8*, ...)* }* %2, i32 0, i32 4
+// CHECK-NEXT:   %4 = load i8* (i8*, i8*, ...)*, i8* (i8*, i8*, ...)** %3, 
align 8
+// CHECK-NEXT:   %5 = load volatile i8*, i8** %tmp, align 8
+// CHECK-NEXT:   %6 = bitcast i8* (i8*, i8*, ...)* %4 to void (%struct.Big*, 
i8*, i8*)*
+// CHECK-NEXT:   call void %6(%struct.Big* sret(%struct.Big) align 4 %big, i8* 
%5, i8* bitcast ({ i8*, i8* }* @".objc_selector_foo_{?=[12i]}16\010:8" to i8*))
+// CHECK-NEXT:   br label %continue
+//
+// CHECK-LABEL: continue:
+// CHECK-NEXT:   %7 = phi %struct.Big* [ %big, %msgSend ], [ %null, %for.cond ]
+// CHECK-NEXT:   store %struct.Big zeroinitializer, %struct.Big* %null, align 4
+// CHECK-NEXT:   br label %for.cond
+
+void test(id p) {
+  for (;;) {
+Big big = [p foo];
+  }
+}
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -981,7 +981,8 @@
   // precise source location of the checked return statement.
   if (requiresReturnValueCheck()) {
 ReturnLocation = CreateDefaultAlignTempAlloca(Int8PtrTy, 
"return.sloc.ptr");
-InitTempAlloca(ReturnLocation, llvm::ConstantPointerNull::get(Int8PtrTy));
+Builder.CreateStore(llvm::ConstantPointerNull::get(Int8PtrTy),
+ReturnLocation);
   }
 
   // Emit subprogram debug descriptor.
Index: clang/lib/CodeGen/CGObjCGNU.cpp
===
--- clang/lib/CodeGen/CGObjCGNU.cpp
+++ clang/lib/CodeGen/CGObjCGNU.cpp
@@ -2760,7 +2760,7 @@
   llvm::PHINode *phi = Builder.CreatePHI(v.getType(), 2);
   llvm::Type *RetTy = v.getElementType();
   Address NullVal = CGF.CreateTempAlloca(RetTy, v.getAlignment(), "null");
-  CGF.InitTempAlloca(NullVal, llvm::Constant::getNullValue(RetTy));
+  Builder.CreateStore(llvm::Constant::getNullValue(RetTy), NullVal);
   phi->addIncoming(v.getPointer(), messageBB);
   phi->addIncoming(NullVal.getPointer(), startBB);
   msgRet = RValue::getAggregate(Address(phi, v.getAlignment()));


Index: clang/test/CodeGenObjC/static-alloc-init.m
===
--- /dev/null
+++ clang/test/CodeGenObjC/static-alloc-init.m
@@ -0,0 +1,43 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-freebsd -S -emit-llvm -fobjc-runtime=gnustep-2.0 -o - %s | FileCheck %s
+
+typedef struct {
+  int x[12];
+} Big;
+
+@protocol P
+- (Big) foo;
+@end
+
+// CHECK-LABEL:define{{.*}} void @test(
+// CHECK:   %p.addr = alloca i8*, align 8
+// CHECK-NEXT:   %big = alloca %struct.Big, align 4
+// CHECK-NEXT:   %tmp = alloca i8*, align 8
+// CHECK-NEXT:   %null = alloca %struct.Big, align 4
+// CHECK-NEXT:   store i8* %p, i8** %p.addr, align 8
+// CHECK-NEXT:   br label %for.cond
+//
+// CHECK-LABEL: for.cond:
+// CHECK-NEXT:   %0 = load i8*, i8** %p.addr, align 8
+// CHECK-NEXT:   %1 = icmp eq i8* %0, null
+// CHECK-NEXT:   br i1 %1, label %continue, label %msgSend
+//
+// CHECK-LABEL: msgSend:
+// CHECK-NEXT:   store i8* %0, i8** %tmp, align 8
+// CHECK-NEXT:   %2 = call { i8*, i8*, i8*, i32, i8* (i8*, i8*, ...)* }* @objc_msg_lookup_sender(i8** %tmp, i8* bitcast ({ i8*, i8* }* @".objc_selector_foo_{?=[12i]}16\010:8" to i8*), i8*

[PATCH] D111316: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-07 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm created this revision.
hsmhsm added reviewers: rjmccall, tra, yaxunl, jdoerfert.
hsmhsm requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

Sequel patch to https://reviews.llvm.org/D111293.

Remove call to CodeGenFunction::InitTempAlloca() from OpenMP related
codegen part.

Also remove the metadata `!llvm.access.group` from the updated lit
tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D111316

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111293: [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()

2021-10-07 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm created this revision.
hsmhsm added reviewers: rjmccall, tra, yaxunl, jdoerfert.
hsmhsm requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

CodeGenFunction::InitTempAlloca() inits the static alloca within the
entry block which may *not* necessarily be correct always.

For example, the current instruction insertion point (pointed by the
instruction builder) could be a program point which is hit multiple
times during the program execution, and it is expected that the static
alloca is initialized every time the program point is hit.

Hence remove CodeGenFunction::InitTempAlloca(), and initialize the
static alloca where the instruction insertion point is at the moment.

This patch, as a starting attempt, removes the calls to
CodeGenFunction::InitTempAlloca() which do not have any side effect on
the lit tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D111293

Files:
  clang/lib/CodeGen/CGObjCGNU.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp


Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -981,7 +981,8 @@
   // precise source location of the checked return statement.
   if (requiresReturnValueCheck()) {
 ReturnLocation = CreateDefaultAlignTempAlloca(Int8PtrTy, 
"return.sloc.ptr");
-InitTempAlloca(ReturnLocation, llvm::ConstantPointerNull::get(Int8PtrTy));
+Builder.CreateStore(llvm::ConstantPointerNull::get(Int8PtrTy),
+ReturnLocation);
   }
 
   // Emit subprogram debug descriptor.
Index: clang/lib/CodeGen/CGObjCGNU.cpp
===
--- clang/lib/CodeGen/CGObjCGNU.cpp
+++ clang/lib/CodeGen/CGObjCGNU.cpp
@@ -2760,7 +2760,7 @@
   llvm::PHINode *phi = Builder.CreatePHI(v.getType(), 2);
   llvm::Type *RetTy = v.getElementType();
   Address NullVal = CGF.CreateTempAlloca(RetTy, v.getAlignment(), "null");
-  CGF.InitTempAlloca(NullVal, llvm::Constant::getNullValue(RetTy));
+  Builder.CreateStore(llvm::Constant::getNullValue(RetTy), NullVal);
   phi->addIncoming(v.getPointer(), messageBB);
   phi->addIncoming(NullVal.getPointer(), startBB);
   msgRet = RValue::getAggregate(Address(phi, v.getAlignment()));


Index: clang/lib/CodeGen/CodeGenFunction.cpp
===
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -981,7 +981,8 @@
   // precise source location of the checked return statement.
   if (requiresReturnValueCheck()) {
 ReturnLocation = CreateDefaultAlignTempAlloca(Int8PtrTy, "return.sloc.ptr");
-InitTempAlloca(ReturnLocation, llvm::ConstantPointerNull::get(Int8PtrTy));
+Builder.CreateStore(llvm::ConstantPointerNull::get(Int8PtrTy),
+ReturnLocation);
   }
 
   // Emit subprogram debug descriptor.
Index: clang/lib/CodeGen/CGObjCGNU.cpp
===
--- clang/lib/CodeGen/CGObjCGNU.cpp
+++ clang/lib/CodeGen/CGObjCGNU.cpp
@@ -2760,7 +2760,7 @@
   llvm::PHINode *phi = Builder.CreatePHI(v.getType(), 2);
   llvm::Type *RetTy = v.getElementType();
   Address NullVal = CGF.CreateTempAlloca(RetTy, v.getAlignment(), "null");
-  CGF.InitTempAlloca(NullVal, llvm::Constant::getNullValue(RetTy));
+  Builder.CreateStore(llvm::Constant::getNullValue(RetTy), NullVal);
   phi->addIncoming(v.getPointer(), messageBB);
   phi->addIncoming(NullVal.getPointer(), startBB);
   msgRet = RValue::getAggregate(Address(phi, v.getAlignment()));
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-10-06 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm marked 8 inline comments as done.
hsmhsm added a comment.

In D110257#3047166 , @rjmccall wrote:

> In the other, you're moving the store emitted by `InitTempAlloca` so that it 
> becomes interleaved with the sequence of `alloca`s, but only when there's an 
> `addrspacecast`.

Not really.

In the absence of this patch, `addrspacecast` are interleaved with the sequence 
of `alloca`s, and  `AllocaInsertPt` always point to the end of this 
*interleaved sequence*.  Within InitTempAlloca(), any init to alloca (or to its 
`addrspacecast` in case of an addrspacecast) happens just after 
`AllocaInsertPt` which is fine.

Now, in the presence of this patch, `AllocaInsertPt` points to the end of 
contiguous  `alloca`s but *BEFORE* any addrspacecast. This forces the changes 
to InitTempAlloca(). Otherwise, init of `addrspacecast` happens before 
`addrspacecast` itself.

> Now, I should say that `InitTempAlloca` is generally a problematic function.  
> Its purpose is to put an initialization in the prologue of the function so 
> that it always happens prior to some other code executing.  This is rarely 
> useful, though, because the memory is usually tied to some specific feature 
> in the code, and as Johannes says, that place in the code may be reachable 
> multiple times, and the memory typically needs to be freshly initialized each 
> time.  Using `InitTempAlloca` is therefore frequently wrong, and I'm honestly 
> not sure there's any good reason to use it.  Looking at the calls, some of 
> them know that they're in the prologue, and so it should be fine to simply 
> emit a store normally.  Other calls seem quite suspect, like the one in 
> CGObjCGNU.cpp.  And even if it's semantically okay, it's potentially doing 
> unnecessary work up-front when it only really needs to happen if that path is 
> taken.
>
> So I don't really care about aesthetic differences in the code created by 
> `InitTempAlloca`, because we should just remove it completely.
>
> If we really care about not interleaving things with the `alloca` sequence — 
> and right now I am not convinced that we do, because contra the comments and 
> description of this patch, this is not an LLVM requirement of any sort — I 
> think we should lazily create a second InsertPt instruction after the 
> `AllocaInsertPt` and insert all the secondary instruction prior to that so 
> that they appear in source order.

Agree. I will give it a try to make changes as you suggest, though it may take 
some time since it requires a bit of cleanup and handling of the changes to 
(possibly many) lit tests as a side effect.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-10-06 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 377742.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,6 +12,8 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -5,31 +5,31 @@
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL2]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspac

[PATCH] D110676: [CFE][Codegen] Update auto-generated check lines for few GPU lit tests

2021-10-06 Thread Mahesha S via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG393581d8a5cb: [CFE][Codegen] Update auto-generated check 
lines for few GPU lit tests (authored by hsmhsm).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110676/new/

https://reviews.llvm.org/D110676

Files:
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp

Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,73 +1,129 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-// CHECK:@_ZZ4testvE3foo = internal addrspace(1) constant i32 66, align 4
-// CHECK: @[[STR:[.a-zA-Z0-9_]+]] = private unnamed_addr addrspace(1) constant [14 x i8] c"Hello, world!\00", align 1
 
-// CHECK-LABEL: @_Z4testv
+// CHECK-LABEL: @_Z4testv(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL2_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL2]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:store i32 0, i32 addrspace(4)* [[I_ASCAST]], align 4
+// CHECK-NEXT:store i32 addrspace(4)* [[I_ASCAST]], i32 addrspace(4)* addrspace(4)* [[PPTR_ASCAST]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load i32 addrspace(4)*, i32 addrspace(4)* addrspace(4)* [[PPTR_ASCAST]], align 8
+// CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 addrspace(4)* [[TMP0]], [[I_ASCAST]]
+// CHECK-NEXT:[[FROMBOOL:%.*]] = zext i1 [[CMP]] to i8
+// CHECK-NEXT:store i8 [[FROMBOOL]], i8 addrspace(4)* [[IS_I_PTR_ASCAST]], align 1
+// CHECK-NEXT:[[TMP1:%.*]] = load i32 addrspace(4)*, i32 addrspace(4)* addrspace(4)* [[PPTR_ASCAST]], align 8
+// CHECK-NEXT:store i32 66, i32 addrspace(4)* [[TMP1]], align 4
+// CHECK-NEXT:store i32 23, i32 addrspace(4)* [[VAR23_ASCAST]], align 4
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast i32 addrspace(4)* [[VAR23_ASCAST]] to i8 addrspace(4)*
+// CHECK-NEXT:store i8 addrspace(4)* [[TMP2]], i8 addrspace(4)* addrspace(4)* [[CP_ASCAST]], align 8
+// CHECK-NEXT:[[TMP3:%.*]] = load i8 addrspace(4)*, i8 addrspace(4)* addrspace(4)* [[CP_ASCAST]], align 8
+// CHECK-NEXT:store i8 41, i8 addrspace(4)* [[TMP3]], align 1
+// CHECK-NEXT:[[ARRAYDECAY:%.*]] = getelementptr inbounds [42 x i32], [42 x i32] a

[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-10-04 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110676: [CFE][Codegen] Update auto-generated check lines for few GPU lit tests

2021-10-04 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110676/new/

https://reviews.llvm.org/D110676

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-28 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 375789.
hsmhsm added a comment.

Fix review comments by @jdoerfert.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp

Index: clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
===
--- clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
+++ clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
@@ -12,6 +12,8 @@
   int arr[N];
 
   // CHECK:  [[VAR_ADDR:%.+]] = alloca [100 x i32]*, align 8, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR:%.+]] = alloca i32, align 4, addrspace(5)
+  // CHECK-NEXT: [[VAR2_ADDR_CAST:%.+]] = addrspacecast i32 addrspace(5)* [[VAR2_ADDR]] to i32*
   // CHECK-NEXT: [[VAR_ADDR_CAST:%.+]] = addrspacecast [100 x i32]* addrspace(5)* [[VAR_ADDR]] to [100 x i32]**
   // CHECK:  store [100 x i32]* [[VAR:%.+]], [100 x i32]** [[VAR_ADDR_CAST]], align 8
 
Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -5,31 +5,31 @@
 // CHECK-LABEL: @_Z4testv(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
 // CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
-// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
 // CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
-// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
 // CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
-// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
 // CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
-// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
-// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
 // CHECK-NEXT:[[SELECT_STR_TRIVIAL2_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL2]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[VA

[PATCH] D110676: [CFE][Codegen] Update auto-generated check lines for few GPU lit tests

2021-09-28 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 375788.
hsmhsm added a comment.

Only update lit tests which undergo changes within 
https://reviews.llvm.org/D110257


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110676/new/

https://reviews.llvm.org/D110676

Files:
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp

Index: clang/test/CodeGenSYCL/address-space-deduction.cpp
===
--- clang/test/CodeGenSYCL/address-space-deduction.cpp
+++ clang/test/CodeGenSYCL/address-space-deduction.cpp
@@ -1,73 +1,129 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple spir64 -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
 
-// CHECK:@_ZZ4testvE3foo = internal addrspace(1) constant i32 66, align 4
-// CHECK: @[[STR:[.a-zA-Z0-9_]+]] = private unnamed_addr addrspace(1) constant [14 x i8] c"Hello, world!\00", align 1
 
-// CHECK-LABEL: @_Z4testv
+// CHECK-LABEL: @_Z4testv(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[I:%.*]] = alloca i32, align 4
+// CHECK-NEXT:[[I_ASCAST:%.*]] = addrspacecast i32* [[I]] to i32 addrspace(4)*
+// CHECK-NEXT:[[PPTR:%.*]] = alloca i32 addrspace(4)*, align 8
+// CHECK-NEXT:[[PPTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[PPTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[IS_I_PTR:%.*]] = alloca i8, align 1
+// CHECK-NEXT:[[IS_I_PTR_ASCAST:%.*]] = addrspacecast i8* [[IS_I_PTR]] to i8 addrspace(4)*
+// CHECK-NEXT:[[VAR23:%.*]] = alloca i32, align 4
+// CHECK-NEXT:[[VAR23_ASCAST:%.*]] = addrspacecast i32* [[VAR23]] to i32 addrspace(4)*
+// CHECK-NEXT:[[CP:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[CP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[ARR:%.*]] = alloca [42 x i32], align 4
+// CHECK-NEXT:[[ARR_ASCAST:%.*]] = addrspacecast [42 x i32]* [[ARR]] to [42 x i32] addrspace(4)*
+// CHECK-NEXT:[[CPP:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[CPP_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[CPP]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[APTR:%.*]] = alloca i32 addrspace(4)*, align 8
+// CHECK-NEXT:[[APTR_ASCAST:%.*]] = addrspacecast i32 addrspace(4)** [[APTR]] to i32 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[STR:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[STR]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[PHI_STR:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[PHI_STR_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[PHI_STR]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_NULL:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[SELECT_NULL_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_NULL]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL1:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL1_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL1]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL2:%.*]] = alloca i8 addrspace(4)*, align 8
+// CHECK-NEXT:[[SELECT_STR_TRIVIAL2_ASCAST:%.*]] = addrspacecast i8 addrspace(4)** [[SELECT_STR_TRIVIAL2]] to i8 addrspace(4)* addrspace(4)*
+// CHECK-NEXT:store i32 0, i32 addrspace(4)* [[I_ASCAST]], align 4
+// CHECK-NEXT:store i32 addrspace(4)* [[I_ASCAST]], i32 addrspace(4)* addrspace(4)* [[PPTR_ASCAST]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load i32 addrspace(4)*, i32 addrspace(4)* addrspace(4)* [[PPTR_ASCAST]], align 8
+// CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 addrspace(4)* [[TMP0]], [[I_ASCAST]]
+// CHECK-NEXT:[[FROMBOOL:%.*]] = zext i1 [[CMP]] to i8
+// CHECK-NEXT:store i8 [[FROMBOOL]], i8 addrspace(4)* [[IS_I_PTR_ASCAST]], align 1
+// CHECK-NEXT:[[TMP1:%.*]] = load i32 addrspace(4)*, i32 addrspace(4)* addrspace(4)* [[PPTR_ASCAST]], align 8
+// CHECK-NEXT:store i32 66, i32 addrspace(4)* [[TMP1]], align 4
+// CHECK-NEXT:store i32 23, i32 addrspace(4)* [[VAR23_ASCAST]], align 4
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast i32 addrspace(4)* [[VAR23_ASCAST]] to i8 addrspace(4)*
+// CHECK-NEXT:store i8 addrspace(4)* [[TMP2]], i8 addrspace(4)* addrspace(4)* [[CP_ASCAST]], align 8
+// CHECK-NEXT:[[TMP3:%.*]] = load i8 addrspace(4)*, i8 addrspace(4)* addrspace(4)* [[CP_ASCAST]], align 8
+// CHECK-NEXT:store i8 41, i8 addrspace(4)* [[TMP3]], align 1
+// CHECK-NEXT:[[ARRAYDECAY:%.*]] = getelementptr inbounds [42 x i32], [42 x i32] addrspace(4)* [[ARR_ASCAST]], i64 0, i64 0
+// CHECK-NEXT:[[TMP4:%.*]] = bitcast i32 addrspace(4)*

[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-28 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added inline comments.



Comment at: clang/lib/CodeGen/CGExpr.cpp:151
+  // allocas.
+  Builder.CreateStore(Init, Var);
 }

jdoerfert wrote:
> hsmhsm wrote:
> > jdoerfert wrote:
> > > I'm not even sure this is necessarily correct. 
> > > 
> > > How do we know the new store is not inside a loop and might write the 
> > > value more than once, potentially overwriting a value set later?
> > > 
> > > Aside from that (important) question, you need to update the 
> > > documentation of the function. It doesn't correspond to the new semantics 
> > > anymore.
> > > 
> > I am not convinced with above comments -Because, this code neither changes 
> > the address nor the initializer 
> > (https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CodeGenFunction.h#L2548),
> >  but only a place where the initialization happens.
> > 
> > Further, as the documentation says, the initializer must be a constant or 
> > function argument (and surprisingly, I do not see any assertion about it). 
> > Hence, even if the initialization happens within the loop, loop invariant 
> > code motion pass should detect it.
> > 
> > That said, if we think, it is better to keep the initialization within 
> > entry block, we can do it at the end of the block.
> What the initial value is is irrelevant. Arguing LICM should take care of it 
> is also not helpful. Changing the location of the initialization is in itself 
> a problem:
> 
> Before:
> ```
> a = alloca
> a = 0;// init is in the entry block
> for (...) {
>   use(a);
>   a = a + 1;
> }
> ```
> 
> After, potentially:
> ```
> a = alloca
> for (...) {
>   a = 0;  // init is now at the builder insertion point
>   use(a);
>   a = a + 1;
> }
> ```
Clarification - I am not trying to argue anything here with anybody, I am 
trying to defend myself, where I feel I am right as per my knowledge. If I 
realize that I am at false later, then I correct myself, but I am not arguing, 
and will never ever do.

Ok, I will try to fix it from the practical point of view.

But,  I still think as follows - From theoretical/good programming/good 
front-end code generation perspective, initialization of something will never 
happen within construct like loop, otherwise initialization has no meaning at 
all.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-28 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

In D110257#3027633 , @jdoerfert wrote:

> While I understand people are eager to receive feedback on their patches, it 
> is not helpful to ping/remind the reviewers constantly.
> This does generate noise for them and can consequently also reduce their 
> interest in a patch. The recommendation for time without
> review before a "ping" is send is still one week.

Agreed.




Comment at: clang/lib/CodeGen/CGExpr.cpp:151
+  // allocas.
+  Builder.CreateStore(Init, Var);
 }

jdoerfert wrote:
> I'm not even sure this is necessarily correct. 
> 
> How do we know the new store is not inside a loop and might write the value 
> more than once, potentially overwriting a value set later?
> 
> Aside from that (important) question, you need to update the documentation of 
> the function. It doesn't correspond to the new semantics anymore.
> 
I am not convinced with above comments -Because, this code neither changes the 
address nor the initializer 
(https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CodeGenFunction.h#L2548),
 but only a place where the initialization happens.

Further, as the documentation says, the initializer must be a constant or 
function argument (and surprisingly, I do not see any assertion about it). 
Hence, even if the initialization happens within the loop, loop invariant code 
motion pass should detect it.

That said, if we think, it is better to keep the initialization within entry 
block, we can do it at the end of the block.



Comment at: clang/test/CodeGenCUDA/builtins-amdgcn.cu:1
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx906 -x hip \

jdoerfert wrote:
> Please prepare a pre-commit that adds auto-generated check lines. Adding them 
> as part of a commit makes it impossible to see the difference.
Will do


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-28 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

@yaxunl  / @jdoerfert / @tra

Can I expect your further comment/decision on this patch?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-27 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-27 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 375472.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-24 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added inline comments.



Comment at: clang/lib/CodeGen/CGExpr.cpp:103
+if (!ArraySize) {
+  auto *EBB = AllocaInsertPt->getParent();
+  auto Iter = AllocaInsertPt->getIterator();

rnk wrote:
> hsmhsm wrote:
> > jdoerfert wrote:
> > > arsenm wrote:
> > > > Why is there a special AllocaInsertPt iterator in the first place? Can 
> > > > you avoid any iteration logic by just always inserting at the block 
> > > > start?
> > > Right. The alloca insertion point is sometimes changed to a non-entry 
> > > block, and we should keep that ability.
> > > From all the use cases I know it would suffice to insert at the beginning 
> > > of the alloca insertion point block though.
> > I really do not understand this comment fully.
> > 
> > This block of code here inserts an "addressspace cast" of recently inserted 
> > alloca, not the alloca itself. Alloca is already inserted.  Please look at 
> > the start of this function. 
> > 
> > The old logic (in the left) inserts addressspace cast of recently inserted 
> > alloca immediately after recent alloca using AllocaInsertPt. As a side 
> > effect, it also makes AllocaInsertPt now point to this newly inserted 
> > addressspace cast. Hence, next alloca is inserted after this new 
> > addressspace cast, because now AllocaInsertPt is made to point this newly 
> > inserted addressspace cast.  
> > 
> > The new logic (here) fixes that by moving insertion point just beyond 
> > current AllocaInsertPt without updating AllocaInsertPt.
> > 
> > How can I insert "addressspace cast" of an alloca, at the beginning of the 
> > block even before alloca?
> > 
> > As I understand it, AllocaInsertPt is maintained to insert static allocas 
> > at the start of entry block, otherwise there is no any special reason to 
> > maintain such a special insertion point. Please look at 
> > https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CodeGenFunction.h#L378.
> > 
> > That said, I am not really sure, if I have completely misunderstood the 
> > comment above. If that is the case, then, I need better clarification here 
> > about what really is expected.
> Well, inserting at the top of the entry block would reverse the order of the 
> allocas. Currently they appear in source/IRGen order, which is nice. 
> Maintaining the order requires appending, which requires a cursor of some 
> kind.
Yes, correct. 

And I am waiting for further inputs from @yaxunl / @arsenm / @jdoerfert 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-24 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 374869.
hsmhsm added a comment.

Remove "!llvm.access.group" metadata from check lines in test files.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-23 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added inline comments.



Comment at: clang/lib/CodeGen/CGExpr.cpp:103
+if (!ArraySize) {
+  auto *EBB = AllocaInsertPt->getParent();
+  auto Iter = AllocaInsertPt->getIterator();

jdoerfert wrote:
> arsenm wrote:
> > Why is there a special AllocaInsertPt iterator in the first place? Can you 
> > avoid any iteration logic by just always inserting at the block start?
> Right. The alloca insertion point is sometimes changed to a non-entry block, 
> and we should keep that ability.
> From all the use cases I know it would suffice to insert at the beginning of 
> the alloca insertion point block though.
I really do not understand this comment fully.

This block of code here inserts an "addressspace cast" of recently inserted 
alloca, not the alloca itself. Alloca is already inserted.  Please look at the 
start of this function. 

The old logic (in the left) inserts addressspace cast of recently inserted 
alloca immediately after recent alloca using AllocaInsertPt. As a side effect, 
it also makes AllocaInsertPt now point to this newly inserted addressspace 
cast. Hence, next alloca is inserted after this new addressspace cast, because 
now AllocaInsertPt is made to point this newly inserted addressspace cast.  

The new logic (here) fixes that by moving insertion point just beyond current 
AllocaInsertPt without updating AllocaInsertPt.

How can I insert "addressspace cast" of an alloca, at the beginning of the 
block even before alloca?

As I understand it, AllocaInsertPt is maintained to insert static allocas at 
the start of entry block, otherwise there is no any special reason to maintain 
such a special insertion point. Please look at 
https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CodeGenFunction.h#L378.

That said, I am not really sure, if I have completely misunderstood the comment 
above. If that is the case, then, I need better clarification here about what 
really is expected.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-23 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added inline comments.



Comment at: clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp:264
 // CHECK1:   omp.inner.for.cond:
-// CHECK1-NEXT:[[TMP7:%.*]] = load i32, i32* [[DOTOMP_IV]], align 4
-// CHECK1-NEXT:[[TMP8:%.*]] = load i32, i32* [[DOTOMP_UB]], align 4
+// CHECK1-NEXT:[[TMP7:%.*]] = load i32, i32* [[DOTOMP_IV]], align 4, 
!llvm.access.group !15
+// CHECK1-NEXT:[[TMP8:%.*]] = load i32, i32* [[DOTOMP_UB]], align 4, 
!llvm.access.group !15

hsmhsm wrote:
> yaxunl wrote:
> > Is the test updated by a script? If the original test does not check 
> > !llvm.access.group, the updated test should not check it either. This makes 
> > the test less stable.
> Yes, most of the tests here are updated by script only. Probably I might have 
> missed few command line options to the script. I have not passed any option 
> to the script while updating it. Let me check.
I have used exact command line as in the test file plus extra option 
"--force-update". But it's presence or absence is not making any difference 
here. 

But the !llvm.access.group metadata is newly added. And I am not finding anyway 
of disabling it.

Do you have any idea? or anyone else for that matter? 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-23 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added inline comments.



Comment at: clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp:264
 // CHECK1:   omp.inner.for.cond:
-// CHECK1-NEXT:[[TMP7:%.*]] = load i32, i32* [[DOTOMP_IV]], align 4
-// CHECK1-NEXT:[[TMP8:%.*]] = load i32, i32* [[DOTOMP_UB]], align 4
+// CHECK1-NEXT:[[TMP7:%.*]] = load i32, i32* [[DOTOMP_IV]], align 4, 
!llvm.access.group !15
+// CHECK1-NEXT:[[TMP8:%.*]] = load i32, i32* [[DOTOMP_UB]], align 4, 
!llvm.access.group !15

yaxunl wrote:
> Is the test updated by a script? If the original test does not check 
> !llvm.access.group, the updated test should not check it either. This makes 
> the test less stable.
Yes, most of the tests here are updated by script only. Probably I might have 
missed few command line options to the script. I have not passed any option to 
the script while updating it. Let me check.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-23 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 374486.
hsmhsm added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-22 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added inline comments.



Comment at: clang/lib/CodeGen/CGExpr.cpp:102-106
+  auto *EBB = AllocaInsertPt->getParent();
+  auto Iter = AllocaInsertPt->getIterator();
+  if (Iter != EBB->end())
+++Iter;
+  Builder.SetInsertPoint(EBB, Iter);

arsenm wrote:
> hsmhsm wrote:
> > arsenm wrote:
> > > Where are the addrspacecasts inserted? Could you just adjust where those 
> > > are inserted instead?
> > The addressspace casts are inserted immediately after all static allocas 
> > (top static alloca cluster).
> > 
> > An example:
> > 
> > Before this patch:
> > 
> > ```
> > entry:
> >   %N.addr = alloca i64, align 8, addrspace(5)
> >   %N.addr.ascast = addrspacecast i64 addrspace(5)* %N.addr to i64*
> >   %vla.addr = alloca i64, align 8, addrspace(5)
> >   %vla.addr.ascast = addrspacecast i64 addrspace(5)* %vla.addr to i64*
> >   %a.addr = alloca i32*, align 8, addrspace(5)
> >   %a.addr.ascast = addrspacecast i32* addrspace(5)* %a.addr to i32**
> >   %vla.addr2 = alloca i64, align 8, addrspace(5)
> >   %vla.addr2.ascast = addrspacecast i64 addrspace(5)* %vla.addr2 to i64*
> >   %b.addr = alloca i32*, align 8, addrspace(5)
> >   %b.addr.ascast = addrspacecast i32* addrspace(5)* %b.addr to i32**
> >   %N.casted = alloca i64, align 8, addrspace(5)
> >   %N.casted.ascast = addrspacecast i64 addrspace(5)* %N.casted to i64*
> >   %.zero.addr = alloca i32, align 4, addrspace(5)
> >   %.zero.addr.ascast = addrspacecast i32 addrspace(5)* %.zero.addr to i32*
> >   %.threadid_temp. = alloca i32, align 4, addrspace(5)
> >   %.threadid_temp..ascast = addrspacecast i32 addrspace(5)* 
> > %.threadid_temp. to i32*  
> >   store i64 %N, i64* %N.addr.ascast, align 8
> > ```
> > 
> > With this patch:
> > 
> > ```
> > entry:
> >   %N.addr = alloca i64, align 8, addrspace(5)
> >   %vla.addr = alloca i64, align 8, addrspace(5)
> >   %a.addr = alloca i32*, align 8, addrspace(5)
> >   %vla.addr2 = alloca i64, align 8, addrspace(5)
> >   %b.addr = alloca i32*, align 8, addrspace(5)
> >   %N.casted = alloca i64, align 8, addrspace(5)
> >   %.zero.addr = alloca i32, align 4, addrspace(5)
> >   %.threadid_temp. = alloca i32, align 4, addrspace(5)
> >   %.threadid_temp..ascast = addrspacecast i32 addrspace(5)* 
> > %.threadid_temp. to i32*
> >   %.zero.addr.ascast = addrspacecast i32 addrspace(5)* %.zero.addr to i32*
> >   %N.casted.ascast = addrspacecast i64 addrspace(5)* %N.casted to i64*
> >   %b.addr.ascast = addrspacecast i32* addrspace(5)* %b.addr to i32**
> >   %vla.addr2.ascast = addrspacecast i64 addrspace(5)* %vla.addr2 to i64*
> >   %a.addr.ascast = addrspacecast i32* addrspace(5)* %a.addr to i32**
> >   %vla.addr.ascast = addrspacecast i64 addrspace(5)* %vla.addr to i64*
> >   %N.addr.ascast = addrspacecast i64 addrspace(5)* %N.addr to i64*
> >   store i64 %N, i64* %N.addr.ascast, align 8
> > ```
> I meant where in the clang code are these emitted, and how is that I set 
> point found?
I understand your question as - where exactly is code emitted? 

It is Clang Codegen part, the Clang Codegen maintains a builder (function 
specific?) which builds and emits code, and builder always maintains default 
current insertion position, and it can also be asked to insert at some other 
place by calling the api SetInsertPoint().

In this particular case, the addrespace casts are  emitted at (by calling the 
builder) 
https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/TargetInfo.cpp#445

All I am doing here is - direct the builder to insert addrespace casts just 
after all static allocas by appropriately setting the insert position via 
SetInsertPoint().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-22 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 374419.
hsmhsm added a comment.

Update source comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-22 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

In D110257#3016113 , @rnk wrote:

> In D110257#3015641 , @hsmhsm wrote:
>
>> The logic at 
>> https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/InlineFunction.cpp#L2093
>>  assumes that static allocas (within callee) are contiguous.
>
> No, it doesn't make that assumption. That code is an attempt to optimize the 
> linked list splicing, so that batches of contiguous allocas can be spliced 
> over together. I believe the code would still be correct if you removed this 
> inner loop and left behind the outer loop, which iterates over all allocas in 
> the entry block.

Thanks, understood it now. So, the actual logic (in the outer loop) is - scan 
the (contiguous ) batches of static allocas within the callees entry block, and 
move those batches to caller as one chunk at a time.

That said, it is still good idea (and actually an explicitly not mandated 
requirement) to maintain the contiguity of the static allocas at the top of the 
basic block as one cluster, and it should start from FE itself.  So, this patch 
is still relevant.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-22 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

In D110257#3015553 , @arsenm wrote:

> I do think it's cleaner/more canonical IR to cluster these at the top of the 
> block, but I don't understand this comment:
>
>> otherwise, inliner's attempt to move static allocas from callee to caller 
>> will fail,
>
> The inliner successfully moves allocas to the caller's entry block, even with 
> addrspacecasts interspersed.

The logic at 
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/InlineFunction.cpp#L2093
 assumes that static allocas (within callee) are contiguous.




Comment at: clang/lib/CodeGen/CGExpr.cpp:102-106
+  auto *EBB = AllocaInsertPt->getParent();
+  auto Iter = AllocaInsertPt->getIterator();
+  if (Iter != EBB->end())
+++Iter;
+  Builder.SetInsertPoint(EBB, Iter);

arsenm wrote:
> Where are the addrspacecasts inserted? Could you just adjust where those are 
> inserted instead?
The addressspace casts are inserted immediately after all static allocas (top 
static alloca cluster).

An example:

Before this patch:

```
entry:
  %N.addr = alloca i64, align 8, addrspace(5)
  %N.addr.ascast = addrspacecast i64 addrspace(5)* %N.addr to i64*
  %vla.addr = alloca i64, align 8, addrspace(5)
  %vla.addr.ascast = addrspacecast i64 addrspace(5)* %vla.addr to i64*
  %a.addr = alloca i32*, align 8, addrspace(5)
  %a.addr.ascast = addrspacecast i32* addrspace(5)* %a.addr to i32**
  %vla.addr2 = alloca i64, align 8, addrspace(5)
  %vla.addr2.ascast = addrspacecast i64 addrspace(5)* %vla.addr2 to i64*
  %b.addr = alloca i32*, align 8, addrspace(5)
  %b.addr.ascast = addrspacecast i32* addrspace(5)* %b.addr to i32**
  %N.casted = alloca i64, align 8, addrspace(5)
  %N.casted.ascast = addrspacecast i64 addrspace(5)* %N.casted to i64*
  %.zero.addr = alloca i32, align 4, addrspace(5)
  %.zero.addr.ascast = addrspacecast i32 addrspace(5)* %.zero.addr to i32*
  %.threadid_temp. = alloca i32, align 4, addrspace(5)
  %.threadid_temp..ascast = addrspacecast i32 addrspace(5)* %.threadid_temp. to 
i32*  
  store i64 %N, i64* %N.addr.ascast, align 8
```

With this patch:

```
entry:
  %N.addr = alloca i64, align 8, addrspace(5)
  %vla.addr = alloca i64, align 8, addrspace(5)
  %a.addr = alloca i32*, align 8, addrspace(5)
  %vla.addr2 = alloca i64, align 8, addrspace(5)
  %b.addr = alloca i32*, align 8, addrspace(5)
  %N.casted = alloca i64, align 8, addrspace(5)
  %.zero.addr = alloca i32, align 4, addrspace(5)
  %.threadid_temp. = alloca i32, align 4, addrspace(5)
  %.threadid_temp..ascast = addrspacecast i32 addrspace(5)* %.threadid_temp. to 
i32*
  %.zero.addr.ascast = addrspacecast i32 addrspace(5)* %.zero.addr to i32*
  %N.casted.ascast = addrspacecast i64 addrspace(5)* %N.casted to i64*
  %b.addr.ascast = addrspacecast i32* addrspace(5)* %b.addr to i32**
  %vla.addr2.ascast = addrspacecast i64 addrspace(5)* %vla.addr2 to i64*
  %a.addr.ascast = addrspacecast i32* addrspace(5)* %a.addr to i32**
  %vla.addr.ascast = addrspacecast i64 addrspace(5)* %vla.addr to i64*
  %N.addr.ascast = addrspacecast i64 addrspace(5)* %N.addr to i64*
  store i64 %N, i64* %N.addr.ascast, align 8
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not break the contiguity of static allocas.

2021-09-22 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 374252.
hsmhsm added a comment.

Update source comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110257/new/

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110257: [CFE][Codegen] Do not the break contiguity of static allocas.

2021-09-22 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm created this revision.
hsmhsm added reviewers: arsenm, rampitec, jdoerfert, lebedev.ri, nhaehnle, 
rjmccall, yaxunl, AndreyChurbanov.
Herald added a subscriber: jvesely.
hsmhsm requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1, wdng.
Herald added a project: clang.

Do not break the contiguity of static allocas by inserting addressspace
casts in between static allocas, otherwise, inliner's attempt to move
static allocas from callee to caller will fail, and which in turn will
have serious side effects on code transformation/optimzation.

Make sure that all addressspace casts of static allocas are inserted just
after all static allocas.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D110257

Files:
  clang/lib/CodeGen/CGExpr.cpp
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp
  clang/test/CodeGenCXX/amdgcn-func-arg.cpp
  clang/test/CodeGenCXX/builtin-amdgcn-atomic-inc-dec.cpp
  clang/test/CodeGenCXX/vla.cpp
  clang/test/CodeGenSYCL/address-space-deduction.cpp
  clang/test/OpenMP/amdgcn_target_init_temp_alloca.cpp
  clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen.cpp
  clang/test/OpenMP/parallel_if_codegen_PR51349.cpp
  clang/test/OpenMP/parallel_master_taskloop_codegen.cpp
  clang/test/OpenMP/parallel_master_taskloop_simd_codegen.cpp
  clang/test/OpenMP/target_codegen_global_capture.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/task_codegen.c
  clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp
  clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp
  clang/test/OpenMP/vla_crash.c

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-03 Thread Mahesha S via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGcac068600e55: [HIP] Make sure, unused hip-pinned-shadow 
global var is kept within device code (authored by hsmhsm).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75402/new/

https://reviews.llvm.org/D75402

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/test/CodeGenCUDA/hip-pinned-shadow.cu


Index: clang/test/CodeGenCUDA/hip-pinned-shadow.cu
===
--- clang/test/CodeGenCUDA/hip-pinned-shadow.cu
+++ clang/test/CodeGenCUDA/hip-pinned-shadow.cu
@@ -4,6 +4,8 @@
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEV %s
 // RUN: %clang_cc1 -triple x86_64 -std=c++11 \
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPHOST %s
+// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -std=c++11 -fvisibility 
hidden -fapply-global-visibility-to-externs \
+// RUN: -O3 -emit-llvm -o - -x hip %s | FileCheck 
-check-prefixes=HIPDEVUNSED %s
 
 struct textureReference {
   int a;
@@ -21,3 +23,5 @@
 // HIPDEV-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  define{{.*}}@_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  call i32 @__hipRegisterVar{{.*}}@tex{{.*}}i32 0, i32 4, i32 0, 
i32 0)
+// HIPDEVUNSED: @tex = external addrspace(1) global %struct.texture
+// HIPDEVUNSED-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
Index: clang/lib/CodeGen/CodeGenModule.h
===
--- clang/lib/CodeGen/CodeGenModule.h
+++ clang/lib/CodeGen/CodeGenModule.h
@@ -1037,7 +1037,7 @@
   void MaybeHandleStaticInExternC(const SomeDecl *D, llvm::GlobalValue *GV);
 
   /// Add a global to a list to be added to the llvm.used metadata.
-  void addUsedGlobal(llvm::GlobalValue *GV);
+  void addUsedGlobal(llvm::GlobalValue *GV, bool SkipCheck = false);
 
   /// Add a global to a list to be added to the llvm.compiler.used metadata.
   void addCompilerUsedGlobal(llvm::GlobalValue *GV);
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -1916,9 +1916,9 @@
   }
 }
 
-void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV) {
-  assert(!GV->isDeclaration() &&
- "Only globals with definition can force usage.");
+void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV, bool SkipCheck) {
+  assert(SkipCheck || (!GV->isDeclaration() &&
+   "Only globals with definition can force usage."));
   LLVMUsed.emplace_back(GV);
 }
 
@@ -4071,9 +4071,17 @@
 }
   }
 
-  if (!IsHIPPinnedShadowVar)
+  // HIPPinnedShadowVar should remain in the final code object irrespective of
+  // whether it is used or not within the code. Add it to used list, so that
+  // it will not get eliminated when it is unused. Also, it is an extern var
+  // within device code, and it should *not* get initialized within device 
code.
+  if (IsHIPPinnedShadowVar)
+addUsedGlobal(GV, /*SkipCheck=*/true);
+  else
 GV->setInitializer(Init);
-  if (emitter) emitter->finalize(GV);
+
+  if (emitter)
+emitter->finalize(GV);
 
   // If it is safe to mark the global 'constant', do so now.
   GV->setConstant(!NeedsGlobalCtor && !NeedsGlobalDtor &&


Index: clang/test/CodeGenCUDA/hip-pinned-shadow.cu
===
--- clang/test/CodeGenCUDA/hip-pinned-shadow.cu
+++ clang/test/CodeGenCUDA/hip-pinned-shadow.cu
@@ -4,6 +4,8 @@
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEV %s
 // RUN: %clang_cc1 -triple x86_64 -std=c++11 \
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPHOST %s
+// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -std=c++11 -fvisibility hidden -fapply-global-visibility-to-externs \
+// RUN: -O3 -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEVUNSED %s
 
 struct textureReference {
   int a;
@@ -21,3 +23,5 @@
 // HIPDEV-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  define{{.*}}@_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  call i32 @__hipRegisterVar{{.*}}@tex{{.*}}i32 0, i32 4, i32 0, i32 0)
+// HIPDEVUNSED: @tex = external addrspace(1) global %struct.texture
+// HIPDEVUNSED-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
Index: clang/lib/CodeGen/CodeGenModule.h
===
--- clang/lib/CodeGen/CodeGenModule.h
+++ clang/lib/CodeGen/CodeGenModule.h
@@ -1037,7 +1037,7 @@
   void MaybeHandleStaticInExternC(const SomeDecl *D, llvm::GlobalValue *GV);
 
   /// Add a global to a list to be added to the llvm.used metadata.
-  void addUsedGlobal(llvm::GlobalValue *GV);
+  void addUsedGlobal(llvm::GlobalValue *GV, bool SkipCheck = false);
 
   /// Add a global to a list to be added to the 

[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-03 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

In D75402#1901471 , @hsmhsm wrote:

> Take care review comments by hliao.


ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75402/new/

https://reviews.llvm.org/D75402



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-02 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm updated this revision to Diff 247689.
hsmhsm added a comment.

Take care review comments by hliao.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75402/new/

https://reviews.llvm.org/D75402

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/test/CodeGenCUDA/hip-pinned-shadow.cu


Index: clang/test/CodeGenCUDA/hip-pinned-shadow.cu
===
--- clang/test/CodeGenCUDA/hip-pinned-shadow.cu
+++ clang/test/CodeGenCUDA/hip-pinned-shadow.cu
@@ -4,6 +4,8 @@
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEV %s
 // RUN: %clang_cc1 -triple x86_64 -std=c++11 \
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPHOST %s
+// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -std=c++11 -fvisibility 
hidden -fapply-global-visibility-to-externs \
+// RUN: -O3 -emit-llvm -o - -x hip %s | FileCheck 
-check-prefixes=HIPDEVUNSED %s
 
 struct textureReference {
   int a;
@@ -21,3 +23,5 @@
 // HIPDEV-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  define{{.*}}@_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  call i32 @__hipRegisterVar{{.*}}@tex{{.*}}i32 0, i32 4, i32 0, 
i32 0)
+// HIPDEVUNSED: @tex = external addrspace(1) global %struct.texture
+// HIPDEVUNSED-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
Index: clang/lib/CodeGen/CodeGenModule.h
===
--- clang/lib/CodeGen/CodeGenModule.h
+++ clang/lib/CodeGen/CodeGenModule.h
@@ -1037,7 +1037,7 @@
   void MaybeHandleStaticInExternC(const SomeDecl *D, llvm::GlobalValue *GV);
 
   /// Add a global to a list to be added to the llvm.used metadata.
-  void addUsedGlobal(llvm::GlobalValue *GV);
+  void addUsedGlobal(llvm::GlobalValue *GV, bool SkipCheck = false);
 
   /// Add a global to a list to be added to the llvm.compiler.used metadata.
   void addCompilerUsedGlobal(llvm::GlobalValue *GV);
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -1916,9 +1916,9 @@
   }
 }
 
-void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV) {
-  assert(!GV->isDeclaration() &&
- "Only globals with definition can force usage.");
+void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV, bool SkipCheck) {
+  assert(SkipCheck || (!GV->isDeclaration() &&
+   "Only globals with definition can force usage."));
   LLVMUsed.emplace_back(GV);
 }
 
@@ -4071,9 +4071,17 @@
 }
   }
 
-  if (!IsHIPPinnedShadowVar)
+  // HIPPinnedShadowVar should remain in the final code object irrespective of
+  // whether it is used or not within the code. Add it to used list, so that
+  // it will not get eliminated when it is unused. Also, it is an extern var
+  // within device code, and it should *not* get initialized within device 
code.
+  if (IsHIPPinnedShadowVar)
+addUsedGlobal(GV, /*SkipCheck=*/true);
+  else
 GV->setInitializer(Init);
-  if (emitter) emitter->finalize(GV);
+
+  if (emitter)
+emitter->finalize(GV);
 
   // If it is safe to mark the global 'constant', do so now.
   GV->setConstant(!NeedsGlobalCtor && !NeedsGlobalDtor &&


Index: clang/test/CodeGenCUDA/hip-pinned-shadow.cu
===
--- clang/test/CodeGenCUDA/hip-pinned-shadow.cu
+++ clang/test/CodeGenCUDA/hip-pinned-shadow.cu
@@ -4,6 +4,8 @@
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEV %s
 // RUN: %clang_cc1 -triple x86_64 -std=c++11 \
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPHOST %s
+// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -std=c++11 -fvisibility hidden -fapply-global-visibility-to-externs \
+// RUN: -O3 -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEVUNSED %s
 
 struct textureReference {
   int a;
@@ -21,3 +23,5 @@
 // HIPDEV-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  define{{.*}}@_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  call i32 @__hipRegisterVar{{.*}}@tex{{.*}}i32 0, i32 4, i32 0, i32 0)
+// HIPDEVUNSED: @tex = external addrspace(1) global %struct.texture
+// HIPDEVUNSED-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
Index: clang/lib/CodeGen/CodeGenModule.h
===
--- clang/lib/CodeGen/CodeGenModule.h
+++ clang/lib/CodeGen/CodeGenModule.h
@@ -1037,7 +1037,7 @@
   void MaybeHandleStaticInExternC(const SomeDecl *D, llvm::GlobalValue *GV);
 
   /// Add a global to a list to be added to the llvm.used metadata.
-  void addUsedGlobal(llvm::GlobalValue *GV);
+  void addUsedGlobal(llvm::GlobalValue *GV, bool SkipCheck = false);
 
   /// Add a global to a list to be added to the llvm.compiler.used metadata.
   void addCompilerUsedGlobal(llvm::GlobalValue *GV);
Index: clang/lib/Cod

[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-02 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm added a comment.

In D75402#1901361 , @hliao wrote:

> BTW, why that variable cannot have an initializer? Suppose that initializer 
> is a trivial one, initializing to 0, would that cause any issue in the 
> compilation?


Initialization makes the global extern var declaration to become a global 
definition. And. moreover, it is not a new change from my side, I just happen 
to add a comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75402/new/

https://reviews.llvm.org/D75402



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-02 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm marked an inline comment as done.
hsmhsm added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1919
 
 void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV) {
   LLVMUsed.emplace_back(GV);

hliao wrote:
> hsmhsm wrote:
> > yaxunl wrote:
> > > hliao wrote:
> > > > This check should be removed completely instead it should be revised 
> > > > for the exceptions.
> > > How about add back this assertion but make an exception for global 
> > > variables with hip_pinned_shadow attribute.
> > Yeah, that is what I was thinking of initially - to remove this assertion 
> > only to hip_pinned_shadow vars, may, be creating a clone of this function 
> > something like - CodeGenModule::addUsedGlobalForHipPinnedShadows(), without 
> > assertion statement.
> > 
> > @hliao
> > 
> > What do you say?
> how about change that function prototype to
> 
> void addUsedGlobal(llvm::GlobalValue *GV, bool SkipCheck = false)
> 
> and assert check to
> 
> assert ((SkipCheck || !GV->isDeclaration()) && "");
> 
This also looks fine to me, and I can make these changes and submit the changes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75402/new/

https://reviews.llvm.org/D75402



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-02 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm marked an inline comment as done.
hsmhsm added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1919
 
 void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV) {
   LLVMUsed.emplace_back(GV);

yaxunl wrote:
> hliao wrote:
> > This check should be removed completely instead it should be revised for 
> > the exceptions.
> How about add back this assertion but make an exception for global variables 
> with hip_pinned_shadow attribute.
Yeah, that is what I was thinking of initially - to remove this assertion only 
to hip_pinned_shadow vars, may, be creating a clone of this function something 
like - CodeGenModule::addUsedGlobalForHipPinnedShadows(), without assertion 
statement.

@hliao

What do you say?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75402/new/

https://reviews.llvm.org/D75402



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-02-29 Thread Mahesha S via Phabricator via cfe-commits
hsmhsm created this revision.
hsmhsm added reviewers: yaxunl, tra.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

hip-pinned-shadow global var should remain in the final code object irrespective
of whether it is used or not within the code. Add it to used list, so that it
will not get eliminated when it is unused.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D75402

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGenCUDA/hip-pinned-shadow.cu


Index: clang/test/CodeGenCUDA/hip-pinned-shadow.cu
===
--- clang/test/CodeGenCUDA/hip-pinned-shadow.cu
+++ clang/test/CodeGenCUDA/hip-pinned-shadow.cu
@@ -4,6 +4,8 @@
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEV %s
 // RUN: %clang_cc1 -triple x86_64 -std=c++11 \
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPHOST %s
+// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -std=c++11 -fvisibility 
hidden -fapply-global-visibility-to-externs \
+// RUN: -O3 -emit-llvm -o - -x hip %s | FileCheck 
-check-prefixes=HIPDEVUNSED %s
 
 struct textureReference {
   int a;
@@ -21,3 +23,5 @@
 // HIPDEV-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  define{{.*}}@_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  call i32 @__hipRegisterVar{{.*}}@tex{{.*}}i32 0, i32 4, i32 0, 
i32 0)
+// HIPDEVUNSED: @tex = external addrspace(1) global %struct.texture
+// HIPDEVUNSED-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -1917,8 +1917,6 @@
 }
 
 void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV) {
-  assert(!GV->isDeclaration() &&
- "Only globals with definition can force usage.");
   LLVMUsed.emplace_back(GV);
 }
 
@@ -4071,9 +4069,17 @@
 }
   }
 
-  if (!IsHIPPinnedShadowVar)
+  // HIPPinnedShadowVar should remain in the final code object irrespective of
+  // whether it is used or not within the code. Add it to used list, so that
+  // it will not get eliminated when it is unused. Also, it is an extern var
+  // within device code, and it should *not* get initialized within device 
code.
+  if (IsHIPPinnedShadowVar)
+addUsedGlobal(GV);
+  else
 GV->setInitializer(Init);
-  if (emitter) emitter->finalize(GV);
+
+  if (emitter)
+emitter->finalize(GV);
 
   // If it is safe to mark the global 'constant', do so now.
   GV->setConstant(!NeedsGlobalCtor && !NeedsGlobalDtor &&


Index: clang/test/CodeGenCUDA/hip-pinned-shadow.cu
===
--- clang/test/CodeGenCUDA/hip-pinned-shadow.cu
+++ clang/test/CodeGenCUDA/hip-pinned-shadow.cu
@@ -4,6 +4,8 @@
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEV %s
 // RUN: %clang_cc1 -triple x86_64 -std=c++11 \
 // RUN: -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPHOST %s
+// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -std=c++11 -fvisibility hidden -fapply-global-visibility-to-externs \
+// RUN: -O3 -emit-llvm -o - -x hip %s | FileCheck -check-prefixes=HIPDEVUNSED %s
 
 struct textureReference {
   int a;
@@ -21,3 +23,5 @@
 // HIPDEV-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  define{{.*}}@_ZN7textureIfLi2ELi1EEC1Ev
 // HIPHOST:  call i32 @__hipRegisterVar{{.*}}@tex{{.*}}i32 0, i32 4, i32 0, i32 0)
+// HIPDEVUNSED: @tex = external addrspace(1) global %struct.texture
+// HIPDEVUNSED-NOT: declare{{.*}}void @_ZN7textureIfLi2ELi1EEC1Ev
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -1917,8 +1917,6 @@
 }
 
 void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV) {
-  assert(!GV->isDeclaration() &&
- "Only globals with definition can force usage.");
   LLVMUsed.emplace_back(GV);
 }
 
@@ -4071,9 +4069,17 @@
 }
   }
 
-  if (!IsHIPPinnedShadowVar)
+  // HIPPinnedShadowVar should remain in the final code object irrespective of
+  // whether it is used or not within the code. Add it to used list, so that
+  // it will not get eliminated when it is unused. Also, it is an extern var
+  // within device code, and it should *not* get initialized within device code.
+  if (IsHIPPinnedShadowVar)
+addUsedGlobal(GV);
+  else
 GV->setInitializer(Init);
-  if (emitter) emitter->finalize(GV);
+
+  if (emitter)
+emitter->finalize(GV);
 
   // If it is safe to mark the global 'constant', do so now.
   GV->setConstant(!NeedsGlobalCtor && !NeedsGlobalDtor &&
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits