[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-20 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:67
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());

arsenm wrote:
> gandhi21299 wrote:
> > arsenm wrote:
> > > I thought aliases could include embedded bitcasts of the function type, 
> > > so the function wouldn't directly appear here
> > Can you please elaborate on "include embedded bitcasts of the function 
> > type"? It's a consequence of the AlwaysInliner where the callee gets 
> > replaced by the alias to a function, ie. @func_alias gets replaced by @func 
> > in the inline-calls.ll test.
> Something like this where the alias changes the type from the original 
> function:
> 
> 
> ```
> @add1alias3 = alias float (float), bitcast (i32 (i32)* @add1 to float(float)*)
> ```
> 
I see, that will probably break the compiler since a bitcast expression is not 
a Function.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:67
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());

gandhi21299 wrote:
> arsenm wrote:
> > I thought aliases could include embedded bitcasts of the function type, so 
> > the function wouldn't directly appear here
> Can you please elaborate on "include embedded bitcasts of the function type"? 
> It's a consequence of the AlwaysInliner where the callee gets replaced by the 
> alias to a function, ie. @func_alias gets replaced by @func in the 
> inline-calls.ll test.
Something like this where the alias changes the type from the original function:


```
@add1alias3 = alias float (float), bitcast (i32 (i32)* @add1 to float(float)*)
```



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-20 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:67
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());

arsenm wrote:
> I thought aliases could include embedded bitcasts of the function type, so 
> the function wouldn't directly appear here
Can you please elaborate on "include embedded bitcasts of the function type"? 
It's a consequence of the AlwaysInliner where the callee gets replaced by the 
alias to a function, ie. @func_alias gets replaced by @func in the 
inline-calls.ll test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:67
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());

I thought aliases could include embedded bitcasts of the function type, so the 
function wouldn't directly appear here


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-19 Thread Anshil Gandhi via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG0567f0333176: [HIP] [AlwaysInliner] Disable AlwaysInliner to 
eliminate undefined symbols (authored by gandhi21299).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll

Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,6 @@
-; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
+; RUN: llc -mtriple r600-unknown-linux-gnu -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s --check-prefix=R600
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -9,7 +9,7 @@
   ret i32 %tmp0
 }
 
-; ALL: {{^}}kernel:
+; CHECK: {{^}}kernel:
 ; GCN-NOT: s_swappc_b64
 define amdgpu_kernel void @kernel(i32 addrspace(1)* %out) {
 entry:
@@ -18,12 +18,13 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; R600-NOT: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
-; ALL: {{^}}kernel3:
+; CHECK-NOT: {{^}}kernel3:
 ; GCN-NOT: s_swappc_b64
+; R600: {{^}}kernel3:
 define amdgpu_kernel void @kernel3(i32 addrspace(1)* %out) {
 entry:
   %tmp0 = call i32 @func_alias(i32 1)
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -61,7 +63,8 @@
 assert(Op.getImm() == 0);
 return nullptr;
   }
-
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());
 }
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -15,6 +15,7 @@
 #include "AMDGPU.h"
 #include "AMDGPUTargetMachine.h"
 #include "Utils/AMDGPUBaseInfo.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/CommandLine.h"
@@ -90,9 +91,13 @@
 
   SmallPtrSet FuncsToAlwaysInline;
   SmallPtrSet FuncsToNoInline;
+  Triple TT(M.getTargetTriple());
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (TT.getArch() == Triple::amdgcn &&
+  A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang -target x86_64-unknown-linux-gnu --offload-arch=gfx906 --cuda-device-only -nogpulib -nogpuinc -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5089,9 +5089,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.is

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-17 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added inline comments.



Comment at: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu:3
+
+// RUN: %clang -target x86_64-unknown-linux-gnu --offload-arch=gfx906 
--cuda-device-only -nogpulib -nogpuinc -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \

MaskRay wrote:
> non-driver tests prefer `%clang_cc1`.
> 
> `%clang` invokes the driver and has varying behaviors on different platforms. 
> Include paths/resource dir may be quite different.
Alias is not generated when I make the change to:

```
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx906 -aux-triple 
x86_64-unknown-linux-gnu \
// RUN:   -x hip -fcuda-is-device -fgpu-rdc -O3 -mllvm 
-amdgpu-early-inline-all=true \
// RUN:   -mllvm -amdgpu-function-calls=false -emit-llvm %s -o - | FileCheck %s
```



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-17 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments.



Comment at: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu:3
+
+// RUN: %clang -target x86_64-unknown-linux-gnu --offload-arch=gfx906 
--cuda-device-only -nogpulib -nogpuinc -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \

non-driver tests prefer `%clang_cc1`.

`%clang` invokes the driver and has varying behaviors on different platforms. 
Include paths/resource dir may be quite different.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-17 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

@thakis can you please check if this solution is sufficient? Thanks for 
bringing it up


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-17 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 380110.
gandhi21299 added a comment.

added -target option in the test amdgpu-alias-undef-symbols.cu


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll

Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,6 @@
-; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
+; RUN: llc -mtriple r600-unknown-linux-gnu -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s --check-prefix=R600
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -9,7 +9,7 @@
   ret i32 %tmp0
 }
 
-; ALL: {{^}}kernel:
+; CHECK: {{^}}kernel:
 ; GCN-NOT: s_swappc_b64
 define amdgpu_kernel void @kernel(i32 addrspace(1)* %out) {
 entry:
@@ -18,12 +18,13 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; R600-NOT: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
-; ALL: {{^}}kernel3:
+; CHECK-NOT: {{^}}kernel3:
 ; GCN-NOT: s_swappc_b64
+; R600: {{^}}kernel3:
 define amdgpu_kernel void @kernel3(i32 addrspace(1)* %out) {
 entry:
   %tmp0 = call i32 @func_alias(i32 1)
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -61,7 +63,8 @@
 assert(Op.getImm() == 0);
 return nullptr;
   }
-
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());
 }
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -15,6 +15,7 @@
 #include "AMDGPU.h"
 #include "AMDGPUTargetMachine.h"
 #include "Utils/AMDGPUBaseInfo.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/CommandLine.h"
@@ -90,9 +91,13 @@
 
   SmallPtrSet FuncsToAlwaysInline;
   SmallPtrSet FuncsToNoInline;
+  Triple TT(M.getTargetTriple());
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (TT.getArch() == Triple::amdgcn &&
+  A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang -target x86_64-unknown-linux-gnu --offload-arch=gfx906 --cuda-device-only -nogpulib -nogpuinc -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5089,9 +5089,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mcons

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

@gandhi21299 you may need to add "-target x86_64-unknown-linux-gnu" to your 
codegen test to avoid issue with Darwin toolchain.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-15 Thread Nico Weber via Phabricator via cfe-commits
thakis added a comment.

This breaks tests on Mac: http://45.33.8.238/mac/37119/step_7.txt

Please take a look and revert for now if it takes a while to fix.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-15 Thread Anshil Gandhi via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG03375a3fb33b: [HIP] [AlwaysInliner] Disable AlwaysInliner to 
eliminate undefined symbols (authored by gandhi21299).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll

Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,6 @@
-; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
+; RUN: llc -mtriple r600-unknown-linux-gnu -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s --check-prefix=R600
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -9,7 +9,7 @@
   ret i32 %tmp0
 }
 
-; ALL: {{^}}kernel:
+; CHECK: {{^}}kernel:
 ; GCN-NOT: s_swappc_b64
 define amdgpu_kernel void @kernel(i32 addrspace(1)* %out) {
 entry:
@@ -18,12 +18,13 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; R600-NOT: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
-; ALL: {{^}}kernel3:
+; CHECK-NOT: {{^}}kernel3:
 ; GCN-NOT: s_swappc_b64
+; R600: {{^}}kernel3:
 define amdgpu_kernel void @kernel3(i32 addrspace(1)* %out) {
 entry:
   %tmp0 = call i32 @func_alias(i32 1)
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -61,7 +63,8 @@
 assert(Op.getImm() == 0);
 return nullptr;
   }
-
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());
 }
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -15,6 +15,7 @@
 #include "AMDGPU.h"
 #include "AMDGPUTargetMachine.h"
 #include "Utils/AMDGPUBaseInfo.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/CommandLine.h"
@@ -90,9 +91,13 @@
 
   SmallPtrSet FuncsToAlwaysInline;
   SmallPtrSet FuncsToNoInline;
+  Triple TT(M.getTargetTriple());
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (TT.getArch() == Triple::amdgcn &&
+  A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -nogpulib -nogpuinc -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5089,9 +5089,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX(

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision.
yaxunl added a comment.
This revision is now accepted and ready to land.

LGTM. Thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-15 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-14 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

Passed ePSDB


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-14 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 379530.
gandhi21299 added a comment.

add a restrictions to what architecture AlwaysInliner should run on, updated 
the inline-calls.ll test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll

Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,6 @@
-; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
+; RUN: llc -mtriple amdgcn-unknown-linux-gnu -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
+; RUN: llc -mtriple r600-unknown-linux-gnu -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s --check-prefix=R600
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -9,7 +9,7 @@
   ret i32 %tmp0
 }
 
-; ALL: {{^}}kernel:
+; CHECK: {{^}}kernel:
 ; GCN-NOT: s_swappc_b64
 define amdgpu_kernel void @kernel(i32 addrspace(1)* %out) {
 entry:
@@ -18,12 +18,13 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; R600-NOT: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
-; ALL: {{^}}kernel3:
+; CHECK-NOT: {{^}}kernel3:
 ; GCN-NOT: s_swappc_b64
+; R600: {{^}}kernel3:
 define amdgpu_kernel void @kernel3(i32 addrspace(1)* %out) {
 entry:
   %tmp0 = call i32 @func_alias(i32 1)
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -61,7 +63,8 @@
 assert(Op.getImm() == 0);
 return nullptr;
   }
-
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());
 }
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -15,6 +15,7 @@
 #include "AMDGPU.h"
 #include "AMDGPUTargetMachine.h"
 #include "Utils/AMDGPUBaseInfo.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/CommandLine.h"
@@ -90,9 +91,13 @@
 
   SmallPtrSet FuncsToAlwaysInline;
   SmallPtrSet FuncsToNoInline;
+  Triple TT(M.getTargetTriple());
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (TT.getArch() == Triple::amdgcn &&
+  A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -nogpulib -nogpuinc -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5109,9 +5109,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-12 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added inline comments.



Comment at: llvm/test/CodeGen/AMDGPU/inline-calls.ll:3
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 

@tstellar  Is there a way to restrict the AlwaysInliner to only run on amdgcn 
architecture?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-08 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

Passed internal CI


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-08 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 378218.
gandhi21299 added a comment.

added -nogpulib and -nogpuinc flags to amdgpu-alias-undef-symbols.cu


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,5 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  
%s
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +17,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -61,7 +63,8 @@
 assert(Op.getImm() == 0);
 return nullptr;
   }
-
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());
 }
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -nogpulib -nogpuinc -x 
hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5103,9 +5103,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,5 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +17,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineF

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-08 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-08 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 377922.
gandhi21299 added a comment.

refreshing patch


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,5 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  
%s
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +17,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -61,7 +63,8 @@
 assert(Op.getImm() == 0);
 return nullptr;
   }
-
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());
 }
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S 
-o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5102,9 +5102,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,5 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +17,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/Cod

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-05 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-01 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 376559.
gandhi21299 added a comment.

- Since callees may alias to a function pointer, it makes sense for 
`getCalleeFunction(...)` to return a `Function` which is a cast of the operand 
of a `GlobalAlias`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll

Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,5 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +17,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/SIISelLowering.cpp
===
--- llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3007,6 +3007,7 @@
   bool IsSibCall = false;
   bool IsThisReturn = false;
   MachineFunction &MF = DAG.getMachineFunction();
+  GlobalAddressSDNode *GSD = dyn_cast(Callee);
 
   if (Callee.isUndef() || isNullConstant(Callee)) {
 if (!CLI.IsTailCall) {
@@ -3264,7 +3265,7 @@
   Ops.push_back(Callee);
   // Add a redundant copy of the callee global which will not be legalized, as
   // we need direct access to the callee later.
-  if (GlobalAddressSDNode *GSD = dyn_cast(Callee)) {
+  if (GSD) {
 const GlobalValue *GV = GSD->getGlobal();
 Ops.push_back(DAG.getTargetGlobalAddress(GV, DL, MVT::i64));
   } else {
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -61,7 +63,8 @@
 assert(Op.getImm() == 0);
 return nullptr;
   }
-
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());
 }
 
Index: llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
@@ -913,14 +913,17 @@
   if (Info.Callee.isReg()) {
 CallInst.addReg(Info.Callee.getReg());
 CallInst.addImm(0);
-  } else if (Info.Callee.isGlobal() && Info.Callee.getOffset() == 0) {
-// The call lowering lightly assumed we can directly encode a call target in
-// the instruction, which is not the case. Materialize the address here.
+  } else if (Info.Callee.isGlobal()) {
 const GlobalValue *GV = Info.Callee.getGlobal();
-auto Ptr = MIRBuilder.buildGlobalValue(
-  LLT::pointer(GV->getAddressSpace(), 64), GV);
-CallInst.addReg(Ptr.getReg(0));
-CallInst.add(Info.Callee);
+if (Info.Callee.getOffset() == 0) {
+  // The call lowering lightly assumed we can directly encode a call target
+  // in the instruction, which is not the case. Materialize the address
+  // here.
+  auto Ptr = MIRBuilder.buildGlobalValue(
+  LLT::pointer(GV->getAddressSpace(), 64), GV);
+  CallInst.addReg(Ptr.getReg(0));
+  CallInst.add(Info.Callee);
+}
   } else
 return false;
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// RE

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-10-01 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 376564.
gandhi21299 added a comment.

- eliminated changes in SIISelLowering


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,5 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  
%s
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +17,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/GlobalAlias.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -61,7 +63,8 @@
 assert(Op.getImm() == 0);
 return nullptr;
   }
-
+  if (auto *GA = dyn_cast(Op.getGlobal()))
+return cast(GA->getOperand(0));
   return cast(Op.getGlobal());
 }
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S 
-o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5102,9 +5102,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,5 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
 ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +17,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -29,6 +29,8 @@
 #include "SIMachineFunctionInfo.h"
 #include "llvm/Analysis/CallGraph.

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-30 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

inline-calls.ll failed on gfx908 due to the change in SIISelLowering.cpp, line 
3015. Without the change, there is a failure in AMDGPUResourceAnalysis.cpp, 
line 65 because Op.getGlobal() is not a Function.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-30 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 reclaimed this revision.
gandhi21299 added a comment.

Sorry, that was a mistake.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-29 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp:96-97
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);

yaxunl wrote:
> If we do this for older GPU's (e.g. Tonga/redwood), IR's using aliases will 
> fail on them. I don't think it is acceptable.
> 
> Is it possible to restrict this change to gfx9 and above? Or should we 
> introduce some feature to indicate 'alias support' and use that to restrict 
> this change to subtargets supporting this feature.
Restricting this change to gfx9 and above sounds simpler and more relevant with 
the problem as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments.



Comment at: llvm/test/CodeGen/AMDGPU/inline-calls.ll:1
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  
%s
 

need to add check for gfx906 and gfx1030


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp:96-97
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);

If we do this for older GPU's (e.g. Tonga/redwood), IR's using aliases will 
fail on them. I don't think it is acceptable.

Is it possible to restrict this change to gfx9 and above? Or should we 
introduce some feature to indicate 'alias support' and use that to restrict 
this change to subtargets supporting this feature.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-29 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 375655.
gandhi21299 added a comment.

- declare failure when lowering an accessor of a callee which is not a 
function, in GlobalISel


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll

Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,4 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +16,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/SIISelLowering.cpp
===
--- llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3007,6 +3007,13 @@
   bool IsSibCall = false;
   bool IsThisReturn = false;
   MachineFunction &MF = DAG.getMachineFunction();
+  GlobalAddressSDNode *GSD = dyn_cast(Callee);
+
+  if (GSD) {
+const GlobalValue *GV = GSD->getGlobal();
+if (!isa(GV))
+  return lowerUnhandledCall(CLI, InVals, "callee is not a function ");
+  }
 
   if (Callee.isUndef() || isNullConstant(Callee)) {
 if (!CLI.IsTailCall) {
@@ -3264,7 +3271,7 @@
   Ops.push_back(Callee);
   // Add a redundant copy of the callee global which will not be legalized, as
   // we need direct access to the callee later.
-  if (GlobalAddressSDNode *GSD = dyn_cast(Callee)) {
+  if (GSD) {
 const GlobalValue *GV = GSD->getGlobal();
 Ops.push_back(DAG.getTargetGlobalAddress(GV, DL, MVT::i64));
   } else {
Index: llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
@@ -913,14 +913,19 @@
   if (Info.Callee.isReg()) {
 CallInst.addReg(Info.Callee.getReg());
 CallInst.addImm(0);
-  } else if (Info.Callee.isGlobal() && Info.Callee.getOffset() == 0) {
-// The call lowering lightly assumed we can directly encode a call target in
-// the instruction, which is not the case. Materialize the address here.
+  } else if (Info.Callee.isGlobal()) {
 const GlobalValue *GV = Info.Callee.getGlobal();
-auto Ptr = MIRBuilder.buildGlobalValue(
-  LLT::pointer(GV->getAddressSpace(), 64), GV);
-CallInst.addReg(Ptr.getReg(0));
-CallInst.add(Info.Callee);
+if (!isa(GV))
+  return false;
+if (Info.Callee.getOffset() == 0) {
+  // The call lowering lightly assumed we can directly encode a call target
+  // in the instruction, which is not the case. Materialize the address
+  // here.
+  auto Ptr = MIRBuilder.buildGlobalValue(
+  LLT::pointer(GV->getAddressSpace(), 64), GV);
+  CallInst.addReg(Ptr.getReg(0));
+  CallInst.add(Info.Callee);
+}
   } else
 return false;
 
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-27 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

@yaxunl Should inline-calls.ll be converted into an expected failing test or 
removed? (to avoid cast failure in AMDGPUResourceAnalysis to break the test)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-27 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 375284.
gandhi21299 marked an inline comment as done.
gandhi21299 added a comment.

- added the `REQUIRES` line as requested by Sam


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll

Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,4 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +16,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/SIISelLowering.cpp
===
--- llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3007,6 +3007,13 @@
   bool IsSibCall = false;
   bool IsThisReturn = false;
   MachineFunction &MF = DAG.getMachineFunction();
+  GlobalAddressSDNode *GSD = dyn_cast(Callee);
+
+  if (GSD) {
+const GlobalValue *GV = GSD->getGlobal();
+if (!isa(GV))
+  return lowerUnhandledCall(CLI, InVals, "callee is not a function ");
+  }
 
   if (Callee.isUndef() || isNullConstant(Callee)) {
 if (!CLI.IsTailCall) {
@@ -3264,7 +3271,7 @@
   Ops.push_back(Callee);
   // Add a redundant copy of the callee global which will not be legalized, as
   // we need direct access to the callee later.
-  if (GlobalAddressSDNode *GSD = dyn_cast(Callee)) {
+  if (GSD) {
 const GlobalValue *GV = GSD->getGlobal();
 Ops.push_back(DAG.getTargetGlobalAddress(GV, DL, MVT::i64));
   } else {
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target, clang-driver
+
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5107,9 +5107,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-27 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

In D109707#3016438 , @gandhi21299 
wrote:

> - replaced a `cast` with a `dyn_cast` since the return value from 
> `getCalleeFunction()` is not always a Function
> - `RUN on line 2` was causing 2 more scalar registers to be used on tonga due 
> to @func_alias not being inlined, hence I eliminated that test
> - `RUN on line 3` generated a call instruction to an aliased function which 
> is not supported on r600 (according to @arsenm ), hence I eliminated that 
> test as well

@yaxunl


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

Pls update the description of the patch and also make sure it passes internal 
CI.




Comment at: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu:1
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S 
-o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \

this test needs

// REQUIRES: amdgpu-registered-target, clang-driver



Comment at: llvm/test/CodeGen/AMDGPU/inline-calls.ll:2-3
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  
%s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 

why these two lines are removed?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-27 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

It does not look like function calls are supported yet in AMDGPUCallLowering, 
is that correct?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-27 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 375079.
gandhi21299 added a comment.

- Declare an unhandled call lowering in SelectionDAG when a callee is 
encountered which cannot be casted into a Function
- I am still investigating the effects on GlobalISel side of things, there 
seems to be a problem when lowering a call to `@func` in `@kernel` as well.
- inline-calls.ll is expected to fail with this patch, we could turn it into a 
negative test depending on how the work goes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,4 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  
%s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +16,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/SIISelLowering.cpp
===
--- llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3007,6 +3007,13 @@
   bool IsSibCall = false;
   bool IsThisReturn = false;
   MachineFunction &MF = DAG.getMachineFunction();
+  GlobalAddressSDNode *GSD = dyn_cast(Callee);
+
+  if (GSD) {
+const GlobalValue *GV = GSD->getGlobal();
+if (!isa(GV))
+  return lowerUnhandledCall(CLI, InVals, "callee is not a function ");
+  }
 
   if (Callee.isUndef() || isNullConstant(Callee)) {
 if (!CLI.IsTailCall) {
@@ -3264,7 +3271,7 @@
   Ops.push_back(Callee);
   // Add a redundant copy of the callee global which will not be legalized, as
   // we need direct access to the callee later.
-  if (GlobalAddressSDNode *GSD = dyn_cast(Callee)) {
+  if (GSD) {
 const GlobalValue *GV = GSD->getGlobal();
 Ops.push_back(DAG.getTargetGlobalAddress(GV, DL, MVT::i64));
   } else {
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,15 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S 
-o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,4 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-N

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:65
 
-  return cast(Op.getGlobal());
+  return dyn_cast(Op.getGlobal());
 }

gandhi21299 wrote:
> arsenm wrote:
> > I think this is not the right place for this. If we can determine the 
> > callee function, we should have directly set it in the instruction during 
> > call lowering
> Which file would that be in?
SIISelLowering and AMDGPUCallLowering


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-23 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:65
 
-  return cast(Op.getGlobal());
+  return dyn_cast(Op.getGlobal());
 }

arsenm wrote:
> I think this is not the right place for this. If we can determine the callee 
> function, we should have directly set it in the instruction during call 
> lowering
Which file would that be in?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:65
 
-  return cast(Op.getGlobal());
+  return dyn_cast(Op.getGlobal());
 }

I think this is not the right place for this. If we can determine the callee 
function, we should have directly set it in the instruction during call lowering


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-23 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 374369.
gandhi21299 added a comment.

- refreshing patch


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,4 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  
%s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +16,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -62,7 +62,7 @@
 return nullptr;
   }
 
-  return cast(Op.getGlobal());
+  return dyn_cast(Op.getGlobal());
 }
 
 static bool hasAnyNonFlatUseOfReg(const MachineRegisterInfo &MRI,
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,15 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S 
-o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,4 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +16,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -62,7 +62,7 @@
 return nullptr;
   }
 
-  return cast(Op.getGlobal());
+  return dyn_cast(Op.getGlobal());
 }
 
 static bool hasAnyNonFlatUseOfReg(const MachineRegisterInfo &MRI,
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-22 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 374354.
gandhi21299 added a comment.

- replaced a `cast` with a `dyn_cast` since the return value from 
`getCalleeFunction()` is not always a Function
- `RUN on line 2` was causing 2 more scalar registers to be used on tonga due 
to @func_alias not being inlined, hence I eliminated that test
- `RUN on line 3` generated a call instruction to an aliased function which is 
not supported on r600 (according to @arsenm ), hence I eliminated that test as 
well


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
  llvm/test/CodeGen/AMDGPU/inline-calls.ll


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,4 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  
%s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +16,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -62,7 +62,7 @@
 return nullptr;
   }
 
-  return cast(Op.getGlobal());
+  return dyn_cast(Op.getGlobal());
 }
 
 static bool hasAnyNonFlatUseOfReg(const MachineRegisterInfo &MRI,
Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (A.getLinkage() != GlobalValue::InternalLinkage)
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,15 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S 
-o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/test/CodeGen/AMDGPU/inline-calls.ll
===
--- llvm/test/CodeGen/AMDGPU/inline-calls.ll
+++ llvm/test/CodeGen/AMDGPU/inline-calls.ll
@@ -1,6 +1,4 @@
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck  %s
-; RUN: llc -march=r600 -mcpu=redwood -verify-machineinstrs < %s | FileCheck %s
 
 ; ALL-NOT: {{^}}func:
 define internal i32 @func(i32 %a) {
@@ -18,8 +16,8 @@
   ret void
 }
 
-; CHECK-NOT: func_alias
-; ALL-NOT: func_alias
+; CHECK: func_alias
+; ALL: func_alias
 @func_alias = alias i32 (i32), i32 (i32)* @func
 
 ; ALL: {{^}}kernel3:
Index: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -62,7 +62,7 @@
 return nullptr;
 

[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-17 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 373257.
gandhi21299 marked an inline comment as done.
gandhi21299 added a comment.

- Prevent removing alias if the GlobalAlias does not have internal linkage


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp


Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (!A.hasInternalLinkage())
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,16 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S 
-o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp
@@ -93,6 +93,8 @@
 
   for (GlobalAlias &A : M.aliases()) {
 if (Function* F = dyn_cast(A.getAliasee())) {
+  if (!A.hasInternalLinkage())
+continue;
   A.replaceAllUsesWith(F);
   AliasesToRemove.push_back(&A);
 }
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,16 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D109707#3004869 , @gandhi21299 
wrote:

> Internal linkage detection works great for our purposes but it causes a 
> failure in llvm/test/CodeGen/AMDGPU/inline-calls.ll due to `@func_alias` 
> unable to be casted into a `Function`. If we pass through that, the 
> `@kernel3` causes the error: `scalar registers (98) exceeds limit (96) in 
> function 'kernel3'`.

That almost sounds like using the wrong subtarget for the alias




Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:694
 if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
+  PM.addPass(AMDGPUAlwaysInlinePass(false));
   });

This needs a backend test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-17 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

@yaxunl I think we have two ways to go from here:

1. If appropriate, reset the maximum number of scalar registers allowed in 
`@kernel3` (inline-calls.ll) to fix the test.
2. Determine a stronger condition for inlining.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-16 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

Internal linkage detection works great for our purposes but it causes a failure 
in llvm/test/CodeGen/AMDGPU/inline-calls.ll due to `@func_alias` unable to be 
casted into a `Function`. If we pass through that, the `@kernel3` causes the 
error: `scalar registers (98) exceeds limit (96) in function 'kernel3'`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D109707#3004108 , @gandhi21299 
wrote:

> @yaxunl Under what criteria should an alias not be removed?

If the linkage of the alias is not internal, then it should not be removed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-16 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added a comment.

@yaxunl Under what criteria should an alias not be removed?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

I think you may try fixing the following line:

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp#L97


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-15 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 372573.
gandhi21299 added a comment.

- converted the HIP test into a CUDA test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -691,7 +691,7 @@
   PM.addPass(GlobalDCEPass());
 }
 if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
+  PM.addPass(AMDGPUAlwaysInlinePass(false));
   });
 
   PB.registerCGSCCOptimizerLateEPCallback(
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,16 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S 
-o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -691,7 +691,7 @@
   PM.addPass(GlobalDCEPass());
 }
 if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
+  PM.addPass(AMDGPUAlwaysInlinePass(false));
   });
 
   PB.registerCGSCCOptimizerLateEPCallback(
Index: clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/amdgpu-alias-undef-symbols.cu
@@ -0,0 +1,16 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -x hip -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-15 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 372551.
gandhi21299 added a comment.

- added the include header for HIP runtime


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -691,7 +691,7 @@
   PM.addPass(GlobalDCEPass());
 }
 if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
+  PM.addPass(AMDGPUAlwaysInlinePass(false));
   });
 
   PB.registerCGSCCOptimizerLateEPCallback(
Index: clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
===
--- /dev/null
+++ clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
@@ -0,0 +1,16 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "hip/hip_runtime.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -691,7 +691,7 @@
   PM.addPass(GlobalDCEPass());
 }
 if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
+  PM.addPass(AMDGPUAlwaysInlinePass(false));
   });
 
   PB.registerCGSCCOptimizerLateEPCallback(
Index: clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
===
--- /dev/null
+++ clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
@@ -0,0 +1,16 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+#include "hip/hip_runtime.h"
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-14 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5069
   // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
 CmdArgs.push_back("-mconstructor-aliases");

gandhi21299 wrote:
> arsenm wrote:
> > This looks like an unrelated change?
> Ahh yes, I will get rid of it.
This was part of a revert that is required for this patch to function.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-14 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 updated this revision to Diff 372528.
gandhi21299 marked an inline comment as done.
gandhi21299 added a comment.

- set `GlobalOpt` parameter to false by default to disallow alias elimination 
when the options EarlyInlineAll and EnableFunctionCalls are true and false, 
respectively.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -691,7 +691,7 @@
   PM.addPass(GlobalDCEPass());
 }
 if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
+  PM.addPass(AMDGPUAlwaysInlinePass(false));
   });
 
   PB.registerCGSCCOptimizerLateEPCallback(
Index: clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
===
--- /dev/null
+++ clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
@@ -0,0 +1,14 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -691,7 +691,7 @@
   PM.addPass(GlobalDCEPass());
 }
 if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
+  PM.addPass(AMDGPUAlwaysInlinePass(false));
   });
 
   PB.registerCGSCCOptimizerLateEPCallback(
Index: clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
===
--- /dev/null
+++ clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
@@ -0,0 +1,14 @@
+// RUN: %clang --offload-arch=gfx906 --cuda-device-only -emit-llvm -S -o - %s \
+// RUN:   -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false | \
+// RUN:   FileCheck %s
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5084,9 +5084,9 @@
   }
 
   // Enable -mconstructor-aliases except on darwin, where we have to work around
-  // a linker bug (see ), and CUDA/AMDGPU device code,
-  // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  // a linker bug (see ), and CUDA device code, where
+  // aliases aren't supported.
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-13 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 marked an inline comment as done.
gandhi21299 added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5069
   // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
 CmdArgs.push_back("-mconstructor-aliases");

arsenm wrote:
> This looks like an unrelated change?
Ahh yes, I will get rid of it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D109707#2998095 , @aeubanks wrote:

> should the `amdgpu-early-inline-all` flag be deleted?

No. It is still used by HIP. We will deprecate it in the feature, but it is not 
ready yet.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-13 Thread Arthur Eubanks via Phabricator via cfe-commits
aeubanks added a comment.

should the `amdgpu-early-inline-all` flag be deleted?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

We cannot disable early inline all since this will cause performance 
regressions. Instead, the early inline all pass should be fixed so it does not 
remove aliases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5069
   // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
 CmdArgs.push_back("-mconstructor-aliases");

This looks like an unrelated change?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

While I think the early inliner is largely obsolete, it should still handle 
aliases correctly


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109707/new/

https://reviews.llvm.org/D109707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109707: [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols

2021-09-13 Thread Anshil Gandhi via Phabricator via cfe-commits
gandhi21299 created this revision.
gandhi21299 added reviewers: yaxunl, aeubanks.
Herald added subscribers: foad, kerbowa, hiraditya, tpr, nhaehnle, jvesely, 
arsenm.
gandhi21299 requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

By default clang emits complete contructors as alias of base constructors if 
they are the same. The backend is supposed to emit symbols for the alias, 
otherwise it causes undefined symbols. @yaxunl observed that this issue is 
related to the llvm options `-amdgpu-early-inline-all=true` and 
`-amdgpu-function-calls=false`. Disabling the AlwaysInliner fixes this issue.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D109707

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -689,8 +689,6 @@
 if (InternalizeSymbols) {
   PM.addPass(GlobalDCEPass());
 }
-if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
   });
 
   PB.registerCGSCCOptimizerLateEPCallback(
Index: clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
===
--- /dev/null
+++ clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
@@ -0,0 +1,14 @@
+// RUN: %clang -c --offload-arch=gfx906 --cuda-device-only -emit-llvm -S -o - 
%s
+//  -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm 
-amdgpu-function-calls=false
+//  FileCheck %s
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), 
void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5066,7 +5066,7 @@
   // Enable -mconstructor-aliases except on darwin, where we have to work 
around
   // a linker bug (see ), and CUDA/AMDGPU device code,
   // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we


Index: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
===
--- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -689,8 +689,6 @@
 if (InternalizeSymbols) {
   PM.addPass(GlobalDCEPass());
 }
-if (EarlyInlineAll && !EnableFunctionCalls)
-  PM.addPass(AMDGPUAlwaysInlinePass());
   });
 
   PB.registerCGSCCOptimizerLateEPCallback(
Index: clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
===
--- /dev/null
+++ clang/test/CodeGen/amdgpu-alias-undef-symbols.hip
@@ -0,0 +1,14 @@
+// RUN: %clang -c --offload-arch=gfx906 --cuda-device-only -emit-llvm -S -o - %s
+//  -fgpu-rdc -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false
+//  FileCheck %s
+
+// CHECK: %struct.B = type { i8 }
+struct B {
+
+  // CHECK: @_ZN1BC1Ei = hidden unnamed_addr alias void (%struct.B*, i32), void (%struct.B*, i32)* @_ZN1BC2Ei
+  __device__ B(int x);
+};
+
+__device__ B::B(int x) {
+
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5066,7 +5066,7 @@
   // Enable -mconstructor-aliases except on darwin, where we have to work around
   // a linker bug (see ), and CUDA/AMDGPU device code,
   // where aliases aren't supported.
-  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX() && !RawTriple.isAMDGPU())
+  if (!RawTriple.isOSDarwin() && !RawTriple.isNVPTX())
 CmdArgs.push_back("-mconstructor-aliases");
 
   // Darwin's kernel doesn't support guard variables; just die if we
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits