[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-11-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 558095.
arsenm added a comment.

Drop bitcode auto upgrade handling


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700

Files:
  clang/lib/CodeGen/Targets/AMDGPU.cpp
  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel-linking.cl
  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/IR/CMakeLists.txt
  llvm/lib/Target/AMDGPU/AMDGPU.h
  llvm/lib/Target/AMDGPU/AMDGPUExportKernelRuntimeHandles.cpp
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
  llvm/lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/CMakeLists.txt
  llvm/test/CodeGen/AMDGPU/amdgpu-export-kernel-runtime-handles.ll
  llvm/test/CodeGen/AMDGPU/enqueue-kernel.ll
  llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
  llvm/test/CodeGen/AMDGPU/llc-pipeline.ll

Index: llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
===
--- llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -37,7 +37,7 @@
 ; GCN-O0-NEXT:Dominator Tree Construction
 ; GCN-O0-NEXT:Basic Alias Analysis (stateless AA impl)
 ; GCN-O0-NEXT:Function Alias Analysis Results
-; GCN-O0-NEXT:Lower OpenCL enqueued blocks
+; GCN-O0-NEXT:Externalize enqueued block runtime handles
 ; GCN-O0-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O0-NEXT:FunctionPass Manager
 ; GCN-O0-NEXT:  Expand Atomic instructions
@@ -178,7 +178,7 @@
 ; GCN-O1-NEXT:Dominator Tree Construction
 ; GCN-O1-NEXT:Basic Alias Analysis (stateless AA impl)
 ; GCN-O1-NEXT:Function Alias Analysis Results
-; GCN-O1-NEXT:Lower OpenCL enqueued blocks
+; GCN-O1-NEXT:Externalize enqueued block runtime handles
 ; GCN-O1-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O1-NEXT:AMDGPU Attributor
 ; GCN-O1-NEXT:  FunctionPass Manager
@@ -445,7 +445,7 @@
 ; GCN-O1-OPTS-NEXT:Dominator Tree Construction
 ; GCN-O1-OPTS-NEXT:Basic Alias Analysis (stateless AA impl)
 ; GCN-O1-OPTS-NEXT:Function Alias Analysis Results
-; GCN-O1-OPTS-NEXT:Lower OpenCL enqueued blocks
+; GCN-O1-OPTS-NEXT:Externalize enqueued block runtime handles
 ; GCN-O1-OPTS-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O1-OPTS-NEXT:AMDGPU Attributor
 ; GCN-O1-OPTS-NEXT:  FunctionPass Manager
@@ -736,7 +736,7 @@
 ; GCN-O2-NEXT:Dominator Tree Construction
 ; GCN-O2-NEXT:Basic Alias Analysis (stateless AA impl)
 ; GCN-O2-NEXT:Function Alias Analysis Results
-; GCN-O2-NEXT:Lower OpenCL enqueued blocks
+; GCN-O2-NEXT:Externalize enqueued block runtime handles
 ; GCN-O2-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O2-NEXT:AMDGPU Attributor
 ; GCN-O2-NEXT:  FunctionPass Manager
@@ -1037,7 +1037,7 @@
 ; GCN-O3-NEXT:Dominator Tree Construction
 ; GCN-O3-NEXT:Basic Alias Analysis (stateless AA impl)
 ; GCN-O3-NEXT:Function Alias Analysis Results
-; GCN-O3-NEXT:Lower OpenCL enqueued blocks
+; GCN-O3-NEXT:Externalize enqueued block runtime handles
 ; GCN-O3-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O3-NEXT:AMDGPU Attributor
 ; GCN-O3-NEXT:  FunctionPass Manager
Index: llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
===
--- llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
+++ llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
@@ -14,7 +14,8 @@
 %struct.B = type { ptr addrspace(1) }
 %opencl.clk_event_t = type opaque
 
-@__test_block_invoke_kernel_runtime_handle = external addrspace(1) externally_initialized constant ptr addrspace(1)
+@__test_block_invoke_kernel_runtime_handle = external addrspace(1) externally_initialized constant ptr addrspace(1), section ".amdgpu.kernel.runtime.handle"
+@not.a.handle = external addrspace(1) externally_initialized constant ptr addrspace(1)
 
 ; CHECK:  ---
 ; CHECK-NEXT: amdhsa.kernels:
@@ -1678,7 +1679,7 @@
 ; CHECK:  .name:   __test_block_invoke_kernel
 ; CHECK:  .symbol: __test_block_invoke_kernel.kd
 define amdgpu_kernel void @__test_block_invoke_kernel(
-<{ i32, i32, ptr, ptr addrspace(1), i8 }> %arg) #1
+<{ i32, i32, ptr, ptr addrspace(1), i8 }> %arg) #1 !associated !112
 !kernel_arg_addr_space !1 !kernel_arg_access_qual !2 !kernel_arg_type !110
 !kernel_arg_base_type !110 !kernel_arg_type_qual !4 {
   ret void
@@ -1734,6 +1735,29 @@
   ret void
 }
 
+; Make sure the device_enqueue_symbol is not reported
+; CHECK: - .args:   []
+; CHECK-NEXT: .group_segment_fixed_size: 0
+; CHECK-NEXT: 

[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-11-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/lib/IR/CMakeLists.txt:84
   Demangle
+  TransformUtils
+

This introduces a circular dependency between LLVMCore and TransformUtils. 
Options are:

1. Move appendToUsed into Module
2. Don't bother with bitcode compatibility for this
3. Avoid depending on llvm.used. I know I tried to do this, but it was so long 
ago that I don't remember how I ended up on this solution.
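
To make the circular dependency concrete, here is a minimal standalone sketch (not part of the patch) that checks a toy component graph for a cycle. The graph type, the helper names `reaches` and `inCycle`, and the assumption that TransformUtils already links against Core are all illustrative; only the new Core -> TransformUtils edge comes from the CMake hunk quoted above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of link-component dependencies: component -> components it needs.
typedef std::map<std::string, std::vector<std::string> > DepGraph;

// Depth-first search: true if `target` is reachable from `from` via deps.
inline bool reaches(const DepGraph &g, const std::string &from,
                    const std::string &target, std::set<std::string> &seen) {
  if (!seen.insert(from).second)
    return false; // already explored this component
  DepGraph::const_iterator it = g.find(from);
  if (it == g.end())
    return false; // component has no recorded dependencies
  for (const std::string &next : it->second) {
    if (next == target || reaches(g, next, target, seen))
      return true;
  }
  return false;
}

// A component is part of a cycle when it can reach itself through its deps.
inline bool inCycle(const DepGraph &g, const std::string &lib) {
  std::set<std::string> seen;
  return reaches(g, lib, lib, seen);
}
```

With an assumed TransformUtils -> Core edge already present, adding Core -> TransformUtils makes `inCycle(g, "Core")` return true, which is the situation the three options above are meant to avoid.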


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-07-07 Thread Sergei Barannikov via Phabricator via cfe-commits
barannikov88 added inline comments.



Comment at: clang/lib/CodeGen/Targets/AMDGPU.cpp:520
+static llvm::StructType *getAMDGPUKernelDescriptorType(llvm::LLVMContext &C) {
+  llvm::Type *Int8 = llvm::IntegerType::getInt8Ty(C);
+  llvm::Type *Int16 = llvm::IntegerType::getInt16Ty(C);

Minor suggestion: you can get these types from CGF / CGM (Int8Ty etc.)



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700



[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-04-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp:299
+
+  Attrs.mRuntimeHandle = getEnqueuedBlockSymbolName(TM, Func);
 }

kzhuravl wrote:
> Do we really need/want to update code object v2?
As long as the code is here, yes. Not updating it would mean maintaining two 
paths in the implementation. This is just changing the internal representation.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700



[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-04-04 Thread Konstantin Zhuravlyov via Phabricator via cfe-commits
kzhuravl added a comment.

Overall looks good.




Comment at: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp:299
+
+  Attrs.mRuntimeHandle = getEnqueuedBlockSymbolName(TM, Func);
 }

Do we really need/want to update code object v2?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700



[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-02-05 Thread Sameer Sahasrabuddhe via Phabricator via cfe-commits
sameerds added a comment.

LGTM, to the extent that I can see that the change does what is advertised, and 
the ultimately emitted HSA metadata preserves the current contract with the 
runtime.

A couple of tests could use more explanatory comments, as noted.




Comment at: clang/lib/CodeGen/TargetInfo.cpp:12581
+  Mod, HandleTy,
+  /*isConstant=*/true, llvm::GlobalValue::InternalLinkage,
+  /*Initializer=*/RuntimeHandleInitializer, RuntimeHandleName,

jmmartinez wrote:
> Just a cosmetical remark: Is there any reason to keep the `/*isConstant=*/`, 
> `/*Initializer=*/`, ... comments? I think it would be better to avoid them.
FWIW, I find these comments very helpful when spelunking through code. I could 
sympathise with not needing `Initializer=` because the value name makes it 
clear. But an undecorated constant literal like "true" or "10" or "nullptr" 
works best when accompanied by a comment.



Comment at: llvm/test/Bitcode/amdgpu-autoupgrade-enqueued-block.ll:69
+
+; __enqueue_kernel* functions may get inlined
+define amdgpu_kernel void @inlined_caller(ptr addrspace(1) %a, i8 %b, ptr addrspace(1) %c, i64 %d) {

I did not understand what is being tested here.



Comment at: llvm/test/CodeGen/AMDGPU/amdgpu-export-kernel-runtime-handles.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -amdgpu-export-kernel-runtime-handles < %s | FileCheck %s
+

Is there any visible effect of the pass being tested? Or is the intention 
simply to check that the output is the same as the input and that there is no error?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700



[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-01-16 Thread Juan Manuel Martinez Caamaño via Phabricator via cfe-commits
jmmartinez added inline comments.



Comment at: clang/lib/CodeGen/TargetInfo.cpp:12581
+  Mod, HandleTy,
+  /*isConstant=*/true, llvm::GlobalValue::InternalLinkage,
+  /*Initializer=*/RuntimeHandleInitializer, RuntimeHandleName,

Just a cosmetical remark: Is there any reason to keep the `/*isConstant=*/`, 
`/*Initializer=*/`, ... comments? I think it would be better to avoid them.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700



[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-01-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 489394.
arsenm added a comment.

Rename


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel-linking.cl
  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/IR/CMakeLists.txt
  llvm/lib/Target/AMDGPU/AMDGPU.h
  llvm/lib/Target/AMDGPU/AMDGPUExportKernelRuntimeHandles.cpp
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
  llvm/lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/CMakeLists.txt
  llvm/test/Bitcode/amdgpu-autoupgrade-enqueued-block.ll
  llvm/test/CodeGen/AMDGPU/amdgpu-export-kernel-runtime-handles.ll
  llvm/test/CodeGen/AMDGPU/enqueue-kernel.ll
  llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
  llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
  llvm/test/CodeGen/AMDGPU/llc-pipeline.ll

Index: llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
===
--- llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -41,7 +41,7 @@
 ; GCN-O0-NEXT:Call Graph SCC Pass Manager
 ; GCN-O0-NEXT:  Inliner for always_inline functions
 ; GCN-O0-NEXT:A No-Op Barrier Pass
-; GCN-O0-NEXT:Lower OpenCL enqueued blocks
+; GCN-O0-NEXT:Externalize enqueued block runtime handles
 ; GCN-O0-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O0-NEXT:FunctionPass Manager
 ; GCN-O0-NEXT:  Expand Atomic instructions
@@ -186,7 +186,7 @@
 ; GCN-O1-NEXT:Call Graph SCC Pass Manager
 ; GCN-O1-NEXT:  Inliner for always_inline functions
 ; GCN-O1-NEXT:A No-Op Barrier Pass
-; GCN-O1-NEXT:Lower OpenCL enqueued blocks
+; GCN-O1-NEXT:Externalize enqueued block runtime handles
 ; GCN-O1-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O1-NEXT:FunctionPass Manager
 ; GCN-O1-NEXT:  Infer address spaces
@@ -454,7 +454,7 @@
 ; GCN-O1-OPTS-NEXT:Call Graph SCC Pass Manager
 ; GCN-O1-OPTS-NEXT:  Inliner for always_inline functions
 ; GCN-O1-OPTS-NEXT:A No-Op Barrier Pass
-; GCN-O1-OPTS-NEXT:Lower OpenCL enqueued blocks
+; GCN-O1-OPTS-NEXT:Externalize enqueued block runtime handles
 ; GCN-O1-OPTS-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O1-OPTS-NEXT:FunctionPass Manager
 ; GCN-O1-OPTS-NEXT:  Infer address spaces
@@ -754,7 +754,7 @@
 ; GCN-O2-NEXT:Call Graph SCC Pass Manager
 ; GCN-O2-NEXT:  Inliner for always_inline functions
 ; GCN-O2-NEXT:A No-Op Barrier Pass
-; GCN-O2-NEXT:Lower OpenCL enqueued blocks
+; GCN-O2-NEXT:Externalize enqueued block runtime handles
 ; GCN-O2-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O2-NEXT:FunctionPass Manager
 ; GCN-O2-NEXT:  Infer address spaces
@@ -1057,7 +1057,7 @@
 ; GCN-O3-NEXT:Call Graph SCC Pass Manager
 ; GCN-O3-NEXT:  Inliner for always_inline functions
 ; GCN-O3-NEXT:A No-Op Barrier Pass
-; GCN-O3-NEXT:Lower OpenCL enqueued blocks
+; GCN-O3-NEXT:Externalize enqueued block runtime handles
 ; GCN-O3-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O3-NEXT:FunctionPass Manager
 ; GCN-O3-NEXT:  Infer address spaces
Index: llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
===
--- llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
+++ llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
@@ -14,7 +14,8 @@
 %struct.B = type { ptr addrspace(1)}
 %opencl.clk_event_t = type opaque
 
-@__test_block_invoke_kernel_runtime_handle = external addrspace(1) externally_initialized constant ptr addrspace(1)
+@__test_block_invoke_kernel_runtime_handle = external addrspace(1) externally_initialized constant ptr addrspace(1), section ".amdgpu.kernel.runtime.handle"
+@not.a.handle = external addrspace(1) externally_initialized constant ptr addrspace(1)
 
 ; CHECK: ---
 ; CHECK:  Version: [ 1, 0 ]
@@ -1808,7 +1809,7 @@
 ; CHECK-NEXT:   ValueKind: HiddenMultiGridSyncArg
 ; CHECK-NEXT:   AddrSpaceQual: Global
 define amdgpu_kernel void @__test_block_invoke_kernel(
-<{ i32, i32, ptr, ptr addrspace(1), i8 }> %arg) #1
+<{ i32, i32, ptr, ptr addrspace(1), i8 }> %arg) #1 !associated !112
 !kernel_arg_addr_space !1 !kernel_arg_access_qual !2 !kernel_arg_type !110
 !kernel_arg_base_type !110 !kernel_arg_type_qual !4 {
   ret void
@@ -1866,9 +1867,30 @@
   ret void
 }
 
+; Make sure the RuntimeHandle is not reported
+; CHECK: - Name:associated_global_not_handle
+; CHECK-NEXT: SymbolName:  'associated_global_not_handle@kd'
+; CHECK-NEXT: Language:OpenCL C
+; 

[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-01-15 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia added inline comments.



Comment at: clang/lib/CodeGen/TargetInfo.cpp:12440
+/// AMDHSAKernelDescriptor.h)
+static llvm::StructType *getKernelDescriptorType(llvm::LLVMContext &C) {
+  llvm::Type *Int8 = llvm::IntegerType::getInt8Ty(C);

Is this AMDGPU-target-specific? If so, perhaps it's better to reflect this in 
the name.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141700/new/

https://reviews.llvm.org/D141700



[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-01-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision.
arsenm added reviewers: yaxunl, t-tye, b-sumner, rampitec, AMDGPU, Anastasia, 
JonChesterfield, jhuber6.
Herald added subscribers: kosarev, foad, kerbowa, hiraditya, tpr, dstuttard, 
jvesely, kzhuravl.
Herald added a project: All.
arsenm requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

The previous implementation wasn't maintaining a faithful IR
representation of how this really works. The value returned by
createEnqueuedBlockKernel wasn't actually used as a function, and was
hacked up later to be a pointer to the runtime handle global
variable. In reality, the enqueued block is a struct where the first
field is a pointer to the kernel descriptor, not the kernel itself. We
were also relying on passing around a reference to a global using a
string attribute containing its name. It's better to base this on a
proper IR symbol reference during final emission.
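
Illustrative sketch (not part of the patch): the layout described above can be modeled as a plain C++ struct whose first field points at a kernel descriptor rather than at the kernel's code. The names `KernelDescriptorStub`, `RuntimeHandle`, and `makeHandle`, as well as the two size fields, are assumptions made for illustration; the real kernel descriptor layout is defined in AMDHSAKernelDescriptor.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the real kernel descriptor. Only its identity
// matters for this sketch, not its contents.
struct KernelDescriptorStub {
  std::uint8_t placeholder;
};

// Assumed shape of a runtime handle: the first field points at the kernel
// descriptor (not at the kernel itself). The two size fields are an
// assumption modeled on the private and group sizes the runtime needs.
struct RuntimeHandle {
  const KernelDescriptorStub *descriptor;
  std::uint32_t private_segment_size;
  std::uint32_t group_segment_size;
};

// The descriptor pointer must sit at offset zero so that a consumer can
// read it from the start of the handle.
static_assert(offsetof(RuntimeHandle, descriptor) == 0,
              "descriptor pointer must be the first field");

// Build a handle with the sizes left at zero, as if filled in later.
inline RuntimeHandle makeHandle(const KernelDescriptorStub &kd) {
  RuntimeHandle h = {&kd, 0, 0};
  return h;
}
```

The point of the model is only that code holding a handle dereferences its first field to find the descriptor; it never treats the handle as a callable function.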

  

This now avoids using a function attribute on kernels and avoids using
the additional "runtime-handle" attribute to populate the final
metadata. Instead, associate the runtime handle reference to the
kernel with the !associated global metadata. We can then get a final,
correctly mangled name at the end.
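
Illustrative sketch (not part of the patch): a toy model of how a metadata emitter might pick up the runtime handle through the kernel's associated global. `ToyGlobal`, `ToyKernel`, and `runtimeHandleName` are hypothetical and are not LLVM classes or APIs. The rule that only a global in the `.amdgpu.kernel.runtime.handle` section counts as a handle is inferred from the test updates in this review, not quoted from the implementation.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for IR objects; these are NOT LLVM classes.
struct ToyGlobal {
  std::string name;
  std::string section; // empty when no explicit section is set
};

struct ToyKernel {
  std::string name;
  const ToyGlobal *associated; // models !associated; may be null
};

// Section name as it appears in the updated tests.
inline const char *runtimeHandleSection() {
  return ".amdgpu.kernel.runtime.handle";
}

// Return the handle's symbol name only when the kernel's associated global
// lives in the special section. An empty string means "nothing to report".
inline std::string runtimeHandleName(const ToyKernel &k) {
  if (k.associated == nullptr)
    return std::string();
  if (k.associated->section != runtimeHandleSection())
    return std::string();
  return k.associated->name;
}
```

Because the name is read from the referenced symbol at emission time, no string attribute carrying the global's name has to be kept in sync with the IR.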

  

I couldn't figure out how to get rename-with-external-symbol behavior
using a combination of comdats and aliases, so this leaves an IR pass to
externalize the runtime handles for codegen. If anything breaks, it's
most likely this, so removing the pass is left for a later step. Use a
special section name to enable this behavior. This also means it's
possible to declare enqueuable kernels in source without going through
the dedicated block syntax or other dedicated compiler support.

  

We could move towards initializing the runtime handle in the
compiler/linker. I have a working patch where the linker sets up the
first field of the handle, avoiding the need to export the block
kernel symbol for the runtime. We would need new relocations to get
the private and group sizes, but that would avoid the runtime's
special case handling that requires the device_enqueue_symbol metadata
field.

  

Handle autoupgrade from the old kernel attribute. Not sure where I
could put the code shared with clang (maybe AMDGPUEmitPrintf could be
renamed to AMDGPUUtils).


https://reviews.llvm.org/D141700

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel-linking.cl
  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/IR/CMakeLists.txt
  llvm/lib/Target/AMDGPU/AMDGPU.h
  llvm/lib/Target/AMDGPU/AMDGPUExportKernelRuntimeHandles.cpp
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
  llvm/lib/Target/AMDGPU/AMDGPUOpenCLEnqueuedBlockLowering.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/CMakeLists.txt
  llvm/test/Bitcode/amdgpu-autoupgrade-enqueued-block.ll
  llvm/test/CodeGen/AMDGPU/amdgpu-export-kernel-runtime-handles.ll
  llvm/test/CodeGen/AMDGPU/enqueue-kernel.ll
  llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full-v3.ll
  llvm/test/CodeGen/AMDGPU/hsa-metadata-from-llvm-ir-full.ll
  llvm/test/CodeGen/AMDGPU/llc-pipeline.ll

Index: llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
===
--- llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -41,7 +41,7 @@
 ; GCN-O0-NEXT:Call Graph SCC Pass Manager
 ; GCN-O0-NEXT:  Inliner for always_inline functions
 ; GCN-O0-NEXT:A No-Op Barrier Pass
-; GCN-O0-NEXT:Lower OpenCL enqueued blocks
+; GCN-O0-NEXT:Externalize enqueued block runtime handles
 ; GCN-O0-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O0-NEXT:FunctionPass Manager
 ; GCN-O0-NEXT:  Expand Atomic instructions
@@ -186,7 +186,7 @@
 ; GCN-O1-NEXT:Call Graph SCC Pass Manager
 ; GCN-O1-NEXT:  Inliner for always_inline functions
 ; GCN-O1-NEXT:A No-Op Barrier Pass
-; GCN-O1-NEXT:Lower OpenCL enqueued blocks
+; GCN-O1-NEXT:Externalize enqueued block runtime handles
 ; GCN-O1-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O1-NEXT:FunctionPass Manager
 ; GCN-O1-NEXT:  Infer address spaces
@@ -454,7 +454,7 @@
 ; GCN-O1-OPTS-NEXT:Call Graph SCC Pass Manager
 ; GCN-O1-OPTS-NEXT:  Inliner for always_inline functions
 ; GCN-O1-OPTS-NEXT:A No-Op Barrier Pass
-; GCN-O1-OPTS-NEXT:Lower OpenCL enqueued blocks
+; GCN-O1-OPTS-NEXT:Externalize enqueued block runtime handles
 ; GCN-O1-OPTS-NEXT:Lower uses of LDS variables from non-kernel functions
 ; GCN-O1-OPTS-NEXT:FunctionPass Manager
 ; GCN-O1-OPTS-NEXT:  Infer address spaces
@@ -754,7 +754,7 @@
 ; GCN-O2-NEXT:Call Graph SCC Pass Manager
 ; GCN-O2-NEXT:  Inliner for always_inline functions
 ; GCN-O2-NEXT:A No-Op Barrier Pass
-;