[PATCH] D99235: [HIP] Change to code object v4

2021-05-18 Thread Greg Rodgers via Phabricator via cfe-commits
gregrodgers added inline comments.



Comment at: clang/lib/Driver/ToolChains/HIP.cpp:116
+  if (getOrCheckAMDGPUCodeObjectVersion(C.getDriver(), Args) >= 4)
+OffloadKind = OffloadKind + "v4";
   for (const auto &II : Inputs) {

yaxunl wrote:
> tra wrote:
> > We do not do it for v2/v3. Could you elaborate on what makes v4 special 
> > that it needs its own offload kind? 
> > 
> > Will you need to target different object versions simultaneously?
> > If yes, how? AFAICT, the version specified is currently global and applies 
> > to all sub-compilations.
> > If not, then do we really need to encode the version in the offload target 
> > name?
> Introducing hipv4 is to differentiate with code object version 2 and 3 which 
> are used by HIP applications compiled by older version of clang. ROCm 
> platform is required to keep binary backward compatibility, i.e., old HIP 
> applications built by ROCm 4.0 should run on ROCm 4.1. The bundle ID has 
> different interpretation depending on whether it is version 2/3 or version 4, 
> e.g. 'gfx906' implies xnack and sramecc off with code object v2/3 but implies 
> xnack and sramecc ANY with v4. Since code object version 2/3 uses 'hip', code 
> object version 4 needs to be different, therefore it uses 'hipv4'.
We need to start thinking in terms of offload requirements of a compiled image 
vs the capabilities of a particular active runtime on a particular GPU.   This 
concept can eliminate the need for a new offload kind.  For AMD, we would add 
the requirement of code object v4 (cov4) if built for code object v4 or 
greater.This means it can only run on a system with that capability.  This 
concept works well with requirements xnack+, xnack-, sramecc+ and sramecc-.
The bundle entry id is the offload-kind, the triple, and the list of image 
requirements.  The gpu type (offload-arch) is really an image requirement.  

In this model, there is no requirement for xnack-any.  The lack of the xnack+ 
or xnack- requirement implies "any" which means it can run on any capable 
machine.  

This is a general model that is extensible.   To make this work, a runtime must 
be able to detect the capabilities for any requirement that could be tagged on 
an image.  In fact, every requirement of an embedded image must have its 
capability detected by the runtime for that offload image to be usable.   
However, a system's runtime could have more capabilities than the requirements 
of an image.   So in the case of xnack, the lack of xnack- or xnack+ will be 
acceptable no matter what the xnack capability of the runtime is.   If the 
compiler driver puts the requirement cov4 in the bundle entry id requirements 
field the runtime will not run that image unless the GPU loader supports v4 or 
greater. 

The clang driver can create the requirement xnack- for code object < 4 on those 
GPUs that support either xnack mode.   This will ensure  the image will 
gracefully fail or use an alternative image if the runtime capability is xnack+.

But the cov4 requirement is mostly unrelated to xnack .  It is about the 
capability of the GPU loader.  If the code object version >= 4, then it will be 
tagged with the cov4 requirement.   This would prevent an old system that does 
not have a newer software stack from running an image with a cov4 requirement. 

This general notion of image requirements and runtime capabilities is 
extensible to other offload architectures.   Suppose cuda version 12 
compilation REQUIRES that a cuda version 12 runtime.   Old runtimes would never 
display cuv12 capability and would fail to run any image created with the 
requirement cuv12.
 






Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99235: [HIP] Change to code object v4

2021-04-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done.
yaxunl added inline comments.



Comment at: clang/test/Driver/hip-code-object-version.hip:24-39
 // Check bundle ID for code object v2.
 
 // RUN: %clang -### -target x86_64-linux-gnu \
 // RUN:   -mno-code-object-v3 \
 // RUN:   --offload-arch=gfx906 -nogpulib \
 // RUN:   %s 2>&1 | FileCheck -check-prefixes=V2,V2-WARN %s
 

tra wrote:
> Nit: it would be nice to move V2 tests above the V3, so the tests are in 
> order. 
sorry I missed it. will do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99235: [HIP] Change to code object v4

2021-04-06 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
yaxunl marked an inline comment as done.
Closed by commit rG4fd05e0ad7fb: [HIP] Change to code object v4 (authored by 
yaxunl).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

Files:
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIP.cpp
  clang/test/Driver/hip-code-object-version.hip
  clang/test/Driver/hip-target-id.hip
  clang/test/Driver/hip-toolchain-device-only.hip
  clang/test/Driver/hip-toolchain-no-rdc.hip
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip
  clang/test/Driver/hip-toolchain-rdc.hip
  clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -1143,6 +1143,7 @@
  .Case("host", true)
  .Case("openmp", true)
  .Case("hip", true)
+ .Case("hipv4", true)
  .Default(false);
 
 bool TripleIsValid = !Triple.empty();
Index: clang/test/Driver/hip-toolchain-rdc.hip
===
--- clang/test/Driver/hip-toolchain-rdc.hip
+++ clang/test/Driver/hip-toolchain-rdc.hip
@@ -97,7 +97,7 @@
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // CHECK-SAME: "-bundle-align=4096"
-// CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // CHECK-SAME: "-inputs={{.*}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.*hipfb]]"
 
 // CHECK: [[MC:".*llvm-mc.*"]] "-o" [[OBJBUNDLE:".*o"]] "{{.*}}.mcin" "--filetype=obj"
Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -85,7 +85,7 @@
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
-// CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // CHECK-SAME: "-inputs={{.*}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.*hipfb]]"
 
 // CHECK: [[MC:".*llvm-mc.*"]] "-o" [[OBJBUNDLE:".*o"]] "{{.*}}.mcin" "--filetype=obj"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -125,7 +125,7 @@
 // LINK-SAME: "-o" "[[IMG_DEV2:.*.out]]" "[[A_BC2]]" "[[B_BC2]]"
 
 // LINK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
-// LINK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// LINK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // LINK-SAME: "-inputs={{.*}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.*hipfb]]"
 
 // LINK: {{".*llvm-mc.*"}} "-o" "[[OBJBUNDLE:.*o]]" "{{.*}}.mcin" "--filetype=obj"
Index: clang/test/Driver/hip-toolchain-no-rdc.hip
===
--- clang/test/Driver/hip-toolchain-no-rdc.hip
+++ clang/test/Driver/hip-toolchain-no-rdc.hip
@@ -82,7 +82,7 @@
 
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // CHECK-SAME: "-bundle-align=4096"
-// CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // CHECK-SAME: "-inputs={{.*}},[[IMG_DEV_A_803]],[[IMG_DEV_A_900]]" "-outputs=[[BUNDLE_A:.*hipfb]]"
 
 // CHECK: [[CLANG]] "-cc1" "-triple" "x86_64-unknown-linux-gnu"
@@ -145,7 +145,7 @@
 
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // CHECK-SAME: "-bundle-align=4096"
-// CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // CHECK-SAME: "-inputs={{.*}},[[IMG_DEV_B_803]],[[IMG_DEV_B_900]]" "-outputs=[[BUNDLE_A:.*hipfb]]"
 
 // CHECK: [[CLANG]] "-cc1" "-triple" "x86_64-unknown-linux-gnu"
Index: clang/test/Driver/hip-toolchain-device-only.hip
===
--- clang/test/Driver

[PATCH] D99235: [HIP] Change to code object v4

2021-04-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

Still LGTM.




Comment at: clang/test/Driver/hip-code-object-version.hip:24-39
 // Check bundle ID for code object v2.
 
 // RUN: %clang -### -target x86_64-linux-gnu \
 // RUN:   -mno-code-object-v3 \
 // RUN:   --offload-arch=gfx906 -nogpulib \
 // RUN:   %s 2>&1 | FileCheck -check-prefixes=V2,V2-WARN %s
 

Nit: it would be nice to move V2 tests above the V3, so the tests are in order. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99235: [HIP] Change to code object v4

2021-04-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done.
yaxunl added inline comments.



Comment at: clang/lib/Driver/ToolChains/HIP.cpp:115
   std::string OffloadKind = "hip";
+  if (getOrCheckAMDGPUCodeObjectVersion(C.getDriver(), Args) >= 4)
+OffloadKind = OffloadKind + "v4";

tra wrote:
> Should it be an error if we pass `-mcode-object-version=99` ?
Yes we diagnose that.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99235: [HIP] Change to code object v4

2021-04-06 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision.
tra added inline comments.
This revision is now accepted and ready to land.



Comment at: clang/lib/Driver/ToolChains/HIP.cpp:115
   std::string OffloadKind = "hip";
+  if (getOrCheckAMDGPUCodeObjectVersion(C.getDriver(), Args) >= 4)
+OffloadKind = OffloadKind + "v4";

Should it be an error if we pass `-mcode-object-version=99` ?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99235: [HIP] Change to code object v4

2021-04-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done.
yaxunl added a comment.

ping.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99235: [HIP] Change to code object v4

2021-03-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done.
yaxunl added inline comments.



Comment at: clang/lib/Driver/ToolChains/HIP.cpp:116
+  if (getOrCheckAMDGPUCodeObjectVersion(C.getDriver(), Args) >= 4)
+OffloadKind = OffloadKind + "v4";
   for (const auto &II : Inputs) {

tra wrote:
> We do not do it for v2/v3. Could you elaborate on what makes v4 special that 
> it needs its own offload kind? 
> 
> Will you need to target different object versions simultaneously?
> If yes, how? AFAICT, the version specified is currently global and applies to 
> all sub-compilations.
> If not, then do we really need to encode the version in the offload target 
> name?
Introducing hipv4 is to differentiate with code object version 2 and 3 which 
are used by HIP applications compiled by older version of clang. ROCm platform 
is required to keep binary backward compatibility, i.e., old HIP applications 
built by ROCm 4.0 should run on ROCm 4.1. The bundle ID has different 
interpretation depending on whether it is version 2/3 or version 4, e.g. 
'gfx906' implies xnack and sramecc off with code object v2/3 but implies xnack 
and sramecc ANY with v4. Since code object version 2/3 uses 'hip', code object 
version 4 needs to be different, therefore it uses 'hipv4'.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99235: [HIP] Change to code object v4

2021-03-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments.



Comment at: clang/lib/Driver/ToolChains/HIP.cpp:116
+  if (getOrCheckAMDGPUCodeObjectVersion(C.getDriver(), Args) >= 4)
+OffloadKind = OffloadKind + "v4";
   for (const auto &II : Inputs) {

We do not do it for v2/v3. Could you elaborate on what makes v4 special that it 
needs its own offload kind? 

Will you need to target different object versions simultaneously?
If yes, how? AFAICT, the version specified is currently global and applies to 
all sub-compilations.
If not, then do we really need to encode the version in the offload target name?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99235/new/

https://reviews.llvm.org/D99235

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99235: [HIP] Change to code object v4

2021-03-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision.
yaxunl added a reviewer: tra.
yaxunl requested review of this revision.

Change to code object v4 by default to match ROCm 4.1.

This is upstream of code object v4 support from amd-stg-open branch.


https://reviews.llvm.org/D99235

Files:
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  clang/lib/Driver/ToolChains/HIP.cpp
  clang/test/Driver/hip-code-object-version.hip
  clang/test/Driver/hip-target-id.hip
  clang/test/Driver/hip-toolchain-device-only.hip
  clang/test/Driver/hip-toolchain-no-rdc.hip
  clang/test/Driver/hip-toolchain-rdc-separate.hip
  clang/test/Driver/hip-toolchain-rdc-static-lib.hip
  clang/test/Driver/hip-toolchain-rdc.hip
  clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

Index: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
===
--- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
+++ clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
@@ -1143,6 +1143,7 @@
  .Case("host", true)
  .Case("openmp", true)
  .Case("hip", true)
+ .Case("hipv4", true)
  .Default(false);
 
 bool TripleIsValid = !Triple.empty();
Index: clang/test/Driver/hip-toolchain-rdc.hip
===
--- clang/test/Driver/hip-toolchain-rdc.hip
+++ clang/test/Driver/hip-toolchain-rdc.hip
@@ -97,7 +97,7 @@
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // CHECK-SAME: "-bundle-align=4096"
-// CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // CHECK-SAME: "-inputs={{.*}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.*hipfb]]"
 
 // CHECK: [[MC:".*llvm-mc.*"]] "-o" [[OBJBUNDLE:".*o"]] "{{.*}}.mcin" "--filetype=obj"
Index: clang/test/Driver/hip-toolchain-rdc-static-lib.hip
===
--- clang/test/Driver/hip-toolchain-rdc-static-lib.hip
+++ clang/test/Driver/hip-toolchain-rdc-static-lib.hip
@@ -85,7 +85,7 @@
 
 // combine images generated into hip fat binary object
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
-// CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // CHECK-SAME: "-inputs={{.*}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.*hipfb]]"
 
 // CHECK: [[MC:".*llvm-mc.*"]] "-o" [[OBJBUNDLE:".*o"]] "{{.*}}.mcin" "--filetype=obj"
Index: clang/test/Driver/hip-toolchain-rdc-separate.hip
===
--- clang/test/Driver/hip-toolchain-rdc-separate.hip
+++ clang/test/Driver/hip-toolchain-rdc-separate.hip
@@ -125,7 +125,7 @@
 // LINK-SAME: "-o" "[[IMG_DEV2:.*.out]]" "[[A_BC2]]" "[[B_BC2]]"
 
 // LINK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
-// LINK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// LINK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // LINK-SAME: "-inputs={{.*}},[[IMG_DEV1]],[[IMG_DEV2]]" "-outputs=[[BUNDLE:.*hipfb]]"
 
 // LINK: {{".*llvm-mc.*"}} "-o" "[[OBJBUNDLE:.*o]]" "{{.*}}.mcin" "--filetype=obj"
Index: clang/test/Driver/hip-toolchain-no-rdc.hip
===
--- clang/test/Driver/hip-toolchain-no-rdc.hip
+++ clang/test/Driver/hip-toolchain-no-rdc.hip
@@ -82,7 +82,7 @@
 
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // CHECK-SAME: "-bundle-align=4096"
-// CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // CHECK-SAME: "-inputs={{.*}},[[IMG_DEV_A_803]],[[IMG_DEV_A_900]]" "-outputs=[[BUNDLE_A:.*hipfb]]"
 
 // CHECK: [[CLANG]] "-cc1" "-triple" "x86_64-unknown-linux-gnu"
@@ -145,7 +145,7 @@
 
 // CHECK: [[BUNDLER:".*clang-offload-bundler"]] "-type=o"
 // CHECK-SAME: "-bundle-align=4096"
-// CHECK-SAME: "-targets={{.*}},hip-amdgcn-amd-amdhsa--gfx803,hip-amdgcn-amd-amdhsa--gfx900"
+// CHECK-SAME: "-targets={{.*}},hipv4-amdgcn-amd-amdhsa--gfx803,hipv4-amdgcn-amd-amdhsa--gfx900"
 // CHECK-SAME: "-inputs={{.*}},[[IMG_DEV_B_803]],[[IMG_DEV_B_900]]" "-outputs=[[BUNDLE_A:.*hipfb]]"
 
 // CHECK: [[CLANG]] "-cc1" "-triple" "x86_64-unknown-linux-gnu"
Index: clang/test/Driver/hip-toolchain-device-only.hip
===
--- clang/test/Driver/hip-toolchain-device-only.hip
+++ clang/test/Driver/hip-toolchain-device-only.hip
@@ -25,5 +25,5 @@
 // CHEC