Hi Andrew,
this patch adds support for gfx90c GCN5 APU integrated graphics devices.
The LLVM AMDGPU documentation (https://llvm.org/docs/AMDGPUUsage.html)
lists those devices as unsupported by rocm-amdhsa.
As we discussed elsewhere, I have tested the patch on my AMD Ryzen 5
5500U, including with different xnack settings, and it passes most
libgomp offloading tests.
Although those APUs are very constrained compared to dGPUs, I think
they might be interesting for learning, experimentation, and testing.
Can I commit the patch to the master branch?
Best regards,
Frederik
From 809e2a0248e6fad1e8336b4a883a729017cc62e5 Mon Sep 17 00:00:00 2001
From: Frederik Harwath
Date: Wed, 24 Apr 2024 20:29:14 +0200
Subject: [PATCH] amdgcn: Add gfx90c target
Add support for gfx90c GCN5 APU integrated graphics devices.
The LLVM AMDGPU documentation does not list those devices as supported
by rocm-amdhsa, but it passes most libgomp offloading tests.
Although they are constrained compared to dGPUs, they might be
interesting for learning, experimentation, and testing.
gcc/ChangeLog:
* config.gcc: Add gfx90c.
* config/gcn/gcn-hsa.h (NO_SRAM_ECC): Likewise.
* config/gcn/gcn-opts.h (enum processor_type): Likewise.
(TARGET_GFX90c): New macro.
* config/gcn/gcn.cc (gcn_option_override): Handle gfx90c.
(gcn_omp_device_kind_arch_isa): Likewise.
(output_file_start): Likewise.
* config/gcn/gcn.h: Add gfx90c.
* config/gcn/gcn.opt: Likewise.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX90c): New macro.
(get_arch): Handle gfx90c.
(main): Handle EF_AMDGPU_MACH_AMDGCN_GFX90c.
* config/gcn/t-omp-device: Add gfx90c.
* doc/install.texi: Likewise.
* doc/invoke.texi: Likewise.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (isa_hsa_name): Handle EF_AMDGPU_MACH_AMDGCN_GFX90c.
(isa_code): Handle gfx90c.
(max_isa_vgprs): Handle EF_AMDGPU_MACH_AMDGCN_GFX90c.
Signed-off-by: Frederik Harwath
---
gcc/config.gcc              | 4 ++--
gcc/config/gcn/gcn-hsa.h    | 2 +-
gcc/config/gcn/gcn-opts.h   | 2 ++
gcc/config/gcn/gcn.cc       | 8 ++++++++
gcc/config/gcn/gcn.h        | 2 ++
gcc/config/gcn/gcn.opt      | 3 +++
gcc/config/gcn/mkoffload.cc | 9 +++++++++
gcc/config/gcn/t-omp-device | 2 +-
gcc/doc/install.texi        | 4 ++--
gcc/doc/invoke.texi         | 3 +++
libgomp/plugin/plugin-gcn.c | 9 +++++++++
11 files changed, 42 insertions(+), 6 deletions(-)
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 5df3c52f8e9..1bf07b6eece 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4569,7 +4569,7 @@ case "${target}" in
for which in arch tune; do
eval "val=\$with_$which"
case ${val} in
- "" | fiji | gfx900 | gfx906 | gfx908 | gfx90a | gfx1030 | gfx1036 | gfx1100 | gfx1103)
+ "" | fiji | gfx900 | gfx906 | gfx908 | gfx90a | gfx90c | gfx1030 | gfx1036 | gfx1100 | gfx1103)
# OK
;;
*)
@@ -4585,7 +4585,7 @@ case "${target}" in
TM_MULTILIB_CONFIG=
;;
xdefault | xyes)
- TM_MULTILIB_CONFIG=`echo "gfx900,gfx906,gfx908,gfx90a,gfx1030,gfx1036,gfx1100,gfx1103" | sed "s/${with_arch},\?//;s/,$//"`
+ TM_MULTILIB_CONFIG=`echo "gfx900,gfx906,gfx908,gfx90a,gfx90c,gfx1030,gfx1036,gfx1100,gfx1103" | sed "s/${with_arch},\?//;s/,$//"`
;;
*)
TM_MULTILIB_CONFIG="${with_multilib_list}"
diff --git a/gcc/config/gcn/gcn-hsa.h b/gcc/config/gcn/gcn-hsa.h
index 7d6e3141cea..4611bc55392 100644
--- a/gcc/config/gcn/gcn-hsa.h
+++ b/gcc/config/gcn/gcn-hsa.h
@@ -93,7 +93,7 @@ extern unsigned int gcn_local_sym_hash (const char *name);
#define NO_XNACK "march=fiji:;march=gfx1030:;march=gfx1036:;march=gfx1100:;march=gfx1103:;" \
/* These match the defaults set in gcn.cc. */ \
"!mxnack*|mxnack=default:%{march=gfx900|march=gfx906|march=gfx908:-mattr=-xnack};"
-#define NO_SRAM_ECC "!march=*:;march=fiji:;march=gfx900:;march=gfx906:;"
+#define NO_SRAM_ECC "!march=*:;march=fiji:;march=gfx900:;march=gfx906:;march=gfx90c:;"
/* In HSACOv4 no attribute setting means the binary supports "any" hardware
configuration. The name of the attribute also changed. */
diff --git a/gcc/config/gcn/gcn-opts.h b/gcc/config/gcn/gcn-opts.h
index 49099bad7e7..1091035a69a 100644
--- a/gcc/config/gcn/gcn-opts.h
+++ b/gcc/config/gcn/gcn-opts.h
@@ -25,6 +25,7 @@ enum processor_type
PROCESSOR_VEGA20, // gfx906
PROCESSOR_GFX908,
PROCESSOR_GFX90a,
+ PROCESSOR_GFX90c,
PROCESSOR_GFX1030,
PROCESSOR_GFX1036,
PROCESSOR_GFX1100,
@@ -36,6 +37,7 @@ enum processor_type
#define TARGET_VEGA20 (gcn_arch == PROCESSOR_VEGA20)
#define TARGET_GFX908 (gcn_arch == PROCESSOR_GFX908)
#define TARGET_GFX90a (gcn_arch == PROCESSOR_GFX90a)
+#define TARGET_GFX90c (gcn_arch == PROCESSOR_GFX90c)
#define TARGET_GFX1030 (gcn_arch == PROCESSOR_GFX1030)
#define TARGET_GFX1036 (gcn_arch == PROCESSOR_GFX1036)
#define TARGET_GFX1100 (gcn_arch == PROCESSOR_GFX1