Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-24 Thread Tom de Vries via Gcc-patches

On 2/22/22 17:03, Tobias Burnus wrote:

Hi Tom,

On 22.02.22 15:43, Tom de Vries wrote:

On 2/17/22 18:24, Tobias Burnus wrote:

--- a/gcc/config/nvptx/t-omp-device
+++ b/gcc/config/nvptx/t-omp-device
@@ -1,4 +1,4 @@
 echo kind: gpu > $@
 echo arch: nvptx >> $@
-    echo isa: sm_30 sm_35 >> $@
+    echo isa: sm_30 sm_35 sm_53 sm_70 sm_75 sm_80 >> $@


I'm not sure I understand how this is used.  Is this user-visible?  Is
there a libgomp test-case where we can observe a difference?


That's used for OpenMP context selectors like; that way, one can generate,
e.g. one code used with nvptx and one with gcn as with:

#pragma omp declare variant (on_nvptx) 
match(construct={target},device={arch(nvptx)})
#pragma omp declare variant (on_gcn) 
match(construct={target},device={arch(gcn)})

...
   #pragma omp target map(from:v)
   v = on ();
which then either calls 'on' or 'on_nvptx' or 'on_gcn'
(from libgomp/testsuite/libgomp.c/target-42.c)


The following testcases use 'arch(nvptx)':

libgomp/testsuite/libgomp.c-c++-common/on_device_arch.h
libgomp/testsuite/libgomp.c/target-42.c
libgomp/testsuite/libgomp.c/usleep.h
libgomp/testsuite/libgomp.fortran/declare-variant-1.f90

For ISA, there is only one run-time test:

libgomp/testsuite/libgomp.c/declare-variant-1.c

but only for x86-64: match (device={isa("avx512f")})

The sm_35 also appears, but only in the compile-time tests:
gcc/testsuite/{c-c++-common,gfortran.dg}/gomp/declare-variant-{9,10}.*



Thanks for the explanation.

I've updated the patch to include changes to 
nvptx_omp_device_kind_arch_isa, and committed.


I'll try to submit a patch with one or more test-cases.

Thanks,
- Tom

[nvptx] Add missing t-omp-device isas

In t-omp-device we list isas that can be used in omp declare variant like so:
...
  #pragma omp declare variant (f30) match (device={isa("sm_30")})
...
and in nvptx_omp_device_kind_arch_isa we handle them.

Update both to reflect the current list of isas.

Tested on x86_64-linux with nvptx accelerator.

gcc/ChangeLog:

2022-02-23  Tom de Vries  

	* config/nvptx/nvptx.cc (nvptx_omp_device_kind_arch_isa): Handle
	sm_70, sm_75 and sm_80.
	* config/nvptx/t-omp-device: Add sm_53, sm_70, sm_75 and sm_80.

Co-Authored-By: Tobias Burnus 

---
 gcc/config/nvptx/nvptx.cc | 8 +++-
 gcc/config/nvptx/t-omp-device | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 6f6d592e462..b9451c2ed09 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -6181,7 +6181,13 @@ nvptx_omp_device_kind_arch_isa (enum omp_device_kind_arch_isa trait,
   if (strcmp (name, "sm_35") == 0)
 	return TARGET_SM35 && !TARGET_SM53;
   if (strcmp (name, "sm_53") == 0)
-	return TARGET_SM53;
+	return TARGET_SM53 && !TARGET_SM70;
+  if (strcmp (name, "sm_70") == 0)
+	return TARGET_SM70 && !TARGET_SM75;
+  if (strcmp (name, "sm_75") == 0)
+	return TARGET_SM75 && !TARGET_SM80;
+  if (strcmp (name, "sm_80") == 0)
+	return TARGET_SM80;
   return 0;
 default:
   gcc_unreachable ();
diff --git a/gcc/config/nvptx/t-omp-device b/gcc/config/nvptx/t-omp-device
index 8765d9f1881..4228218a424 100644
--- a/gcc/config/nvptx/t-omp-device
+++ b/gcc/config/nvptx/t-omp-device
@@ -1,4 +1,4 @@
 omp-device-properties-nvptx: $(srcdir)/config/nvptx/nvptx.cc
 	echo kind: gpu > $@
 	echo arch: nvptx >> $@
-	echo isa: sm_30 sm_35 >> $@
+	echo isa: sm_30 sm_35 sm_53 sm_70 sm_75 sm_80 >> $@


Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-22 Thread Tobias Burnus

Hi Tom,

On 22.02.22 15:43, Tom de Vries wrote:

On 2/17/22 18:24, Tobias Burnus wrote:

--- a/gcc/config/nvptx/t-omp-device
+++ b/gcc/config/nvptx/t-omp-device
@@ -1,4 +1,4 @@
 echo kind: gpu > $@
 echo arch: nvptx >> $@
-echo isa: sm_30 sm_35 >> $@
+echo isa: sm_30 sm_35 sm_53 sm_70 sm_75 sm_80 >> $@


I'm not sure I understand how this is used.  Is this user-visible?  Is
there a libgomp test-case where we can observe a difference?


That's used for OpenMP context selectors like; that way, one can generate,
e.g. one code used with nvptx and one with gcn as with:

#pragma omp declare variant (on_nvptx) 
match(construct={target},device={arch(nvptx)})
#pragma omp declare variant (on_gcn) 
match(construct={target},device={arch(gcn)})
...
  #pragma omp target map(from:v)
  v = on ();
which then either calls 'on' or 'on_nvptx' or 'on_gcn'
(from libgomp/testsuite/libgomp.c/target-42.c)


The following testcases use 'arch(nvptx)':

libgomp/testsuite/libgomp.c-c++-common/on_device_arch.h
libgomp/testsuite/libgomp.c/target-42.c
libgomp/testsuite/libgomp.c/usleep.h
libgomp/testsuite/libgomp.fortran/declare-variant-1.f90

For ISA, there is only one run-time test:

libgomp/testsuite/libgomp.c/declare-variant-1.c

but only for x86-64: match (device={isa("avx512f")})

The sm_35 also appears, but only in the compile-time tests:
gcc/testsuite/{c-c++-common,gfortran.dg}/gomp/declare-variant-{9,10}.*

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-22 Thread Tom de Vries via Gcc-patches

On 2/17/22 18:24, Tobias Burnus wrote:

diff --git a/gcc/config/nvptx/t-omp-device b/gcc/config/nvptx/t-omp-device
index 8765d9f1881..4228218a424 100644
--- a/gcc/config/nvptx/t-omp-device
+++ b/gcc/config/nvptx/t-omp-device
@@ -1,4 +1,4 @@
 omp-device-properties-nvptx: $(srcdir)/config/nvptx/nvptx.cc
echo kind: gpu > $@
echo arch: nvptx >> $@
-   echo isa: sm_30 sm_35 >> $@
+   echo isa: sm_30 sm_35 sm_53 sm_70 sm_75 sm_80 >> $@


I'm not sure I understand how this is used.  Is this user-visible?  Is 
there a libgomp test-case where we can observe a difference?


Thanks,
- Tom


Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-22 Thread Tom de Vries via Gcc-patches

On 2/17/22 18:24, Tobias Burnus wrote:

SM version (-misa=)
[Patch adds -misa=sm_70]

* The compiler supports internally: SM_30, SM_35, SM_53, SM_70, SM_75, 
SM_80.


I'd formulate it like: it uses SM_70 internally to accurately formulate 
when certain insns can be used.



I think it makes sense to have sm_70 in addition:
* The current code actually does generate different code for >= sm_70
   already.


Agreed.

I've committed this (with a somewhat shorter commit log), and a 
test-case update.


Thanks,
- Tomnvptx: Add -misa=sm_70

Add -misa=sm_70, and use it to specify the misa value in test-case
gcc.target/nvptx/atomic-store-2.c.

Tested on nvptx.

gcc/ChangeLog:

	* config/nvptx/nvptx-c.cc (nvptx_cpu_cpp_builtins): Handle SM70.
	* config/nvptx/nvptx.cc (first_ptx_version_supporting_sm):
	Likewise.
	* config/nvptx/nvptx.opt (misa): Add sm_70 alias PTX_ISA_SM70.

gcc/testsuite/ChangeLog:

2022-02-22  Tom de Vries  

	* gcc.target/nvptx/atomic-store-2.c: Use -misa=sm_70.
	* gcc.target/nvptx/uniform-simt-3.c: Same.

Co-Authored-By: Tom de Vries 

---
 gcc/config/nvptx/nvptx-c.cc | 2 ++
 gcc/config/nvptx/nvptx.cc   | 2 ++
 gcc/config/nvptx/nvptx.opt  | 3 +++
 gcc/testsuite/gcc.target/nvptx/atomic-store-2.c | 2 +-
 gcc/testsuite/gcc.target/nvptx/uniform-simt-3.c | 2 +-
 5 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/config/nvptx/nvptx-c.cc b/gcc/config/nvptx/nvptx-c.cc
index d68b9910d7e..b2375fb5b16 100644
--- a/gcc/config/nvptx/nvptx-c.cc
+++ b/gcc/config/nvptx/nvptx-c.cc
@@ -43,6 +43,8 @@ nvptx_cpu_cpp_builtins (void)
 cpp_define (parse_in, "__PTX_SM__=800");
   else if (TARGET_SM75)
 cpp_define (parse_in, "__PTX_SM__=750");
+  else if (TARGET_SM70)
+cpp_define (parse_in, "__PTX_SM__=700");
   else if (TARGET_SM53)
 cpp_define (parse_in, "__PTX_SM__=530");
   else if (TARGET_SM35)
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 981b91f7095..858789e6df7 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -217,6 +217,8 @@ first_ptx_version_supporting_sm (enum ptx_isa sm)
   return PTX_VERSION_3_1;
 case PTX_ISA_SM53:
   return PTX_VERSION_4_2;
+case PTX_ISA_SM70:
+  return PTX_VERSION_6_0;
 case PTX_ISA_SM75:
   return PTX_VERSION_6_3;
 case PTX_ISA_SM80:
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index 97e127cc4fb..9776c3b9a1f 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -64,6 +64,9 @@ Enum(ptx_isa) String(sm_35) Value(PTX_ISA_SM35)
 EnumValue
 Enum(ptx_isa) String(sm_53) Value(PTX_ISA_SM53)
 
+EnumValue
+Enum(ptx_isa) String(sm_70) Value(PTX_ISA_SM70)
+
 EnumValue
 Enum(ptx_isa) String(sm_75) Value(PTX_ISA_SM75)
 
diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c b/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c
index cd5e4c38267..b58f33f2abd 100644
--- a/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c
@@ -2,7 +2,7 @@
shared state space.  */
 
 /* { dg-do compile } */
-/* { dg-options "-misa=sm_75" } */
+/* { dg-options "-misa=sm_70" } */
 
 enum memmodel
 {
diff --git a/gcc/testsuite/gcc.target/nvptx/uniform-simt-3.c b/gcc/testsuite/gcc.target/nvptx/uniform-simt-3.c
index 532fa825161..b61b8ba9d5b 100644
--- a/gcc/testsuite/gcc.target/nvptx/uniform-simt-3.c
+++ b/gcc/testsuite/gcc.target/nvptx/uniform-simt-3.c
@@ -1,4 +1,4 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -muniform-simt -misa=sm_75" } */
+/* { dg-options "-O2 -muniform-simt -misa=sm_70" } */
 
 #include "atomic-store-2.c"


Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-22 Thread Tom de Vries via Gcc-patches

On 2/17/22 18:24, Tobias Burnus wrote:

PTX version (-mptx=)
[patch adds -mptx=6.0 as option]

* Currently supported internally are 3.1 (CUDA 5.0, used by GCC <= 11),
   6.0 (CUDA 9.0, current GCC 12 default), 6.3 (CUDA 10.0), 7.0 (CUDA 11.0)
* -mptx= supports 3.1, 6.3, 7.0 – but not the internal default 6.0



I tend not to think in terms of CUDA versions, but supported driver 
versions.


In the end, drivers are used to translate ptx to SASS for execution, 
CUDA is just used for build time verification (or not, if it's not in 
the path).


And a driver may or may not be supported.  F.i. 390.x still may receive 
updates from nvidia, but there are JIT bugs that we've reported that 
they've decided not to fix, so from that point of view 390.x is unsupported.



I think it makes sense to expose the 6.0 value to the user and not
only use it internally behind the scenes. As it is already used internally,
the change is tiny but user visible. 


Sure, I've committed this (with a somewhat shorter commit log).


Thus, it has to stay when we will
bump the default in later GCC versions; on the other hand, if we bump
the default, it might be also a good reason to have it to permit the
user to have a backward compatible PTX output for linking libraries.



FWIW, I think that it's possible to link different versions of ptx isa 
together (though perhaps there are specific scenarios where that's not 
possible, I'm not sure).  But mixing versions restricts the range of 
drivers you can use, so it may make sense to just use one version.


Thanks,
- Tomnvptx: Add -mptx=6.0

Currently supported internally are 3.1, 6.0, 6.3 and 7.0.

However, -mptx= supports 3.1, 6.3, 7.0 – but not the internal default 6.0.

Add -mptx=6.0 for consistency.

Tested on nvptx.

gcc/ChangeLog:

	* config/nvptx/nvptx.opt (mptx): Add 6.0 alias PTX_VERSION_6_0.
	* doc/invoke.texi (-mptx): Update for new values and defaults.

Co-Authored-By: Tom de Vries 

---
 gcc/config/nvptx/nvptx.opt | 3 +++
 gcc/doc/invoke.texi| 7 ---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index e56ec9288da..97e127cc4fb 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -82,6 +82,9 @@ Known PTX versions (for use with the -mptx= option):
 EnumValue
 Enum(ptx_version) String(3.1) Value(PTX_VERSION_3_1)
 
+EnumValue
+Enum(ptx_version) String(6.0) Value(PTX_VERSION_6_0)
+
 EnumValue
 Enum(ptx_version) String(6.3) Value(PTX_VERSION_6_3)
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 635c5f79278..56f3a01de44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -27286,9 +27286,10 @@ strings must be lower-case.  Valid ISA strings include @samp{sm_30} and
 
 @item -mptx=@var{version-string}
 @opindex mptx
-Generate code for given the specified PTX version (e.g.@: @samp{6.3}).
-Valid version strings include @samp{3.1} and @samp{6.3}.  The default PTX
-version is 3.1.
+Generate code for given the specified PTX version (e.g.@: @samp{7.0}).
+Valid version strings include @samp{3.1}, @samp{6.0}, @samp{6.3}, and
+@samp{7.0}.  The default PTX version is 6.0, unless a higher minimal
+version is required for specified PTX ISA via option @option{-misa=}.
 
 @item -mmainkernel
 @opindex mmainkernel


[Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-17 Thread Tobias Burnus

This patch exposes two -m* option values which are already
internally available. I think it makes sense to expose them
explicitly to the user (see below), but there are also arguments
against. Thoughts?


PTX version (-mptx=)
[patch adds -mptx=6.0 as option]

* Currently supported internally are 3.1 (CUDA 5.0, used by GCC <= 11),
  6.0 (CUDA 9.0, current GCC 12 default), 6.3 (CUDA 10.0), 7.0 (CUDA 11.0)
* -mptx= supports 3.1, 6.3, 7.0 – but not the internal default 6.0

First, I think all versions make sense:
* 3.1 is the previous default and permits running with older CUDA (if need)
* 6.0 is for CUDA 9 - and if we want to support it, it has to stay.
  6.0 is the default since commit
  https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590007.html
* 6.3 is CUDA 10.0. In that PTX version, a lot of nice features
  were added like .alias
* 7.0 is CUDA 11.0. This adds support for sm_80 (honored in code gen).

PTX >= 6.0 makes sense as it permits newer sm_* (in particular: sm_53 and sm_70)
and
+  /* Pick at least 6.0, to enable using bar.warp.sync to have a way to force
+ warp convergence.  */
On the other hand, for older systems, CUDA 10.0 might be too new and we still
want to support CUDA 9. (At least that's how I understood one of nvpx gcc
emails, which I cannot find at the moment.)

Assuming we don't want to change the default minimal version from PTX 6.0
back to 6.3, it looks as both should stay.
Downside: we probably need one lib{c,gomp,gfortran,...} per PTX version,
i.e. 4 versions (3.1, 6.0, 6.3, 7.0).

I think it makes sense to expose the 6.0 value to the user and not
only use it internally behind the scenes. As it is already used internally,
the change is tiny but user visible. Thus, it has to stay when we will
bump the default in later GCC versions; on the other hand, if we bump
the default, it might be also a good reason to have it to permit the
user to have a backward compatible PTX output for linking libraries.

 * * *

SM version (-misa=)
[Patch adds -misa=sm_70]

* The compiler supports internally: SM_30, SM_35, SM_53, SM_70, SM_75, SM_80.
* GCC <= 11 only had sm_30 and sm_35 (supported since PTX 3.1/CUDA 5.0)
* GCC 12 exposes
  - sm_30, sm_35,
  - sm_53 (PTX 4.2, CUDA 7.0),
  - sm_75 (PTX 6.3, CUDA 10.0)
  - sm_80 (PTX 7.0, CUDA 11.0)
  but it does not permit using -misa=sm_70 (PTX 6.0, CUDA 9.0).
* Note: sm_75 + sm_80 imply a newer PTX version, which
  the compiler defaults to (if no -mptx= has been specified).

I think it makes sense to have sm_70 in addition:
* sm_70 enables several new features (see PTX documentation)
* sm_70 is the highest supported for CUDA 9 (default PTX version);
  as sm_75 will require CUDA 10, currently only sm_53 can be used with CUDA 9.
* The current code actually does generate different code for >= sm_70
  already.

 * * *

This patch updates -misa= and -mptx= documentation to match what actually has
been implemented. I think that makes sense as:
* The currently documented default for -mptx= is no longer true.
* The available values are already exposed via the diagnostic
* The multilib issue already occurs when the user explicitly specifies -mptx=6.3
  (or -mptx=3.1).
* If needed, we could note that certain PTX or ISA values are experimental.

I think besides > sm_35 being experimental, there is no reason that higher sm_*
should not be used. Except for the pre-existing multilib issue and for the ICE
when bootstrapping with sm_53 (instead of sm_35) as default ISA version.
But that's solved by Roger's patch (pending ME (and then BE) review),
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590545.html

* * *

Comments to any of those three patches (-mptx=6.0, -misa=sm_70, documentation)?
(Lightly tested on x86-64 with nvptx offloading.)
OK? (All, some?)

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
nvptx: Add -mptx=6.0 + -misa=sm_70

gcc/ChangeLog:

	* config/nvptx/nvptx-c.cc (nvptx_cpu_cpp_builtins): Handle SM70.
	* gcc/config/nvptx/nvptx.cc (first_ptx_version_supporting_sm):
	Likewise.
	* config/nvptx/nvptx.opt (misa): Add sm_70 alias PTX_ISA_SM70.
	(mptx): Add 6.0 alias PTX_VERSION_6_0.
	* config/nvptx/t-omp-device: Add sm_53, sm_70, sm_75, sm_80.
	* doc/invoke.texi (-misa, -mptx): Update for new values and
	defaults.

 gcc/config/nvptx/nvptx-c.cc   |  2 ++
 gcc/config/nvptx/nvptx.cc |  2 ++
 gcc/config/nvptx/nvptx.opt|  6 ++
 gcc/config/nvptx/t-omp-device |  2 +-
 gcc/doc/invoke.texi   | 17 +++--
 5 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/gcc/config/nvptx/nvptx-c.cc b/gcc/config/nvptx/nvptx-c.cc
index d68b9910d7e..b2375fb5b16 100644
--- a/gcc/config/nvptx/nvptx-c.cc
+++ b/gcc/config/nvptx/nvptx-c.cc
@@ -43,6 +43,8 @@ nvptx_cpu_cpp_builtins (void)
 cpp_define