On 3/30/22 11:02, Tobias Burnus wrote:
On 30.03.22 10:03, Tom de Vries wrote:
On 3/29/22 16:47, Tobias Burnus wrote:
I think it would be useful to have additionally some wording for the
(new in GCC 12/new since today) macros,
[...]
The macro is defined also if the option is not specified, so I think
this formulation is not 100% clear in that aspect. I've reformulated
to fix that.
Fine. (It was a copy, paste + modify from elsewhere.)
Also, I took out the detail of how the value is determined, since
we're just following __CUDA_ARCH__ rather than defining our own policy.
OK. While I am not sure that it is obvious, also the example makes clear
what value to expect. Combining the two, I concur that the details
aren't required.
Any comments?
LGTM.
Tobias
PS: Regarding the sm_30 -> sm_35 change (before in this email thread).
That was not meant to be in the .texi file, but just as an item to
remember when updating the wwwdocs / gcc-12/changes.html document.
I see, I misunderstood then. FWIW, it's already added to the version in
my sandbox.
It was/is also not completely clear to me whether there is still this
CUDA 11.x issue of not supporting sm_30 (only sm_35 and higher) or not.
Thanks for reminding me of this issue.
I assume it still exists but is mitigated at
compiler-usage/libgomp-runtime-usage time as PTX ISA now defaults to 6.0
such that CUDA – but shouldn't it still see sm_30 instead of sm_35 in
this case?
If so, I think it will still show up when using either explicitly PTX
ISA 3.1 or when building GCC itself and all of the following holds:
nvptx-tools is installed, CUDA (in a too new version) is installed
(ptxas in $PATH), and the pending nvptx-tools pull request that ignores
the non-explicit '--verify' when the .target sm_xx or PTX ISA .version is
not supported by ptxas has not been applied.
I don't think the 6.0 default has any influence (and I'll be using
-mptx=3.1 below to make sure we run into the worst-case behaviour).
Anyway, in absence of an nvptx-tools fix I committed a work-around in
the compiler:
...
#define ASM_SPEC "%{misa=*:-m %*; :-m sm_35}%{misa=sm_30:--no-verify}"
...
Note that this was before reverting the default back to sm_30, and I
probably forgot to update this spot when changing the default.
So now, there are effectively two workarounds in place.
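As a rough illustration of what that spec string does, here is a small C model. It is only a sketch and not GCC code: the function name old_asm_spec_args and the convention of passing NULL when no -misa= option was given are assumptions made for the example; the real logic lives in the driver's spec processing.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the old ASM_SPEC; not GCC code.
   misa is the value of an explicit -misa= option, or NULL if none
   was given on the command line.  */
static void old_asm_spec_args(const char *misa, char *buf, size_t len)
{
    if (misa != NULL)
        /* %{misa=*:-m %*} : forward the explicit value.  */
        snprintf(buf, len, "-m %s", misa);
    else
        /* %{...; :-m sm_35} : used when no -misa= was given.  */
        snprintf(buf, len, "-m sm_35");

    /* %{misa=sm_30:--no-verify} : second workaround, explicit sm_30 only.  */
    if (misa != NULL && strcmp(misa, "sm_30") == 0)
        strncat(buf, " --no-verify", len - strlen(buf) - 1);
}
```

In this model, no option yields "-m sm_35", an explicit sm_30 yields "-m sm_30 --no-verify", and any other explicit value is passed through unchanged.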
This (implicitly using sm_30) passes:
...
$ ( PATH=$PATH:~/cuda/11.6.0/bin; ./gcc.sh ~/hello.c -c -save-temps
-Wa,--verify -mptx=3.1 )
...
because as we can see with -v, sm_35 is used to verify:
...
./build-gcc/gcc/as -m sm_35 --verify -o hello.o hello.s
...
This (explicitly using sm_30) passes:
...
$ ( PATH=$PATH:~/cuda/11.6.0/bin; ./gcc.sh ~/hello.c -c -save-temps
-march=sm_30 -mptx=3.1 )
...
because as we can see with -v, the --no-verify workaround is triggered:
...
./build-gcc/gcc/as -m sm_30 --no-verify -o hello.o hello.s
...
But that one stops working once we use an explicit -Wa,--verify:
...
$ ( PATH=$PATH:~/cuda/11.6.0/bin; ./gcc.sh ~/hello.c -c -save-temps
-Wa,--verify -march=sm_30 -mptx=3.1 )
ptxas fatal : Value 'sm_30' is not defined for option 'gpu-name'
nvptx-as: ptxas returned 255 exit status
...
So, it seems using sm_35 to verify sm_30 is the most robust workaround.
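A minimal C sketch of the selection this amounts to, assuming the three cases are tried in order (explicit sm_30, any other explicit value, no option at all). The function name asm_target_for and the NULL-for-absent convention are hypothetical and only serve the illustration; this is not how the driver implements specs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model: return the value passed to "as -m".
   misa is the value of an explicit -misa= option, or NULL if none
   was given on the command line.  */
static const char *asm_target_for(const char *misa)
{
    /* Explicit misa=sm_30: verify with sm_35 instead.  */
    if (misa != NULL && strcmp(misa, "sm_30") == 0)
        return "sm_35";

    /* Catch-all: any other explicit value is forwarded as-is.  */
    if (misa != NULL)
        return misa;

    /* Implicit misa=sm_30: also verify with sm_35.  */
    return "sm_35";
}
```

With this selection there is no case left that passes sm_30 to the assembler, so an explicit -Wa,--verify no longer reaches ptxas with a target it rejects.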
I'm currently testing the attached patch.
Thanks,
- Tom
[nvptx] Fix ASM_SPEC workaround for sm_30
Newer versions of CUDA no longer support sm_30, and nvptx-tools as
currently doesn't handle that gracefully when verifying
( https://github.com/MentorEmbedded/nvptx-tools/issues/30 ).
There's a --no-verify work-around in place in ASM_SPEC, but that one doesn't
work when using -Wa,--verify on the command line.
Use a more robust workaround: verify using sm_35 when misa=sm_30 is specified
(either implicitly or explicitly).
Tested on nvptx.
gcc/ChangeLog:
2022-03-30 Tom de Vries <tdevr...@suse.de>
* config/nvptx/nvptx.h (ASM_SPEC): Use "-m sm_35" for -misa=sm_30.
---
gcc/config/nvptx/nvptx.h | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h
index 75ac7a666b13..3b06f33032fd 100644
--- a/gcc/config/nvptx/nvptx.h
+++ b/gcc/config/nvptx/nvptx.h
@@ -29,10 +29,24 @@
#define STARTFILE_SPEC "%{mmainkernel:crt0.o}"
-/* Default needs to be in sync with default for misa in nvptx.opt.
- We add a default here to work around a hard-coded sm_30 default in
- nvptx-as. */
-#define ASM_SPEC "%{misa=*:-m %*; :-m sm_35}%{misa=sm_30:--no-verify}"
+/* Newer versions of CUDA no longer support sm_30, and nvptx-tools as
+ currently doesn't handle that gracefully when verifying
+ ( https://github.com/MentorEmbedded/nvptx-tools/issues/30 ). Work around
+ this by verifying with sm_35 when having misa=sm_30 (either implicitly
+ or explicitly). */
+#define ASM_SPEC \
+ "%{" \
+ /* Explicit misa=sm_30. */ \
+ "misa=sm_30:-m sm_35" \
+ /* Separator. */ \
+ "; " \
+ /* Catch-all. */ \
+ "misa=*:-m %*" \
+ /* Separator. */ \
+ "; " \
+ /* Implicit misa=sm_30. */ \
+ ":-m sm_35" \
+ "}"
#define TARGET_CPU_CPP_BUILTINS() nvptx_cpu_cpp_builtins ()