Re: [Patch] nvptx/mkoffload.cc: Add dummy proc for OpenMP rev-offload table [PR108098]

2023-04-04 Thread Tom de Vries via Gcc-patches
On 4/4/23 11:02, Thomas Schwinge wrote: Hi! Are we going to install such a work-around? Hi, LGTM. Thanks, - Tom Grüße Thomas On 2022-12-19T13:04:43+0100, I wrote: Hi! On 2022-12-16T17:19:00+0100, Tobias Burnus wrote: Seems to be a CUDA JIT issue A Nvidia Driver JIT issue, more

Re: [PATCH, nvptx, 1/2] Reimplement libgomp barriers for nvptx

2022-12-16 Thread Tom de Vries via Gcc-patches
On 9/21/22 09:45, Chung-Lin Tang wrote: Hi Tom, I had a patch submitted earlier, where I reported that the current way of implementing barriers in libgomp on nvptx created a quite significant performance drop on some SPEChpc2021 benchmarks:

Re: [PATCH, nvptx, 2/2] Reimplement libgomp barriers for nvptx: bar.red instruction support in GCC

2022-12-16 Thread Tom de Vries via Gcc-patches
On 9/21/22 09:45, Chung-Lin Tang wrote: Hi Tom, following the first patch. This new barrier implementation I posted in the first patch uses the 'bar.red' instruction. Usually this could've been easily done with a single line of inline assembly. However I quickly realized that because the

Re: nvptx: In 'STARTFILE_SPEC', fix 'crt0.o' for '-mmainkernel' (was: [MentorEmbedded/nvptx-tools] Match standard 'ld' "search" behavior (PR #38))

2022-11-18 Thread Tom de Vries via Gcc-patches
On 11/19/22 00:25, Thomas Schwinge wrote: Hi! Re : On 2022-11-18T11:05:23-0800, I wrote: Actually, in GCC/nvptx target testing, this #38's commit 886a95faf66bf66a82fc0fe7d2a9fd9e9fec2820 "ld: Don't search for

[committed] Don't build readline/libreadline.a, when --with-system-readline is supplied

2022-10-21 Thread Tom de Vries via Gcc-patches
Hi, [ Committed as obvious as per https://gcc.gnu.org/legacy-ml/gcc-patches/2018-12/msg00299.html . ] https://sourceware.org/bugzilla/show_bug.cgi?id=18632 The bundled libreadline is always built, even if the system is ./configure'd --with-system-readline and the built libreadline.a is not

Re: Restore default 'sorry' 'TARGET_ASM_CONSTRUCTOR', 'TARGET_ASM_DESTRUCTOR' (was: [PATCH 1/3] STABS: remove -gstabs and -gxcoff functionality)

2022-10-10 Thread Tom de Vries via Gcc-patches
On 10/10/22 16:19, Thomas Schwinge wrote: With that, OK to push? FWIW, nvptx change looks in the obvious category to me. Thanks, - Tom

[PATCH] Add --without-makeinfo

2022-10-04 Thread Tom de Vries via Gcc-patches
Hi, Currently, we cannot build gdb without makeinfo installed. It would be convenient to work around this by using the configure flag MAKEINFO=/usr/bin/true or some such, but that doesn't work because top-level configure requires a makeinfo of at least version 4.7, and that version check fails

Re: [PING^5] nvptx: Allow '--with-arch' to override the default '-misa' (was: nvptx multilib setup)

2022-09-18 Thread Tom de Vries via Gcc-patches
On 8/6/22 21:20, Thomas Schwinge wrote: Hi Tom! Hi Thomas, thanks for doing this. Series approved. As I mentioned, I'm not completely happy with the multilib name, but I don't think it makes sense to postpone approval for this. Thanks, - Tom Ping. Grüße Thomas On

Re: [committed][nvptx] Add uniform_warp_check insn

2022-09-14 Thread Tom de Vries via Gcc-patches
On 9/14/22 11:41, Thomas Schwinge wrote: Hi Tom! On 2022-02-01T19:31:27+0100, Tom de Vries via Gcc-patches wrote: Hi, On a GT 1030, with driver version 470.94 and -mptx=3.1 I run into: ... FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c \ -DACC_DEVICE_TYPE_nvidia=1

Re: [committed][nvptx] Add bar.warp.sync

2022-09-14 Thread Tom de Vries via Gcc-patches
On 9/14/22 11:41, Thomas Schwinge wrote: Hi Tom! On 2022-02-01T19:31:13+0100, Tom de Vries via Gcc-patches wrote: On a GT 1030 (sm_61), with driver version 470.94 I run into: ... FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c \ -DACC_DEVICE_TYPE_nvidia=1

[PING^2][PATCH][gdb/build] Fix build breaker with --enabled-shared

2022-09-06 Thread Tom de Vries via Gcc-patches
On 7/12/22 15:42, Tom de Vries wrote: [ dropped gdb-patches, since already applied there. ] On 6/27/22 15:38, Tom de Vries wrote: On 6/27/22 15:03, Tom de Vries wrote: Hi, When building gdb with --enabled-shared, I run into: ... ld: build/zlib/libz.a(libz_a-inffast.o): relocation

Re: [PATCH] nvptx: Silence unused variable warning

2022-09-06 Thread Tom de Vries via Gcc-patches
On 8/28/22 13:09, Jan-Benedict Glaw wrote: Hi! The nvptx backend defines ASM_OUTPUT_DEF along with ASM_OUTPUT_DEF_FROM_DECLS. Much like the rs6000 coff target, nvptx triggers an unused variable warning: /usr/lib/gcc-snapshot/bin/g++ -fno-PIE -c -g -O2 -DIN_GCC

Re: [PING] nvptx: forward '-v' command-line option to assembler, linker

2022-09-05 Thread Tom de Vries via Gcc-patches
On 6/7/22 17:41, Thomas Schwinge wrote: Subject: [PING] nvptx: forward '-v' command-line option to assembler, linker From: Thomas Schwinge Date: 6/7/22, 17:41 To: Tobias Burnus , , "Tom de Vries" Hi! On 2022-05-30T09:06:21+0200, Tobias Burnus wrote: On 29.05.22 22:49, Thomas

Re: [PING][PATCH][gdb/build] Fix build breaker with --enabled-shared

2022-07-12 Thread Tom de Vries via Gcc-patches
On 7/12/22 15:59, Iain Sandoe wrote: Hi Tom On 12 Jul 2022, at 14:42, Tom de Vries via Gcc-patches wrote: [ dropped gdb-patches, since already applied there. ] On 6/27/22 15:38, Tom de Vries wrote: On 6/27/22 15:03, Tom de Vries wrote: Hi, When building gdb with --enabled-shared, I run

[PING][PATCH][gdb/build] Fix build breaker with --enabled-shared

2022-07-12 Thread Tom de Vries via Gcc-patches
[ dropped gdb-patches, since already applied there. ] On 6/27/22 15:38, Tom de Vries wrote: On 6/27/22 15:03, Tom de Vries wrote: Hi, When building gdb with --enabled-shared, I run into: ... ld: build/zlib/libz.a(libz_a-inffast.o): relocation R_X86_64_32S against \    `.rodata' can not be

Re: [PATCH][gdb/build] Fix build breaker with --enabled-shared

2022-06-27 Thread Tom de Vries via Gcc-patches
On 6/27/22 15:03, Tom de Vries wrote: Hi, When building gdb with --enabled-shared, I run into: ... ld: build/zlib/libz.a(libz_a-inffast.o): relocation R_X86_64_32S against \ `.rodata' can not be used when making a shared object; recompile with -fPIC ld: build/zlib/libz.a(libz_a-inflate.o):

[PATCH][gdb/build] Fix build breaker with --enabled-shared

2022-06-27 Thread Tom de Vries via Gcc-patches
Hi, When building gdb with --enabled-shared, I run into: ... ld: build/zlib/libz.a(libz_a-inffast.o): relocation R_X86_64_32S against \ `.rodata' can not be used when making a shared object; recompile with -fPIC ld: build/zlib/libz.a(libz_a-inflate.o): warning: relocation against \

[PATCH][gdb/build] Fix gdbserver build with -fsanitize=thread

2022-06-25 Thread Tom de Vries via Gcc-patches
Hi, When building gdbserver with -fsanitize=thread (added to CFLAGS/CXXFLAGS) we run into: ... ld: ../libiberty/libiberty.a(safe-ctype.o): warning: relocation against \ `__tsan_init' in read-only section `.text' ld: ../libiberty/libiberty.a(safe-ctype.o): relocation R_X86_64_PC32 \ against

[committed][gdb/build] Fix build for gcc < 11

2022-06-15 Thread Tom de Vries via Gcc-patches
Hi, When building trunk on openSUSE Leap 15.3 with system gcc 7.5.0, I run into: ... In file included from ../bfd/bfd.h:46:0, from gdb/defs.h:37, from gdb/debuginfod-support.c:19: gdb/debuginfod-support.c: In function ‘bool debuginfod_is_enabled()’:

Re: libgomp nvptx plugin: Split 'PLUGIN_NVPTX_DYNAMIC' into 'PLUGIN_NVPTX_INCLUDE_SYSTEM_CUDA_H' and 'PLUGIN_NVPTX_LINK_LIBCUDA'

2022-05-12 Thread Tom de Vries via Gcc-patches
On 4/28/22 15:45, Thomas Schwinge wrote: Hi Tom! On 2022-04-08T09:35:44+0200, Tom de Vries wrote: On 4/8/22 00:27, Thomas Schwinge wrote: On 2017-01-13T19:11:23+0100, Jakub Jelinek wrote: Especially for distributions it is undesirable to need to have proprietary CUDA libraries and headers

Re: [committed][nvptx] Fix ASM_SPEC workaround for sm_30

2022-04-11 Thread Tom de Vries via Gcc-patches
On 4/7/22 16:17, Thomas Schwinge wrote: Hi! On 2022-03-31T09:40:47+0200, Tom de Vries via Gcc-patches wrote: Newer versions of CUDA no longer support sm_30, and nvptx-tools currently doesn't handle that gracefully when verifying ( https://github.com/MentorEmbedded/nvptx-tools/issues/30

Re: libgomp nvptx plugin: Split 'PLUGIN_NVPTX_DYNAMIC' into 'PLUGIN_NVPTX_INCLUDE_SYSTEM_CUDA_H' and 'PLUGIN_NVPTX_LINK_LIBCUDA' (was: [PATCH] Allow building GCC with PTX offloading even without CUDA

2022-04-08 Thread Tom de Vries via Gcc-patches
On 4/8/22 00:27, Thomas Schwinge wrote: Hi! On 2017-01-13T19:11:23+0100, Jakub Jelinek wrote: Especially for distributions it is undesirable to need to have proprietary CUDA libraries and headers installed when building GCC. --- libgomp/plugin/configfrag.ac.jj 2017-01-13

Re: Proposal to remove '--with-cuda-driver' (was: [wwwdocs][patch] gcc-12: Nvptx updates)

2022-04-06 Thread Tom de Vries via Gcc-patches
On 4/5/22 17:14, Thomas Schwinge wrote: Hi! Still catching up with GCC/nvptx back end changes... %-) In the following I'm not discussing the patch to document "gcc-12: Nvptx updates", but rather one aspect of the "gcc-12: Nvptx updates" themselves. ;-) On 2022-03-30T14:27:41+0200, Tom de

Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-04 Thread Tom de Vries via Gcc-patches
On 4/4/22 13:07, Jakub Jelinek wrote: On Mon, Apr 04, 2022 at 01:05:12PM +0200, Tom de Vries wrote: 2022-04-04 Tom de Vries * testsuite/libgomp.fortran/examples-4/on_device_arch.c: Copy from parent dir. Wouldn't just ! { dg-additional-sources ../on_device_arch.c } work?

Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-04 Thread Tom de Vries via Gcc-patches
On 4/1/22 17:57, Tom de Vries wrote: On 4/1/22 17:38, Jakub Jelinek wrote: On Fri, Apr 01, 2022 at 05:34:50PM +0200, Tom de Vries wrote: Do you perhaps have an idea why it's failing? Because you call on_device_arch_nvptx () outside of !$omp target region, so unless the host device is NVPTX,

Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Tom de Vries via Gcc-patches
On 4/1/22 17:38, Jakub Jelinek wrote: On Fri, Apr 01, 2022 at 05:34:50PM +0200, Tom de Vries wrote: Do you perhaps have an idea why it's failing? Because you call on_device_arch_nvptx () outside of !$omp target region, so unless the host device is NVPTX, it will not be true. That bit does

Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Tom de Vries via Gcc-patches
On 4/1/22 14:28, Thomas Schwinge wrote: Hi Tom! On 2022-04-01T13:24:40+0200, Tom de Vries wrote: When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run into: ... FAIL:

[PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-01 Thread Tom de Vries via Gcc-patches
Hi, When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run into: ... FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \ -DGOMP_NVPTX_JIT=-O0 execution test FAIL:

[committed][libgomp, testsuite, nvptx] Fix dg-output test in vector-length-128-7.c

2022-04-01 Thread Tom de Vries via Gcc-patches
Hi, When running test-case libgomp.oacc-c-c++-common/vector-length-128-7.c on an RTX A2000 (sm_86) with driver 510.60.02 I run into: ... FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-7.c \ -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 \ output

[committed][nvptx, testsuite] Fix gcc.target/nvptx/alias-*.c on sm_80

2022-04-01 Thread Tom de Vries via Gcc-patches
Hi, When running test-cases gcc.target/nvptx/alias-*.c on target board nvptx-none-run/-misa=sm_80 we run into fails because the test-cases add -mptx=6.3, which doesn't support sm_80. Fix this by only adding -mptx=6.3 if necessary, and simplify the test-cases by using ptx_alias feature

[committed][nvptx, testsuite] Fix typo in gcc.target/nvptx/march.c

2022-03-31 Thread Tom de Vries via Gcc-patches
Hi, The dg-options line in gcc.target/nvptx/march.c: ... /* { dg-options "-march=sm_30"} */ ... currently doesn't have any effect because it's missing a space between '"' and '}'. Fix this by adding the missing space. Tested on nvptx. Committed to trunk. Thanks, - Tom [nvptx, testsuite] Fix

[committed][nvptx] Fix ASM_SPEC workaround for sm_30

2022-03-31 Thread Tom de Vries via Gcc-patches
Hi, Newer versions of CUDA no longer support sm_30, and nvptx-tools currently doesn't handle that gracefully when verifying ( https://github.com/MentorEmbedded/nvptx-tools/issues/30 ). There's a --no-verify work-around in place in ASM_SPEC, but that one doesn't work when using -Wa,--verify on

[wwwdocs][patch] gcc-12: Nvptx updates.

2022-03-30 Thread Tom de Vries via Gcc-patches
[ was: Re: [wwwdocs][patch] gcc-12/changes.html: Document -misa update for nvptx ] On 3/3/22 13:27, Tobias Burnus wrote: The current wording, https://gcc.gnu.org/gcc-12/changes.html#nvptx , is outdated and (now wrongly) encourages to use -mptx=. Updated as follows. I've taken these changes

Re: [PATCH][nvptx, doc] Update misa and mptx, add march and march-map

2022-03-30 Thread Tom de Vries via Gcc-patches
On 3/30/22 11:02, Tobias Burnus wrote: On 30.03.22 10:03, Tom de Vries wrote: On 3/29/22 16:47, Tobias Burnus wrote: I think it would be useful to have additionally some wording for the (new in GCC 12/new since today) macros, [...] The macro is defined also if the option is not specified,

Re: [PATCH][nvptx, doc] Update misa and mptx, add march and march-map

2022-03-30 Thread Tom de Vries via Gcc-patches
On 3/29/22 16:47, Tobias Burnus wrote: On 29.03.22 16:28, Tobias Burnus wrote: On 29.03.22 15:39, Tom de Vries wrote: Any comments? I think it would be useful to have additionally some wording for the (new in GCC 12/new since today) macros, Agreed. i.e. something like: ---

Re: [PATCH][nvptx, doc] Update misa and mptx, add march and march-map

2022-03-30 Thread Tom de Vries via Gcc-patches
On 3/29/22 16:28, Tobias Burnus wrote: Hi Tom, On 29.03.22 15:39, Tom de Vries wrote: Any comments? +(e.g.@: @samp{sm_35}).  Valid architecture strings are @samp{sm_30}, +@samp{sm_35}, @samp{sm_53} @samp{sm_70}, @samp{sm_75} and +@samp{sm_80}.  The default target architecture is sm_30.

[committed][nvptx] Add __PTX_ISA_VERSION_{MAJOR,MINOR}__

2022-03-29 Thread Tom de Vries via Gcc-patches
Hi, Add preprocessor macros __PTX_ISA_VERSION_MAJOR__ and __PTX_ISA_VERSION_MINOR__. For the default 6.0, we have: ... $ echo | cc1 -E -dD - 2>&1 | grep PTX_ISA_VERSION #define __PTX_ISA_VERSION_MAJOR__ 6 #define __PTX_ISA_VERSION_MINOR__ 0 ... and for 3.1, we have: ... $ echo | cc1

[PATCH][nvptx, doc] Update misa and mptx, add march and march-map

2022-03-29 Thread Tom de Vries via Gcc-patches
Hi, Update nvptx documentation: - Use meaningful terms: "PTX ISA target architecture" and "PTX ISA version". - Remove invalid claim that "ISA strings must be lower-case". - Add missing sm_xx entries. - Fix default ISA. - Add march, copying misa doc. - Declare misa an march alias. - Add march-map.

[committed][nvptx] Update help text for m64

2022-03-29 Thread Tom de Vries via Gcc-patches
Hi, In the docs we have for m64: ... Ignored, but preserved for backward compatibility. Only 64-bit ABI is supported. ... But with --target-help, we have instead: ... $ gcc --target-help ... -m64 Generate code for a 64-bit ABI. ... which could be interpreted as meaning that generating

[committed][nvptx] Add march-map

2022-03-29 Thread Tom de Vries via Gcc-patches
Hi, Say we have an sm_50 board, and we want to run a benchmark using the highest possible march setting. Currently there's march=sm_30, march=sm_35, march=sm_53, but no march=sm_50. So, we'd need to pick march=sm_35. Likewise, for a test script that handles multiple boards, we'd need a mapping

[committed][nvptx] Add march alias for misa

2022-03-29 Thread Tom de Vries via Gcc-patches
Hi, The target option misa has the following description: ... $ gcc --target-help 2>&1 | grep misa -misa= Specify the PTX ISA target architecture to use. ... The name misa is somewhat poorly chosen. It suggests that for a use -misa=sm_30, sm_30 is the name of a specific

[committed][nvptx] Improve help description of misa and mptx

2022-03-28 Thread Tom de Vries via Gcc-patches
Hi, Currently we have: ... $ gcc --target-help 2>&1 | egrep "misa|mptx" -misa= Specify the version of the ptx ISA to use. -mptx= Specify the version of the ptx version to use. Known PTX ISA versions (for use with the -misa= option): Known PTX

Re: [PATCH][libgomp, testsuite] Fix hardcoded libexec in plugin/configfrag.ac

2022-03-28 Thread Tom de Vries via Gcc-patches
On 3/28/22 14:04, Richard Biener wrote: On Mon, 28 Mar 2022, Andreas Schwab wrote: On Mär 28 2022, Richard Biener via Gcc-patches wrote: OK in principle, but I have no idea on how portable $(libexecdir:\$(exec_prefix)/%=%) is going to be? We already require GNU make, don't we? We

Re: [PATCH][libgomp, testsuite] Fix hardcoded libexec in plugin/configfrag.ac

2022-03-28 Thread Tom de Vries via Gcc-patches
On 3/28/22 10:49, Richard Biener wrote: On Mon, 28 Mar 2022, Tom de Vries wrote: Hi, When building an nvptx offloading configuration on openSUSE Leap 15.3, the site script /usr/share/site/x86_64-unknown-linux-gnu is activated, setting libexecdir to ${exec_prefix}/lib rather than

[PATCH][libgomp, testsuite] Fix hardcoded libexec in plugin/configfrag.ac

2022-03-28 Thread Tom de Vries via Gcc-patches
Hi, When building an nvptx offloading configuration on openSUSE Leap 15.3, the site script /usr/share/site/x86_64-unknown-linux-gnu is activated, setting libexecdir to ${exec_prefix}/lib rather than ${exec_prefix}/libexec: ... | # If user did not specify libexecdir, set the correct target: | #

Re: [PATCH][libgomp, testsuite] Scale down some OpenACC test-cases

2022-03-25 Thread Tom de Vries via Gcc-patches
On 3/25/22 13:35, Thomas Schwinge wrote: Hi! On 2022-03-25T13:08:52+0100, Tom de Vries wrote: On 3/25/22 11:04, Tobias Burnus wrote: On 25.03.22 10:27, Jakub Jelinek via Gcc-patches wrote: On Fri, Mar 25, 2022 at 10:18:49AM +0100, Tom de Vries wrote: [...] Fix this by scaling down the

Re: [PATCH][libgomp, testsuite] Scale down some OpenACC test-cases

2022-03-25 Thread Tom de Vries via Gcc-patches
On 3/25/22 11:04, Tobias Burnus wrote: On 25.03.22 10:27, Jakub Jelinek via Gcc-patches wrote: On Fri, Mar 25, 2022 at 10:18:49AM +0100, Tom de Vries wrote: [...] Fix this by scaling down the failing test-cases. Tested on x86_64-linux with nvptx accelerator. [...] Will defer to Thomas, as it

[PATCH][libgomp, testsuite] Scale down some OpenACC test-cases

2022-03-25 Thread Tom de Vries via Gcc-patches
Hi, When a display manager is running on an nvidia card, all CUDA kernel launches get a 5 seconds watchdog timer. Consequently, when running the libgomp testsuite with nvptx accelerator and GOMP_NVPTX_JIT=-O0 we run into a few FAILs like this: ... libgomp: cuStreamSynchronize error: the launch

Re: [PATCH][libatomic] Fix return value in libat_test_and_set

2022-03-24 Thread Tom de Vries via Gcc-patches
On 3/24/22 11:59, Jakub Jelinek wrote: On Thu, Mar 24, 2022 at 11:01:30AM +0100, Tom de Vries wrote: Shouldn't that be instead return (woldval & ((UWORD) -1 << shift)) != 0; or return (woldval & ((UWORD) ~(UWORD) 0 << shift)) != 0; ? Well, I used '(woldval & wval) == wval' based on

Re: [PATCH][libatomic] Fix return value in libat_test_and_set

2022-03-24 Thread Tom de Vries via Gcc-patches
On 3/24/22 10:02, Jakub Jelinek wrote: On Thu, Mar 24, 2022 at 09:28:15AM +0100, Tom de Vries via Gcc-patches wrote: Hi, On nvptx (using a Quadro K2000 with driver 470.103.01) I ran into this: ... FAIL: gcc.dg/atomic/stdatomic-flag-2.c -O1 execution test ... which minimized to: ... #include

[PATCH][libatomic] Fix return value in libat_test_and_set

2022-03-24 Thread Tom de Vries via Gcc-patches
Hi, On nvptx (using a Quadro K2000 with driver 470.103.01) I ran into this: ... FAIL: gcc.dg/atomic/stdatomic-flag-2.c -O1 execution test ... which minimized to: ... #include <stdatomic.h> atomic_flag a = ATOMIC_FLAG_INIT; int main () { if ((atomic_flag_test_and_set) (&a)) __builtin_abort ();

[committed][nvptx] Use '%' as register prefix

2022-03-22 Thread Tom de Vries via Gcc-patches
Hi, The percentage sign as first character of a ptx identifier can be used to avoid name conflicts, e.g., between user-defined variable names and compiler-generated names. The insn nvptx_uniform_warp_check contains register names without '%' prefix, which potentially could lead to name conflicts

[committed][nvptx] Limit HFmode support to mexperimental

2022-03-22 Thread Tom de Vries via Gcc-patches
Hi, With PR104489 still open and end-of-stage-4 approaching, classify HFmode support as experimental, which is not enabled by default but can be enabled using -mexperimental. This fixes the nvptx build when the default sm_xx is set to sm_53 or higher. Note that we're not using -mfp16 or some

[committed][nvptx] Add mexperimental

2022-03-22 Thread Tom de Vries via Gcc-patches
Hi, Add new option -mexperimental. This allows, rather than developing a new feature to completion in a development branch, to develop a new feature on trunk, without disturbing trunk. The equivalent of the feature branch merge then becomes making the functionality available for

[committed][nvptx] Use .alias directive for mptx >= 6.3

2022-03-22 Thread Tom de Vries via Gcc-patches
Hi, Starting with ptx isa version 6.3, a ptx directive .alias is available. Use this directive to support symbol aliases, as far as possible. The alias support is off by default. It can be turned on using a switch -malias. Furthermore, for pre-sm_75, it's not effective unless the ptx version

[committed][nvptx] Add warp sync at simt exit

2022-03-22 Thread Tom de Vries via Gcc-patches
Hi, Consider this code (with N defined to 1024): ... float v = 0.0; #pragma omp target map(tofrom: v) #pragma omp parallel for simd for (int i = 0 ; i < N; i++) { #pragma omp atomic update v = v + 1.0; } ... It hangs when executing on target board

Re: [PING^2][PATCH][final] Handle compiler-generated asm insn

2022-03-21 Thread Tom de Vries via Gcc-patches
On 3/21/22 14:49, Richard Biener wrote: On Mon, Mar 21, 2022 at 12:50 PM Tom de Vries wrote: On 3/21/22 08:58, Richard Biener wrote: On Thu, Mar 17, 2022 at 4:10 PM Tom de Vries via Gcc-patches wrote: On 3/9/22 13:50, Tom de Vries wrote: On 2/22/22 14:55, Tom de Vries wrote: Hi

Re: [PING^2][PATCH][final] Handle compiler-generated asm insn

2022-03-21 Thread Tom de Vries via Gcc-patches
On 3/21/22 08:58, Richard Biener wrote: On Thu, Mar 17, 2022 at 4:10 PM Tom de Vries via Gcc-patches wrote: On 3/9/22 13:50, Tom de Vries wrote: On 2/22/22 14:55, Tom de Vries wrote: Hi, For the nvptx port, with -mptx-comment we have in pr53465.s: ... // #APP // 9 "gcc/test

Re: [PATCH][openmp] Set location for taskloop stmts

2022-03-18 Thread Tom de Vries via Gcc-patches
On 3/18/22 15:56, Jakub Jelinek wrote: On Fri, Mar 18, 2022 at 03:42:48PM +0100, Tom de Vries wrote: And for NVPTX we somehow lower the taskloop into GIMPLE_ASM or how we end up ICEing? In the nvptx backend, gen_comment (triggering not very frequently atm) uses gen_rtx_ASM_INPUT_loc with as

[committed][openmp] Fix SIMT reduction using TRUTH_{AND,OR}IF_EXPR

2022-03-18 Thread Tom de Vries via Gcc-patches
Hi, Consider test-case pr104952-1.c, included in this commit, containing: ... #pragma omp target map(tofrom:result) map(to:arr) #pragma omp simd reduction(||: result) ... When run on x86_64 with nvptx accelerator, the test-case either aborts or hangs. The reduction clause is translated by

Re: [PATCH][openmp] Set location for taskloop stmts

2022-03-18 Thread Tom de Vries via Gcc-patches
On 3/18/22 14:01, Jakub Jelinek wrote: On Fri, Mar 18, 2022 at 01:44:00PM +0100, Tom de Vries wrote: The test-case included in this patch contains: ... #pragma omp taskloop simd shared(a) lastprivate(myId) ... This is translated to 3 taskloop statements in gimple, visible with

[PATCH][openmp] Set location for taskloop stmts

2022-03-18 Thread Tom de Vries via Gcc-patches
Hi, The test-case included in this patch contains: ... #pragma omp taskloop simd shared(a) lastprivate(myId) ... This is translated to 3 taskloop statements in gimple, visible with -fdump-tree-gimple: ... #pragma omp taskloop private(D.2124) #pragma omp taskloop shared(a) shared(myId)

[PING^2][PATCH][final] Handle compiler-generated asm insn

2022-03-17 Thread Tom de Vries via Gcc-patches
On 3/9/22 13:50, Tom de Vries wrote: On 2/22/22 14:55, Tom de Vries wrote: Hi, For the nvptx port, with -mptx-comment we have in pr53465.s: ... // #APP // 9 "gcc/testsuite/gcc.c-torture/execute/pr53465.c" 1 // Start: Added by -minit-regs=3: // #NO_APP

PING**4 - [PATCH] middle-end: Support ABIs that pass FP values as wider integers.

2022-03-14 Thread Tom de Vries via Gcc-patches
On 3/2/22 20:18, Jeff Law via Gcc-patches wrote: On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote: On Mon, 28 Feb 2022, Tobias Burnus wrote: Ping**3 On 23.02.22 09:42, Tobias Burnus wrote: PING**2 for the ME review or at least comments to that patch, which fixes a build

[committed][nvptx] Use no,yes for attribute predicable

2022-03-10 Thread Tom de Vries via Gcc-patches
Hi, The documentation states about the predicable instruction attribute: ... This attribute must be a boolean (i.e. have exactly two elements in its list-of-values), with the possible values being no and yes. ... The nvptx port has instead: ... (define_attr "predicable" "false,true"

[committed][nvptx] Disable warp sync in simt region

2022-03-10 Thread Tom de Vries via Gcc-patches
Hi, I ran into a hang for this code: ... #pragma omp target map(tofrom: counter_N0) #pragma omp simd for (int i = 0 ; i < 1 ; i++ ) { #pragma omp atomic update counter_N0 = counter_N0 + 1 ; } ... This has to do with the nature of -muniform-simt. It has two modes of

[committed][nvptx] Handle unused result in nvptx_unisimt_handle_set

2022-03-10 Thread Tom de Vries via Gcc-patches
Hi, For an example: ... #pragma omp target map(tofrom: counter_N0) #pragma omp simd for (int i = 0 ; i < 1 ; i++ ) { #pragma omp atomic update counter_N0 = counter_N0 + 1 ; } ... I noticed that the result of the atomic update (%r30) is propagated: ... @%r33

[committed][nvptx] Use bit-bucket operand for atom insns

2022-03-10 Thread Tom de Vries via Gcc-patches
Hi, For an atomic fetch operation that doesn't use the result: ... __atomic_fetch_add (p64, v64, MEMMODEL_RELAXED); ... we currently emit: ... atom.add.u64 %r26, [%r25], %r27; ... Detect the REG_UNUSED reg-note for %r26, and emit instead: ... atom.add.u64 _, [%r25], %r27; ... Likewise for

[committed][nvptx] Use atom.and.b64 instead of atom.b64.and

2022-03-10 Thread Tom de Vries via Gcc-patches
Hi, The ptx manual prescribes the instruction format atom{.space}.op.type but the compiler currently emits: ... atom.b64.and %r31, [%r30], %r32; ... which uses the instruction format atom{.space}.type.op. Fix this by emitting instead: ... atom.and.b64 %r31, [%r30], %r32; ... Tested on

[committed][nvptx] Add multilib mptx=3.1

2022-03-10 Thread Tom de Vries via Gcc-patches
Hi, With commit 5b5e456f018 ("[nvptx] Build libraries with mptx=3.1") the intention was that the ptx isa version for all libraries was switched back to 3.1 using MULTILIB_EXTRA_OPTS, without changing the default 6.0. Further testing revealed that this is not the case, and some libs were still

[committed][nvptx] Restore default to sm_30

2022-03-10 Thread Tom de Vries via Gcc-patches
Hi, With commit 07667c911b1 ("[nvptx] Build libraries with misa=sm_30") the intention was that the sm_xx for all libraries was switched back to sm_30 using MULTILIB_EXTRA_OPTS, without changing the default sm_35. Testing on an sm_30 board revealed that still some libs were built with sm_35, so

[PING][PATCH][final] Handle compiler-generated asm insn

2022-03-09 Thread Tom de Vries via Gcc-patches
On 2/22/22 14:55, Tom de Vries wrote: Hi, For the nvptx port, with -mptx-comment we have in pr53465.s: ... // #APP // 9 "gcc/testsuite/gcc.c-torture/execute/pr53465.c" 1 // Start: Added by -minit-regs=3: // #NO_APP mov.u32 %r26, 0; // #APP //

[committed][nvptx] Build libraries with mptx=3.1

2022-03-03 Thread Tom de Vries via Gcc-patches
Hi, In gcc-5 to gcc-11, the ptx isa version was 3.1. On trunk, the default is now 6.0, which is also what will be the value in the libraries. Consequently, there may be setups with an older driver that worked with gcc-11, but will become unsupported with gcc-12. Fix this by building the

[committed][nvptx] Build libraries with misa=sm_30

2022-03-03 Thread Tom de Vries via Gcc-patches
Hi, In gcc-11, when specifying -misa=sm_30, an executable may still contain sm_35 code (due to libraries being built with the default -misa=sm_35), so it won't run on an sm_30 board. Fix this by building libraries with sm_30, as was the case in gcc-5 to gcc-10. Committed to trunk. Thanks, -

[committed][nvptx] Use --no-verify for sm_30

2022-03-03 Thread Tom de Vries via Gcc-patches
Hi, In PR97348, we ran into the problem that recent CUDA dropped support for sm_30, which inhibited the build when building with CUDA bin in the path, because the nvptx-tools assembler uses CUDA's ptxas to do ptx verification. To fix this, in gcc-11 the default sm_xx was moved from sm_30 to

[committed][nvptx] Add -mptx=_ in gcc.target/nvptx/smxx.c

2022-03-03 Thread Tom de Vries via Gcc-patches
Hi, With target board nvptx-none-run/-mptx=3.1 we run into: ... cc1: error: PTX version (-mptx) needs to be at least 4.2 to support \ selected -misa (sm_53)^M compiler exited with status 1 FAIL: gcc.target/nvptx/sm53.c (test for excess errors) ... Fix this by adding -mptx=_ in sm53.c and

[committed][nvptx] Handle DCmode in define_expand "omp_simt_xchg_{bfly,idx}"

2022-03-01 Thread Tom de Vries via Gcc-patches
Hi, For a test-case doing an openmp target simd reduction on a complex double: ... DOUBLE COMPLEX :: counter_N0 ... !$OMP TARGET SIMD reduction(+: counter_N0) ... we run into: ... during RTL pass: expand b.f90: In function ‘MAIN__._omp_fn.0’: b.f90:23:32: internal compiler error: in

[committed][nvptx] Add nvptx-gen.h and nvptx-gen.opt

2022-03-01 Thread Tom de Vries via Gcc-patches
Hi, Use nvptx-sm.def to generate new files nvptx-gen.h and nvptx-gen.opt, and: - include nvptx-gen.h in nvptx.h, and - add nvptx-gen.opt to extra_options (before nvptx.opt, in case that matters). Tested on nvptx. Committed to trunk. Thanks, - Tom [nvptx] Add nvptx-gen.h and nvptx-gen.opt

[committed][nvptx] Use nvptx-sm.def for t-omp-device

2022-03-01 Thread Tom de Vries via Gcc-patches
Hi, Add a script gen-omp-device-properties.sh that uses nvptx-sm.def to generate omp-device-properties-nvptx. Tested on x86_64 with nvptx accelerator. Committed to trunk. Thanks, - Tom [nvptx] Use nvptx-sm.def for t-omp-device gcc/ChangeLog: 2022-02-25 Tom de Vries *

[committed][nvptx] Add nvptx-sm.def

2022-03-01 Thread Tom de Vries via Gcc-patches
Hi, Add a file gcc/config/nvptx/nvptx-sm.def that lists all sm_xx versions used in the port, like so: ... NVPTX_SM(30, NVPTX_SM_SEP) NVPTX_SM(35, NVPTX_SM_SEP) NVPTX_SM(53, NVPTX_SM_SEP) NVPTX_SM(70, NVPTX_SM_SEP) NVPTX_SM(75, NVPTX_SM_SEP) NVPTX_SM(80,) ... and use it in various places using a

[committed][nvptx, testsuite] Add gcc.target/nvptx/sm*.c

2022-03-01 Thread Tom de Vries via Gcc-patches
Hi, Add a few test-cases that test passing each -misa=sm_xx version and verify that the proper __PTX_SM__ is defined. Tested on nvptx. Committed to trunk. Thanks, - Tom [nvptx, testsuite] Add gcc.target/nvptx/sm*.c gcc/testsuite/ChangeLog: 2022-02-25 Tom de Vries *

[committed][libgomp, testsuite, nvptx] Add -mptx=_ in declare-variant-3-sm*.c

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi, When running with target board unix/-foffload=-mptx=3.1, we run into: ... lto1: error: PTX version (-mptx) needs to be at least 4.2 to support \ selected -misa (sm_53) mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned \ 1 exit status compilation terminated.

[committed][nvptx, testsuite] Add -mptx=_ in nvptx.exp test-cases

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi, When running with target board nvptx-none-run/-mptx=3.1, I run into: ... cc1: error: PTX version (-mptx) needs to be at least 4.2 to support selected \ -misa (sm_53) compiler exited with status 1 FAIL: gcc.target/nvptx/atomic-store-1.c (test for excess errors) ... Fix this and similar

[committed][nvptx] Add -mptx=_

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi, Add an -mptx=_ value, that indicates the default ptx version. It can be used to undo an explicit -mptx setting, so this: ... $ gcc test.c -mptx=3.1 -mptx=_ ... has the same effect as: ... $ gcc test.c ... Tested on nvptx. Committed to trunk. Thanks, - Tom [nvptx] Add -mptx=_

[committed][nvptx, testsuite] Add -misa=sm_30 in nvptx/atomic-store-3.c

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi, When running with target board nvptx-none-run/-misa=sm_70 I run into: ... FAIL: gcc.target/nvptx/atomic-store-3.c scan-assembler-times st.global.u32 1 FAIL: gcc.target/nvptx/atomic-store-3.c scan-assembler-times st.global.u64 1 ... Fix this by adding an explicit -misa=sm_30 in the test-case.

[committed][nvptx, testsuite] Add -misa=sm_30 in nvptx/uniform-simt-2.c

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi, When running with target board nvptx-none-run/-misa=sm_53 we run into: ... cc1: error: PTX version (-mptx) needs to be at least 4.2 to support selected \ -misa (sm_53) compiler exited with status 1 FAIL: gcc.target/nvptx/uniform-simt-2.c (test for excess errors) ... Fix this by adding an

[committed][nvptx, testsuite] Add -misa=sm_35 in nvptx/rotate.c

2022-02-28 Thread Tom de Vries via Gcc-patches
Hi, When running with target board nvptx-none-run/-misa=sm_30 we run into: ... FAIL: gcc.target/nvptx/rotate.c scan-assembler-times shf.l.wrap.b32 1 FAIL: gcc.target/nvptx/rotate.c scan-assembler-times shf.r.wrap.b32 1 FAIL: gcc.target/nvptx/rotate.c scan-assembler-not and.b32 ... Fix this by

Re: [PATCH][libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

2022-02-24 Thread Tom de Vries via Gcc-patches
On 2/24/22 11:09, Jakub Jelinek wrote: On Thu, Feb 24, 2022 at 11:01:22AM +0100, Tom de Vries wrote: [ was: Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70 ] On 2/24/22 09:29, Tom de Vries wrote: I'll try to submit a patch with one or more test-cases. Hi, These test-cases exercise the omp

[PATCH][libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

2022-02-24 Thread Tom de Vries via Gcc-patches
[ was: Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70 ] On 2/24/22 09:29, Tom de Vries wrote: I'll try to submit a patch with one or more test-cases. Hi, These test-cases exercise the omp declare variant construct using the available nvptx isas. OK for trunk? Thanks, - Tom [libgomp,

Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-24 Thread Tom de Vries via Gcc-patches
On 2/22/22 17:03, Tobias Burnus wrote: Hi Tom, On 22.02.22 15:43, Tom de Vries wrote: On 2/17/22 18:24, Tobias Burnus wrote: --- a/gcc/config/nvptx/t-omp-device +++ b/gcc/config/nvptx/t-omp-device @@ -1,4 +1,4 @@ echo kind: gpu > $@ echo arch: nvptx >> $@ -    echo isa: sm_30 sm_35

[committed][nvptx] Add shf.{l,r}.wrap insn

2022-02-24 Thread Tom de Vries via Gcc-patches
On 2/23/22 12:40, Tom de Vries wrote: Hi, Ptx contains funnel shift operations shf.l.wrap and shf.r.wrap that can be used to implement 32-bit left or right rotate. Add define_insns rotlsi3 and rotrsi3. Currently testing. And committed. Thanks, - Tom [nvptx] Add shf.{l,r}.wrap insn
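
[ Editor's sketch, not part of the original mail. ] To make the relation between a funnel shift and a rotate concrete, the C fragment below writes out the usual portable 32-bit rotate idiom next to a small software model of a wrapping funnel shift. The model follows the editor's reading of the PTX ISA description of shf (concatenate two 32-bit operands into 64 bits, shift by the count modulo 32, keep the upper half for a left shift and the lower half for a right shift); it is an assumption-laden illustration, not code from the patch. All function names are hypothetical, and whether a given GCC build maps the idiom to rotlsi3/rotrsi3 depends on the target options in use.

```c
#include <stdint.h>

/* Rotate X left by N bits.  N is reduced modulo 32, so any count is safe
   and no shift by 32 (undefined in C) can occur.  */
static inline uint32_t
rotl32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Rotate X right by N bits, with the same count handling.  */
static inline uint32_t
rotr32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}

/* Software model of a wrapping left funnel shift: form the 64-bit value
   HI:LO, shift it left by N modulo 32, and return the upper 32 bits.  */
static inline uint32_t
funnel_shl_wrap (uint32_t lo, uint32_t hi, unsigned int n)
{
  uint64_t v = ((uint64_t) hi << 32) | lo;
  return (uint32_t) ((v << (n & 31)) >> 32);
}

/* Software model of a wrapping right funnel shift: same 64-bit value,
   shifted right by N modulo 32, returning the lower 32 bits.  */
static inline uint32_t
funnel_shr_wrap (uint32_t lo, uint32_t hi, unsigned int n)
{
  uint64_t v = ((uint64_t) hi << 32) | lo;
  return (uint32_t) (v >> (n & 31));
}
```

Passing the same value as both halves turns the funnel shift into a rotate: funnel_shl_wrap (x, x, n) equals rotl32 (x, n), and funnel_shr_wrap (x, x, n) equals rotr32 (x, n).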

[committed][nvptx] Fix dummy location in gen_comment

2022-02-24 Thread Tom de Vries via Gcc-patches
On 2/23/22 12:58, Thomas Schwinge wrote: Hi! On 2022-02-23T12:14:57+0100, Tom de Vries via Gcc-patches wrote: [ Re: [committed][nvptx] Add -mptx-comment ] On 2/22/22 14:53, Tom de Vries wrote: Add functionality that indicates which insns are added by -minit-regs, such that for instance we

[PATCH][nvptx] Add shf.{l,r}.wrap insn

2022-02-23 Thread Tom de Vries via Gcc-patches
Hi, Ptx contains funnel shift operations shf.l.wrap and shf.r.wrap that can be used to implement 32-bit left or right rotate. Add define_insns rotlsi3 and rotrsi3. Currently testing. Thanks, - Tom [nvptx] Add shf.{l,r}.wrap insn gcc/ChangeLog: 2022-02-23 Tom de Vries *

[PATCH][nvptx] Fix dummy location in gen_comment

2022-02-23 Thread Tom de Vries via Gcc-patches
[ Re: [committed][nvptx] Add -mptx-comment ] On 2/22/22 14:53, Tom de Vries wrote: Hi, Add functionality that indicates which insns are added by -minit-regs, such that for instance we have for pr53465.s: ... // #APP // 9 "gcc/testsuite/gcc.c-torture/execute/pr53465.c" 1 //

Re: [committed][nvptx] Use nvptx_warpsync / nvptx_uniform_warp_check for -muniform-simt

2022-02-23 Thread Tom de Vries via Gcc-patches
On 2/23/22 10:06, Thomas Schwinge wrote: Hi Tom! This is me again, following along GCC/nvptx development, and asking questions. ;-) Yes, thanks for that, that's useful :) On 2022-02-19T20:07:18+0100, Tom de Vries via Gcc-patches wrote: With the default ptx isa 6.0, we have for uniform

Re: [PATCH] middle-end: Support ABIs that pass FP values as wider integers.

2022-02-22 Thread Tom de Vries via Gcc-patches
On 2/22/22 17:08, Roger Sayle wrote: Hi Tom, I'll admit that I'd not myself considered the ABI issues when I initially proposed experimental HFmode support for the nvptx backend, and was surprised when I finally tracked down the source of the problem you'd reported: that libgcc spots HFmode

Re: [PATCH] nvptx: Back-end portion of a fix for PR target/104489.

2022-02-22 Thread Tom de Vries via Gcc-patches
On 2/11/22 11:38, Roger Sayle wrote: This one line fix/tweak is the back-end specific change for a fix for PR target/104489, that allows the ISA for GCC's nvptx backend to be bumped to sm_53.  The machine-independent middle-end pieces were posted here:

Re: [PATCH] middle-end: Support ABIs that pass FP values as wider integers.

2022-02-22 Thread Tom de Vries via Gcc-patches
On 2/9/22 21:12, Roger Sayle wrote: This patch adds middle-end support for target ABIs that pass/return floating point values in integer registers with precision wider than the original FP mode. An example is the nvptx backend where 16-bit HFmode registers are passed/returned as (promoted to)

Re: [PING][PATCH][libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

2022-02-22 Thread Tom de Vries via Gcc-patches
On 5/19/21 16:52, Tom de Vries wrote: On 4/23/21 6:48 PM, Tom de Vries wrote: On 4/23/21 5:45 PM, Alexander Monakov wrote: On Thu, 22 Apr 2021, Tom de Vries wrote: Ah, I see, agreed, that makes sense. I was afraid there was some fundamental problem that I overlooked. Here's an updated

Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-22 Thread Tom de Vries via Gcc-patches
On 2/17/22 18:24, Tobias Burnus wrote: diff --git a/gcc/config/nvptx/t-omp-device b/gcc/config/nvptx/t-omp-device index 8765d9f1881..4228218a424 100644 --- a/gcc/config/nvptx/t-omp-device +++ b/gcc/config/nvptx/t-omp-device @@ -1,4 +1,4 @@ omp-device-properties-nvptx:
