[Bug fortran/97070] Discrepancy in results between OpenMP/OpenACC

2020-10-05 Thread venetis at ceid dot upatras.gr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97070

--- Comment #5 from Ioannis E. Venetis  ---
I am sorry for coming back to this and for the confusion, but my previous
report that the problem was solved turned out to be wrong.

I was getting the correct result because the code was running on the CPU, not
the GPU.

After some more experimentation and setting GOMP_DEBUG=1 I am now certain that
the code runs on the GPU, but the results are wrong for OpenACC. Hence, the
problem remains.
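
One way to separate a host run from a GPU run is to execute the same binary twice and compare the outputs. The following is only a sketch: it assumes that libgomp honours ACC_DEVICE_TYPE with the values "host" and "nvidia", and that the GOMP_DEBUG=1 log is written to stderr. The function names run_on and compare_runs are hypothetical helpers, not part of GCC.

```shell
#!/bin/sh
# Sketch: run one OpenACC program on the host and on the GPU and compare.
# Assumptions: ACC_DEVICE_TYPE accepts "host" and "nvidia";
# GOMP_DEBUG=1 writes its log to stderr.

# run_on DEVICE OUTDIR COMMAND...
# Runs COMMAND with the given device type; stdout goes to
# OUTDIR/DEVICE.out and stderr (including the GOMP_DEBUG log)
# goes to OUTDIR/DEVICE.log.
run_on() {
    device=$1
    outdir=$2
    shift 2
    ACC_DEVICE_TYPE="$device" GOMP_DEBUG=1 "$@" \
        >"$outdir/$device.out" 2>"$outdir/$device.log"
}

# compare_runs COMMAND...
# Prints "same" if host and nvidia runs produce identical stdout,
# "differ" otherwise. The log directory is reported on stderr.
compare_runs() {
    outdir=$(mktemp -d) || return 1
    run_on host "$outdir" "$@"
    run_on nvidia "$outdir" "$@"
    if cmp -s "$outdir/host.out" "$outdir/nvidia.out"; then
        echo "same"
    else
        echo "differ"
    fi
    echo "logs kept in $outdir" >&2
}
```

For example, `compare_runs ./test_acc` (where ./test_acc is a placeholder for the compiled example) prints "differ" if the two runs disagree. The nvidia.log file can then be inspected for messages from the nvptx plugin to confirm that the kernel was really launched on the device.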

2020-10-04 Thread venetis at ceid dot upatras.gr via Gcc-bugs

--- Comment #4 from Ioannis E. Venetis  ---
It seems that the problem was indeed some kind of confusion due to multiple gcc
installations. I removed the nvptx-related packages of gcc-9.3.0 from my Ubuntu
16.04.7 LTS system (packages gcc-9-offload-nvptx, libgomp-plugin-nvptx1 and
nvptx-tools). After recompiling the test application, I get the correct results
with OpenACC (and with OpenMP).

Why this problem happens and why it only affects OpenACC might still be
interesting to investigate.

2020-09-28 Thread venetis at ceid dot upatras.gr via Gcc-bugs

--- Comment #3 from Ioannis E. Venetis  ---
This is weird. I just downloaded gcc from git and built version 11.0:

$ /home/venetis/apps/gcc-20200928/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/venetis/apps/gcc-20200928/bin/gcc
COLLECT_LTO_WRAPPER=/home/venetis/apps/gcc-20200928/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-20200928/configure --enable-offload-targets=nvptx-none
--with-cuda-driver-include=/usr/local/cuda/include
--with-cuda-driver-lib=/usr/local/cuda/lib64 --disable-bootstrap
--disable-multilib --enable-languages=c,c++,fortran,lto
--prefix=/home/venetis/apps/gcc-20200928
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20200928 (experimental) (GCC)

I am still getting the wrong results with OpenACC. I can see three
possibilities.

1) I built gcc the wrong way. I have attached the script I am using to build
gcc. It is a slightly modified version of what I found here:
https://gist.github.com/matthiasdiener/e318e7ed8815872e9d29feb3b9c8413f

I manually created a tarball of the code downloaded from git so that I could
make minimal changes to the script I had.

2) The wrong run-time libraries are used during execution of the example, since
gcc is installed in a non-default path. I have tried with and without setting:
LD_LIBRARY_PATH=/home/venetis/apps/gcc-20200928/lib:/home/venetis/apps/gcc-20200928/lib64

Unfortunately I get wrong results in both cases.

3) The wrong nvptx tools and libraries are used during compilation of the
example, as my system (Ubuntu 16.04.7 LTS) also has the corresponding packages
for gcc 9.3.0 installed.

How can I make certain that my compilation and execution of the example are
using all tools and libraries from my custom build?
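
A possible way to check this is sketched below. It assumes the install prefix shown in the gcc -v output above, and it uses gcc's -print-prog-name and -print-file-name options, ldd, and the glibc LD_DEBUG=libs facility. The helper names under_prefix, report, check_toolchain and check_runtime are hypothetical.

```shell
#!/bin/sh
# Sketch: check that compile-time and run-time pieces come from the
# custom build. PREFIX is an assumption taken from the gcc -v output.
PREFIX=${PREFIX:-/home/venetis/apps/gcc-20200928}

# under_prefix PATH PREFIX: succeed if PATH lies below PREFIX.
under_prefix() {
    case $1 in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# report NAME PATH: say whether PATH belongs to the custom build.
# gcc prints the bare name when it cannot find a file, which is
# reported as FOREIGN here as well.
report() {
    if under_prefix "$2" "$PREFIX"; then
        echo "ok: $1 -> $2"
    else
        echo "FOREIGN: $1 -> $2"
    fi
}

# check_toolchain: ask the custom gcc where it finds its helpers.
check_toolchain() {
    gcc_bin=$PREFIX/bin/gcc
    report lto-wrapper "$("$gcc_bin" -print-prog-name=lto-wrapper)"
    report libgomp.so "$("$gcc_bin" -print-file-name=libgomp.so)"
    report libgomp-plugin-nvptx.so \
        "$("$gcc_bin" -print-file-name=libgomp-plugin-nvptx.so)"
}

# check_runtime BINARY: show which libgomp the dynamic linker picks.
# The nvptx plugin is loaded at run time, so ldd alone may not list
# it; LD_DEBUG=libs makes the loader print its library searches.
check_runtime() {
    ldd "$1" | grep libgomp
    LD_DEBUG=libs "$1" 2>&1 | grep 'libgomp-plugin'
}
```

Calling `check_toolchain` followed by `check_runtime ./test_acc` (with ./test_acc standing in for the compiled example) lists each component with its resolved path. Any line marked FOREIGN, or any path under /usr, would point to the system's gcc 9.3.0 packages being picked up.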

PS: As a side note, I tried OpenACC with nvfortran 20.7 from the NVIDIA HPC SDK
20.7 and I get the correct results for the example.

2020-09-28 Thread venetis at ceid dot upatras.gr via Gcc-bugs

--- Comment #2 from Ioannis E. Venetis  ---
Created attachment 49279
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49279&action=edit
GCC building script

2020-09-27 Thread dominiq at lps dot ens.fr via Gcc-bugs

Dominique d'Humieres changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-09-27
             Status|UNCONFIRMED                 |WAITING
     Ever confirmed|0                           |1

--- Comment #1 from Dominique d'Humieres ---
I don't see that with 10.2.1 or 11.0; the output for both cases is:

ISOUR =   * XMO =1. DCP =2. IS1 =  
 3 IS2 =   24