[Bug target/88981] [nvptx, openacc, libgomp] How to handle async regions without corresponding wait

2019-01-22 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88981

--- Comment #3 from Tom de Vries  ---
Thomas,

any comments to add from an OpenACC perspective? What is the correct or
desirable behaviour?

Thanks,
- Tom

--- Comment #2 from Tom de Vries  ---
One thing worth noting: with #pragma acc wait added, the program (compiled
with -O0) takes ~10 seconds to finish on my Quadro 1200m.

Without the pragma acc wait, it still takes 10 seconds.

Inspecting in a debugger where the program waits in the latter case (there is
no explicit wait responsible for it), it hangs in either cuMemFree or
cuCtxDestroy.  I can't find this blocking behaviour documented, so it may be
specific to the driver version, card, or architecture.
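
For illustration only, a minimal sketch of the two variants being compared.
This is not the test case attached to the bug; the helper name run_region and
its do_wait parameter are hypothetical, and the loop body is a placeholder for
a long-running kernel.

```c
/* Hypothetical helper, not taken from the bug report: run one OpenACC
   region asynchronously and optionally wait for it.  */
static void
run_region (int *a, int n, int do_wait)
{
  /* 'async' lets the host thread continue while the region executes.  */
#pragma acc parallel loop async copyout (a[0:n])
  for (int i = 0; i < n; i++)
    a[i] = 2 * i;

  if (do_wait)
    {
      /* Explicit synchronization point for all async queues.  */
#pragma acc wait
    }

  /* With do_wait == 0 nothing here synchronizes with the region, so the
     host must not read 'a' after returning.  */
}
```

Calling run_region (a, n, 0) from main and returning immediately corresponds
to the "without the pragma acc wait" case above. When built without -fopenacc
the pragmas are ignored and the loop simply runs on the host.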

Tom de Vries changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |openacc
             Target|                            |nvptx
                 CC|                            |cltang at gcc dot gnu.org,
                   |                            |tschwinge at gcc dot gnu.org

--- Comment #1 from Tom de Vries  ---
Chung-Lin, 

how would this test case be handled with the async patch set for GCC 10 stage
1? Is anything done for this in the generic OpenACC code?

Thanks,
- Tom