https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122281

--- Comment #23 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
Created attachment 62847
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62847&action=edit
arraytest-compute-sanitizer-log.txt

The program arraytest_mpi, when compiled with gcc, appears to run fine at first
(i.e. I see my GPU running, and the program returns the correct values without
errors).


Attaching Compute Sanitizer with

compute-sanitizer --tool memcheck mpirun -np 12 ./arraytest_mpi  >
/home/benni/arraytest_mpi-compute-sanitizer-log.txt


shows the following errors. First, the typical:

========= Program hit CUDA_ERROR_INVALID_CONTEXT (error 201) due to "invalid
device context" on CUDA API call to cuCtxGetDevice.
=========     Saved host backtrace up to driver entry point at error

and then this:

========= Program hit CUDA_ERROR_INVALID_VALUE (error 1) due to "invalid
argument" on CUDA API call to cuMemRetainAllocationHandle.
=========     Saved host backtrace up to driver entry point at error

With OpenMP, I do not call CUDA functions directly myself in any way.

In my code, I use omp_target_alloc and omp_target_free with correct arguments
(byte counts and the correct device number).
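
For reference, the allocation pattern looks roughly like the following minimal
sketch (this is not the actual arraytest_mpi code; the names and sizes here are
made up for illustration):

#include <omp.h>
#include <stdio.h>

int main(void)
{
    int dev = omp_get_default_device();
    size_t bytes = 1024 * sizeof(double);

    /* Allocate device memory with an explicit byte count and device number. */
    double *dptr = (double *) omp_target_alloc(bytes, dev);
    if (dptr == NULL) {
        fprintf(stderr, "omp_target_alloc failed\n");
        return 1;
    }

    /* ... target regions using is_device_ptr(dptr) would go here ... */

    /* Free with the same device number that was used for the allocation. */
    omp_target_free(dptr, dev);
    return 0;
}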

I have a feeling that this is all related to the transition to CUDA 13 at the
driver level, where CUDA 13 code is expected and CUDA 13 functions are to be
called, while the compilation with gcc still targets CUDA 12.

Maybe it is also due to an incompatibility with a kernel that watches for
memory problems in places where CUDA may have been a bit lazy in its
implementation...

But I don't really know...
