https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122281
--- Comment #17 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
What is interesting is what valgrind reports when I run the clang-compiled binaries:

==10764== 16 bytes in 1 blocks are definitely lost in loss record 18 of 2,872
==10764==    at 0x55E68D8: malloc (vg_replace_malloc.c:447)
==10764==    by 0x13D3CE47: ??? (in /usr/lib64/libcuda.so.580.105.08)
==10764==    by 0x13E2C356: ??? (in /usr/lib64/libcuda.so.580.105.08)
==10764==    by 0x13D28C22: ??? (in /usr/lib64/libcuda.so.580.105.08)
==10764==    by 0x57BB5BE: start_thread (in /usr/lib64/libc.so.6)
==10764==    by 0x584E283: clone (in /usr/lib64/libc.so.6)

That does not look like it is my fault; I do try to clean up the memory I allocate. Valgrind also reported a number of "possible leaks", all in system libraries connected to libcuda (perhaps because memory "vanishes" into the GPU?). By now I believe I pair every new, malloc, and omp_target_alloc with the matching delete, free, or omp_target_free (the pairing I mean is sketched at the end of this comment), and the same problem appears in all of my compiled binaries.

When offloading is combined with OpenMPI (an implementation of the Message Passing Interface), clang also runs into memory problems for any application that uses the nvptx offload runtime and initializes OpenMPI, no matter the code, even a hello world:
https://github.com/llvm/llvm-project/issues/162586

Should I bring all this to NVIDIA's attention? Perhaps they can also tell why it does not run correctly with gcc. This report may also be interesting for NVIDIA, but I don't know:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122280
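For illustration, this is the alloc/free pairing I mean. It is a minimal sketch, not code from my project, assuming an OpenMP 5.x offload runtime:

#include <omp.h>
#include <cstdio>

int main() {
    const int n = 1024;
    const int dev = omp_get_default_device();

    // A device allocation from omp_target_alloc must be released with
    // omp_target_free on the same device number.
    double *d = static_cast<double *>(
        omp_target_alloc(n * sizeof(double), dev));
    if (d == nullptr) {
        std::fprintf(stderr, "omp_target_alloc failed\n");
        return 1;
    }

    // Use the raw device pointer inside a target region.
    #pragma omp target is_device_ptr(d) device(dev)
    for (int i = 0; i < n; ++i)
        d[i] = 2.0 * i;

    omp_target_free(d, dev);  // pairs with the omp_target_alloc above
    return 0;
}

Even a binary built only from this (with clang -fopenmp -fopenmp-targets=nvptx64) shows the libcuda frames in valgrind, so the "definitely lost" block above seems to come from the driver, not from my allocations.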

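Since all the leaked frames sit inside libcuda.so, one could filter them out with a valgrind suppression so that only leaks from one's own code remain visible. A sketch (the suppression name is my own invention; the patterns match the trace above):

{
   libcuda-driver-leak
   Memcheck:Leak
   match-leak-kinds: definite,possible
   fun:malloc
   ...
   obj:*/libcuda.so*
}

Saved as cuda.supp, it would be used as: valgrind --suppressions=cuda.supp ./app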