https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122280
--- Comment #18 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
As for the valgrind output of clang: valgrind of course can't access the GPU memory, so one would expect a bunch of pointers where it does not know whether the memory was freed. However, it also sees a leak in an ordinary glibc malloc connected to libcuda. Since the example program only creates an STL vector, which releases itself via its destructor, the host-side malloc that valgrind reports as not freed must be something from CUDA. (Valgrind can of course also have false positives here, but it marks memory it simply has no access to separately.)

For the gcc output, however, the messages that CUDA symbols cannot be found are of course devastating. Probably this is because the driver expects CUDA 13 code for sm_120, while gcc compiles CUDA 12 code for sm_89, which may not be binary compatible even if NVIDIA claims it is? Or it may be a problem with recent kernels (6.17.8) that have updated memory-protection code, where CUDA may have been lazy in the past? Could it be that gcc by mistake picks up the sm-related files installed on the system for clang and then compiles wrong code? Probably not...

When compiling with gcc, LD_LIBRARY_PATH is set to
LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/REDIST/compilers/lib/lib:
When compiling with clang, I have to set it to
LD_LIBRARY_PATH=/usr/lib64/nvptx64-nvidia-cuda/
as otherwise clang complains that it can't find a necessary .bc file. With that setting, the clang output works correctly. The gcc-built binary, however, reports that it cannot find some CUDA symbols and produces nonsense output.
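
For reference, what I mean by the test case is essentially a minimal sketch like the one below (not the actual reproducer attached to this PR; the vector size and the OpenMP target reduction are only placeholders for illustration). The only heap allocation in user code is the std::vector buffer, and that is freed by the destructor, so any glibc malloc that valgrind still reports as lost on such a program has to come from the CUDA/offload runtime rather than from the program itself.

#include <cstdio>
#include <vector>

int main() {
    const int n = 1024;
    // the only user-side heap allocation; freed automatically by ~vector()
    std::vector<double> v(n, 1.0);
    double *p = v.data();

    double sum = 0.0;
    // trivial offload region, just so the offload runtime / libcuda gets initialized
    #pragma omp target teams distribute parallel for reduction(+:sum) map(to: p[0:n])
    for (int i = 0; i < n; ++i)
        sum += p[i];

    std::printf("sum = %f\n", sum);  // expected: 1024.000000
    return 0;
}

(Built with something like g++ -O2 -fopenmp -foffload=nvptx-none, or clang++ -O2 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda; the exact flags of the real reproducer are in the earlier comments.)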
