Issue 98151
Summary CUDA: Host-side static class member leaks into PTX as extern global
Labels new issue
Assignees
Reporter mkuron
Consider the following bit of code:
```c++
template <class _Elem>
class codecvt {
    static int id;
};

template <class _Elem>
int codecvt<_Elem>::id;

template class codecvt<char>;
```
Including this in a CUDA file (the class does not even have to be used in a device-side function or variable) leads to warnings like `ptxas warning : Unresolved extern variable '_ZN7codecvtIcE2idE' in whole program compilation, ignoring extern qualifier`. When compiling with `-fgpu-rdc`, it leads to link-time errors like `nvlink error : Undefined reference to '_ZN7codecvtIcE2idE'`. The resulting PTX contains this line: `.extern .global .align 4 .u32 _ZN7codecvtIcE2idE;`.
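To connect the mangled symbol in those diagnostics back to the source, the name can be demangled. This is a minimal sketch, assuming a GCC or Clang toolchain that ships `<cxxabi.h>` (it will not build with MSVC); the helper name `demangle` is made up for illustration.

```c++
#include <cxxabi.h>  // abi::__cxa_demangle (Itanium C++ ABI; GCC/Clang toolchains)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-mangled name,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Calling `demangle("_ZN7codecvtIcE2idE")` yields `codecvt<char>::id`, i.e. the static member of the explicitly instantiated specialization from the snippet above.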

This pattern (a static data member of a class template plus an explicit instantiation) appears several times throughout Microsoft's STL implementation, which makes it impossible, for example, to `#include <locale>` (and various other headers) when compiling CUDA with Clang on Windows. The code above was reduced from https://github.com/microsoft/STL/blob/vs-2019-16.9/stl/inc/xlocale.

Godbolt: https://godbolt.org/z/n1jzMrMqd

When compiling with `-O1`, the `GlobalOptPass` eliminates the unused variable from the device code and the problem goes away. NVCC does not show this issue; it never generates `.extern .global` symbols for host-side static class members. Clang appears to generate these extraneous `.extern .global` symbols only for code that exactly follows the above pattern: eliminating the templates, or even replacing the explicit instantiation with an implicit instantiation, makes the problem go away.
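For comparison, the two variants described as not triggering the problem might look like the sketch below. The names `codecvt_plain`, `codecvt_tmpl` and `touch_ids` are hypothetical, and the members are made `public` only so the helper can read them (the original snippet leaves `id` private). The claim that these forms avoid the stray `.extern .global` line comes from the observation above, not from a separate check.

```c++
// Variant A: no template involved, just a class with a static data member.
class codecvt_plain {
public:
    static int id;
};
int codecvt_plain::id;

// Variant B: the same template shape as the reproducer, but WITHOUT the
// `template class ...<char>;` explicit instantiation.
template <class Elem>
class codecvt_tmpl {
public:
    static int id;
};

template <class Elem>
int codecvt_tmpl<Elem>::id;

// Hypothetical host-side helper: reading codecvt_tmpl<char>::id here
// instantiates the static member implicitly.
inline int touch_ids() {
    return codecvt_plain::id + codecvt_tmpl<char>::id;
}
```

Both members have static storage duration and no initializer, so they are zero-initialized and `touch_ids()` returns 0 on the host.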