zhouronghua wrote:

> Yes, and my question was if we could manage this automatically somehow via
> `-fdepfile-entry`. I don't know the subtleties here, but from what I can see
> this PR duplicates the depfile and tracks them separately, which I don't
> think is what we want. Ideally we have the single one for the host that also
> knows the device dependencies in addition to the host dependencies.

I'm sorry for misunderstanding the `-fdepfile-entry` option. When we compile with clang like this:

    clang++-20 -c example.cu --cuda-gpu-arch=sm_75 -I/usr/local/cuda/include -D__CUDA_NO_TEXTURE_INTRINSICS__ -MD -MT example.o -MF example.d -save-temps -o example.cu.o -v

the dep file for the sm_75 kernel is generated first. So you mean we first generate an example.d like this:

    "/usr/lib/llvm-20/bin/clang" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-pc-linux-gnu -dependency-file example.d ... ...

and then, when compiling the host side, we add the option `-fdepfile-entry example.d` like this (the `-dependency-file example.d` option is already present in the original command):

    "/usr/lib/llvm-20/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -target-sdk-version=12.6 -fcuda-allow-variadic-functions -aux-triple nvptx64-nvidia-cuda -fdepfile-entry example.d -dependency-file example.d ... ...

so that clang generates a new example.d that depends on the old kernel example.d?

What if we compile kernels for multiple architectures, for example both sm_75 and sm_80? Does it become like this:

    "/usr/lib/llvm-20/bin/clang" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-pc-linux-gnu -target-cpu sm_75 -dependency-file example.d ... ...
    "/usr/lib/llvm-20/bin/clang" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-pc-linux-gnu -target-cpu sm_80 -dependency-file example.d -fdepfile-entry example.d -dependency-file example.d ... ...
    "/usr/lib/llvm-20/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -target-sdk-version=12.6 -fcuda-allow-variadic-functions -aux-triple nvptx64-nvidia-cuda -fdepfile-entry example.d -dependency-file example.d ... ...

with the final example.d containing all dependencies from sm_75, sm_80, and the host?

I tried it, and currently passing the same filename to both `-fdepfile-entry example.d` and `-dependency-file example.d` doesn't seem to work.

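To make the target state concrete, here is a minimal Python sketch of what a single merged `example.d` would have to contain: the union of the prerequisites from each device compile and the host compile, under one target. This is only an illustration of the desired result, not how clang or `-fdepfile-entry` behaves. The header names (`device_only.h`, `sm80_only.h`, `host_only.h`) and the helper functions are hypothetical, and the parser assumes one rule per depfile and paths without colons or escaped spaces.

```python
def parse_depfile(text):
    """Return (target, [prerequisites]) from a single-rule Make-style depfile."""
    # Join backslash-newline continuations into one logical line.
    joined = text.replace("\\\n", " ")
    # Assumption: the only ':' is the target separator (no Windows drive letters).
    target, _, prereqs = joined.partition(":")
    return target.strip(), prereqs.split()


def merge_depfiles(target, texts):
    """Merge prerequisites from several depfiles under one target, deduplicated."""
    seen = {}
    for text in texts:
        _, prereqs = parse_depfile(text)
        for prereq in prereqs:
            seen.setdefault(prereq, None)  # dict preserves first-seen order
    return f"{target}: " + " \\\n  ".join(seen) + "\n"


# Hypothetical per-compilation depfile contents.
sm_75 = "example.o: example.cu \\\n  device_only.h\n"
sm_80 = "example.o: example.cu \\\n  device_only.h \\\n  sm80_only.h\n"
host = "example.o: example.cu \\\n  host_only.h\n"

merged = merge_depfiles("example.o", [sm_75, sm_80, host])
print(merged)
```

The merged rule lists `example.cu` once, followed by the device-only and host-only headers, which is the content a build system would expect to find in the one depfile it collects for `example.o`.
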
Does this mean we need to generate a dependency file with a different suffix (prefixed by the source file name) for each kernel compilation, like `.sm_75.d` or `.sm_80.d`? Then, during the host preprocessing stage, we would pass all kernel dependency files via `-fdepfile-entry` to trigger automatic merging in the Clang frontend.

If this works, there's only one issue left: do we need to delete the .d files generated for the kernel compilations?

If we delete them, it's not much different from the current implementation.

If we don't delete them, it shouldn't be a problem for Makefiles, since the compilation rules are handwritten anyway. But automated build systems like Bazel and CMake, which collect the .d file for each source file to generate dependency rules, would have an extra task: for multi-architecture compilation, they would either need to collect multiple .d files per source, or many redundant .d files would be left behind.

https://github.com/llvm/llvm-project/pull/176072
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
