================
@@ -548,6 +551,12 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles,
const ArgList &Args,
if (!Triple.isNVPTX() && !Triple.isSPIRV())
CmdArgs.push_back("-Wl,--no-undefined");
+ // The device inputs are bitcode stored in files with an object extension.
+ // Force the IR input language so Clang runs the compile and backend phases
+ // instead of treating them as linker inputs, which would defer codegen to
+ // the LTO link and defeat the non-LTO pipeline.
+ if (NonLTOAMDGPU)
+ CmdArgs.append({"-x", "ir"});
----------------
yxsamliu wrote:
Good point on PGO. The profile runtime isn't `-mlink`'d, so I now keep LTO when
`-fprofile-generate` is set. Only plain non-RDC builds take the non-LTO path, so
profile generation still links and optimizes the runtime as before. This does
highlight the real gap you mentioned: the non-RDC non-LTO path can't link
device-side compiler-rt libraries properly, which is part of why the unified
RDC/non-RDC interface described in the FIXME would help.
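
To make the rule above concrete, here is a rough sketch of the decision as I
described it. It is a simplified model and not the code in the patch: the
struct, its fields, and the function name are hypothetical, and only the
`NonLTOAMDGPU` flag in the hunk above corresponds to something real.

```cpp
// Hypothetical summary of the driver state this decision depends on.
struct DeviceLinkOptions {
  bool IsAMDGPU = false;              // device triple is AMDGPU
  bool RelocatableDeviceCode = false; // RDC mode requested
  bool ProfileGenerate = false;       // -fprofile-generate is set
};

// Returns true when the non-LTO device pipeline would be chosen.
// Assumption: only plain non-RDC AMDGPU compiles qualify, and profile
// generation always stays on LTO so the profile runtime is still linked
// and optimized.
bool useNonLTOPipeline(const DeviceLinkOptions &Opts) {
  if (!Opts.IsAMDGPU)
    return false;
  if (Opts.RelocatableDeviceCode)
    return false;
  if (Opts.ProfileGenerate)
    return false;
  return true;
}
```

Under this model, adding `-fprofile-generate` to an otherwise plain non-RDC
build flips the result back to the LTO path.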
https://github.com/llvm/llvm-project/pull/201135