> I'd go with .gnu.target_lto* names (i.e. s/.gnu.lto/.gnu.target_lto/
> on the existing LTO section names if they are for the accelerator rather
> than host).

I guess that for now we could go with any naming, as it's far from being
finalized.
> I really have almost zero experience with LTO, but I don't see how you
> could use any resolution for those sections. The resolution handling in
> the linker will be for the host link, you want something along the lines
> of:
> - if you find .gnu.target_lto* sections, feed those (one CU at a time)
>   to the target compiler driver with some magic option that it will use
>   lto1 backend and will read from the .gnu.target_lto* sections instead
>   of .gnu.lto*; and, at least when not -flto, you want it to just
>   generate assembly for the target, and let the target driver also
>   invoke assembler

I'm not an expert in LTO either - I dived into it just a couple of days
ago. However, the general scheme of how LTO works, as I see it, is the
following (as we have it now, without any offloading support):
* collect2 calls ld with the liblto_plugin linker plugin
* liblto_plugin checks every linker input file, and if it contains
  .gnu.lto* sections, the plugin claims this file for further processing
* when all input files are loaded (and thus checked), the plugin creates a
  resolution file and calls lto-wrapper, passing it all the claimed files
* lto-wrapper calls 'gcc -xlto -fwpa' (the WPA phase)
* WPA divides everything into several partitions, based on the callgraph
  it creates
* lto-wrapper calls 'gcc -xlto -fltrans' on the created partitions - this
  is where the compilation occurs
* the resulting object files are fed back to the linker, and it produces
  the final executable

From my POV, we could easily reuse this infrastructure, making almost no
changes to it, because all we want is to call some external programs
(target compiler, target linker), which don't affect host object files and
binaries at all. Everything they do will be in separate files, and the
host infrastructure will never know about them. Also, nothing prevents us
from doing link-time optimizations (in the future) on the target code - we
could run 'gcc_target -flto' from lto-wrapper, only adjusting it to use
the LTO front end instead of C/C++/another language.
> - collect all those target object files from the link, link them
>   together using target compiler driver, and feed back the resulting
>   binary or shared library into the host linking (some magic section in
>   there)

Why do we need to feed the target binary back into the host linking? The
host program cannot directly call any routine from the target binary, so
IMHO there is no point in linking them together - they are just separate
executables.

> But, the target support has to work even without -flto, and for
> debuggability etc. reasons I wouldn't force compiling all the target
> code together unless required by the target.

Well, we could use 1-to-1 partitioning (meaning that the routines from
each input CU are placed in a separate partition).

And the question of multi-target support still remains open here.

Thanks,
Michael

>
> Jakub