Nick Kledzik <[EMAIL PROTECTED]> writes:

> On Jun 4, 2008, at 12:44 PM, Ian Lance Taylor wrote:
>> Chris Lattner <[EMAIL PROTECTED]> writes:
>>
>>>> * The return value of lto_module_get_symbol_attributes is not
>>>> defined.
>>>
>>> Ah, sorry about that. Most of the details are actually in the public
>>> header. The result of this function is a 'lto_symbol_attributes'
>>> bitmask. This should be more useful and revealing:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/lto.h?revision=HEAD&view=markup
>>
>> From an ELF perspective, this doesn't seem to have a way to indicate a
>> common symbol, and it doesn't provide the symbol's type.
> The current lto interface does return whether a symbol is
> REGULAR, TENTATIVE, WEAK_DEF, or UNDEFINED. There is also
> CODE vs DATA which could be used to indicate STT_FUNC vs STT_OBJECT.

By "type" I mean STT_FUNC or STT_OBJECT. I took CODE vs. DATA to refer
to the section in which the symbol is defined (SHF_EXECINSTR vs.
SHF_WRITE). But, you're right, with appropriate squinting CODE vs. DATA
is probably adequate.

> I see you have your gold hat on here! The current interface is
> simple and clean. If it does turn out that repeated calls to
> lto_module_get_symbol* are really a bottleneck, we could add a
> "bulk" function.

I would like to add the bulk function now, because I know that we will
want it.

>>>> The LLVM
>>>> interface does not do that.
>>>
>>> Yes it does, the linker fully handles symbol resolution in our model.
>>>
>>>> Suppose the linker is invoked on a
>>>> sequence of object files, some with LTO information, some
>>>> without, all interspersed. Suppose some symbols are defined in
>>>> multiple .o files, through the use of common symbols, weak symbols,
>>>> and/or section groups. The LLVM interface simply passes each object
>>>> file to the plugin.
>>>
>>> No, the native linker handles all the native .o files.
>>>
>>>> The result is that the plugin is required to do
>>>> symbol resolution itself. This 1) loses one of the benefits of
>>>> having
>>>> the linker around; 2) will yield incorrect results when some non-LTO
>>>> object is linked in between LTO objects but redefines some earlier
>>>> weak symbol.
>>>
>>> In the LLVM LTO model, the plugin only needs to know about its .o
>>> files, and the linker uses this information to reason about symbol
>>> merging etc. The Mac OS X linker can even do dead code stripping
>>> across Macho .o files and LLVM .bc files.
>>
>> To be clear, when I said object file here, I meant any input file.
>> You may have understood that.
>>
>> In ELF you have to think about symbol overriding. Let's say you link
>> a.o b.o c.o. a.o has a reference to symbol S. b.o has a strong
>> definition. c.o has a weak definition. a.o and c.o have LTO
>> information, b.o does not. ELF requires that a.o call the symbol from
>> b.o, not the symbol from c.o. I don't see how to make that work with
>> the LLVM interface.
> This does work. There are two parts to it. First the linker's master
> symbol table sees the strong definition of S in b.o and the weak in
> c.o and decides to use the strong one from b.o. Second (because of
> that) the linker calls lto_codegen_add_must_preserve_symbol("S").
> The LTO engine then sees it has a weak global function S and it
> cannot inline those. Put together the LTO engine does generate a
> copy of S, but the linker throws it away and uses the one from b.o.

OK, for that case. But are you asserting that this works in all cases?
Should I come up with other examples of mixing LTO objects with non-LTO
objects using different types of symbols?

Ian
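
For reference, the a.o / b.o / c.o case above can be written down as a small model. This is only a sketch under stated assumptions: the types and the functions `resolve` and `must_preserve` are hypothetical names invented for illustration and are not part of lto.h. The preserve rule covers only the situation described in this thread, and says nothing about common symbols or section groups.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model for illustration; none of these names come from lto.h. */
typedef enum { SYM_UNDEFINED = 0, SYM_WEAK = 1, SYM_STRONG = 2 } sym_kind;

typedef struct {
    const char *file;  /* input file, in command-line order */
    int has_lto;       /* nonzero if the file carries LTO information */
    sym_kind kind;     /* how this file references or defines S */
} input_sym;

/* The example from the thread: link a.o b.o c.o. */
const input_sym example_inputs[] = {
    { "a.o", 1, SYM_UNDEFINED },  /* references S */
    { "b.o", 0, SYM_STRONG },     /* strong definition, no LTO information */
    { "c.o", 1, SYM_WEAK },       /* weak definition */
};

/* Step 1: the linker's symbol table picks the winning definition of S.
   A strong definition beats a weak one; among equals the earliest file wins.
   Returns the index of the winner, or -1 if S stays undefined.
   (Two strong definitions would be an error in a real linker; not modeled.) */
int resolve(const input_sym *in, size_t n) {
    assert(in != NULL || n == 0);
    int winner = -1;
    for (size_t i = 0; i < n; i++) {
        if (in[i].kind == SYM_UNDEFINED)
            continue;
        if (winner < 0 || in[i].kind > in[winner].kind)
            winner = (int)i;
    }
    return winner;
}

/* Step 2 (assumed rule): the linker asks the LTO engine to preserve S when
   the winning definition lives in a non-LTO file while some LTO file also
   defines S, so the LTO copy is neither inlined nor treated as the winner. */
int must_preserve(const input_sym *in, size_t n, int winner) {
    if (winner < 0 || in[winner].has_lto)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (in[i].has_lto && in[i].kind != SYM_UNDEFINED)
            return 1;
    return 0;
}
```

With `example_inputs`, `resolve` selects index 1 (b.o) and `must_preserve` returns 1, which corresponds to the linker calling lto_codegen_add_must_preserve_symbol("S") and later discarding the copy of S generated from c.o. Whether the same two steps give the right answer for other mixes of symbol kinds is exactly the open question.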