https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593

--- Comment #6 from Fangrui Song <i at maskray dot me> ---
(In reply to Alexander Monakov from comment #5)
> Hm, I still don't think I'm misunderstanding what you're saying. I'm
> familiar with the ELF standard (and FWIW I have read your blog posts on
> related matters). I am responding to this sentiment from the opening comment:
> 
> > I believe ld -Bsymbolic-functions can materialize most of the savings other
> > implementations provide, without introducing complex things to ELF.
> > However, since -Bsymbolic-functions doesn't play well with -fno-pic's
> > canonical PLT entries, we should fix -fno-pic.
> 
> I am saying that fixing -fno-pic is not the only possible way forward.
> Rather, a restricted -Bsymbolic-functions that relaxes relocations that are
> not address-significant allows to still get some (but not all) of the
> benefits for unchanged -fno-pic executables.

You are right. A pure linker approach is possible. However, I think the
approach is inelegant, because the linker would have different preemptibility
ideas on
different relocation types and (as you said) indirect calls like vtable
definitions
are not optimized.

Let's say the proposed linker option for shared objects is -Bsymbolic-plt.
The discussion below focuses on default visibility definitions which would
otherwise be preemptible.

Let categorize relocation types first.

PLT-generating: R_X86_64_PLT32
GOT-generating: R_X86_64_GOTPCREL, R_X86_64_GOTPCRELX, R_X86_64_REX_GOTPCRELX
absolute (symbolic): R_X86_64_64

There are three choices.

(a) If all relocation types are PLT-generating, bind branch targets directly
and suppress the PLT entry.
If GOT-generating/absolute relocations are present, don't change behaviors.
This choice is less effective for some otherwise address-insignificant
functions, e.g. non-vague-linkage virtual functions.

b) If all relocation types are R_X86_64_PLT32 or GOT-generating, bind branch
targets directly and suppress the PLT entry.
If GOT-generating relocations are present, produce a GOT entry and an
associated R_X86_64_GLOB_DAT.
If absolute relocations are present,  don't change behaviors.

c) Always bind branch targets directly and suppress the PLT entry.
If GOT-generating relocations are present, produce a GOT entry and an
associated R_X86_64_GLOB_DAT.
If absolute relocations are present, produce outstanding dynamic relocations of
the same type.


> > You misunderstand this. Emitting GOT-generating relocation in -fno-pic mode
> > is the only way to avoid canonical PLT entry, if the function turns out to
> > be defined in a shared object. No -Bsymbolic variant can make this
> > compatible.
> 
> Well, if you frame the goal as "eliminate canonical PLT entries", then yes,
> but that in itself surely is not the end goal? The end goals are reducing
> startup time (which my idea helps only partially since it may bind direct
> calls but not e.g. vtable definitions) and runtime overheads (where again my
> proposal is weaker but not significantly so, assuming address loads are
> rarely on hot paths).

Yes, the end goal is to reduce startup time and bind call targets directly if
feasible.
Yes, -Bsymbolic-plt can help the goal partially.

> 
> To clarify once more. I am not outright rejecting the idea in your opening
> comment. I am saying that there potentially is a lighter-weight alternative,
> which may be implementable purely in the linker, and still gets most of the
> benefit you're promoting (like in your Clang example). Which is nice,
> because it can be rolled out sooner, individual libraries/distros/users can
> opt-in and experiment as they like, etc.

Such a -Bsymbolic-plt can achieve some goals.
But given that the function pointer equality problems are usually benign
(-fno-pic is relatively uncommon in many areas; making use of such pointer
equality is not a common practice),
I'd hope we just don't add that intermediate linker option.

Reply via email to