To avoid indirect branches to internal functions, I am proposing to add a new relocation, R_X86_64_RELAX_GOTPCREL, to the x86-64 psABI:
1. When branching to an external function, foo, the compiler may generate

     call/jmp *foo@GOTRELAX(%rip)

   which generates an R_X86_64_RELAX_GOTPCREL relocation, instead of

     call/jmp foo[@PLT]

2. When function foo is locally defined, the linker converts

     call/jmp *foo@GOTRELAX(%rip)

   to

     nop call/jmp foo

3. Otherwise, the linker treats R_X86_64_RELAX_GOTPCREL the same way as R_X86_64_GOTPCREL.

For the i386 psABI, we add R_386_RELAX_GOT32:

1. When branching to an external function, foo, in non-PIC mode, the compiler may generate

     call/jmp *foo@GOTRELAX

   which generates an R_386_RELAX_GOT32 relocation, instead of

     call/jmp foo

   and in PIC mode

     call/jmp *foo@GOTRELAX(%reg)

   which generates an R_386_RELAX_GOT32 relocation, where REG holds the address of the GOT, instead of

     call/jmp foo@PLT

2. When function foo is locally defined, the linker converts

     call/jmp *foo@GOTRELAX[(%reg)]

   to

     nop call/jmp foo

3. Otherwise,
   a. In PIC mode, the linker treats R_386_RELAX_GOT32 the same way as R_386_GOT32, and "call/jmp *foo@GOTRELAX" is unsupported.
   b. In non-PIC mode, the linker computes its relocation value as the relocation value of R_386_GOT32 plus the address of the GOT, and converts call/jmp *foo@GOTRELAX(%reg) to call/jmp *foo@GOTRELAX if needed.

This new relocation effectively turns off lazy binding on function foo. For i386, the compiler is free to choose any register to hold the address of the GOT, and there is no need to make EBX a fixed register when branching to an external function in PIC mode.

With this new relocation, only a one-byte NOP prefix overhead is added when function foo, which the compiler determines is external, turns out to be local at link time, because of -Bsymbolic or a definition in another input object file which the compiler has no knowledge of.

The new -fno-plt GCC option can use the R_X86_64_RELAX_GOTPCREL and R_386_RELAX_GOT32 relocations, if the linker supports them, to avoid indirect branches to internal functions.

H.J.
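As a rough illustration of rule 2 in the x86-64 case, here is a minimal Python sketch of the byte-level rewrite a linker might perform. It is not taken from any linker: the function name relax_gotpcrel and its parameters are hypothetical, and it assumes the relocation offset points at the 4-byte displacement, as it does for R_X86_64_GOTPCREL. It relies on the standard encodings: FF 15 disp32 for call *disp32(%rip), FF 25 disp32 for jmp *disp32(%rip), 90 for nop, and E8/E9 rel32 for a direct call/jmp. The 6-byte indirect branch becomes a 1-byte nop plus a 5-byte direct branch, so nothing after it moves.

```python
import struct

CALL_INDIRECT = b"\xff\x15"  # call *disp32(%rip)
JMP_INDIRECT = b"\xff\x25"   # jmp  *disp32(%rip)
NOP = 0x90
CALL_REL32 = 0xE8            # call rel32
JMP_REL32 = 0xE9             # jmp  rel32


def relax_gotpcrel(code, reloc_offset, section_addr, target_addr):
    """Rewrite call/jmp *foo@GOTRELAX(%rip) into nop; call/jmp foo.

    All parameter names are hypothetical, chosen for this sketch:
    code         -- bytearray with the section contents
    reloc_offset -- offset in code of the 4-byte displacement field
    section_addr -- run-time address of code[0]
    target_addr  -- run-time address of the locally defined foo
    Returns True if the instruction was rewritten, False if the bytes
    before the displacement are not a recognized indirect branch.
    """
    if reloc_offset < 2 or reloc_offset + 4 > len(code):
        return False
    opcode = bytes(code[reloc_offset - 2:reloc_offset])
    if opcode == CALL_INDIRECT:
        direct = CALL_REL32
    elif opcode == JMP_INDIRECT:
        direct = JMP_REL32
    else:
        return False  # leave it alone; handle like a plain GOTPCREL

    # The two opcode bytes become the nop and the direct-branch opcode.
    code[reloc_offset - 2] = NOP
    code[reloc_offset - 1] = direct

    # rel32 is relative to the end of the instruction, which is the
    # end of the displacement field.
    next_insn = section_addr + reloc_offset + 4
    struct.pack_into("<i", code, reloc_offset, target_addr - next_insn)
    return True


# call *foo@GOTRELAX(%rip) at 0x401000, with foo defined at 0x401100
code = bytearray(b"\xff\x15\x00\x00\x00\x00")
relax_gotpcrel(code, 2, 0x401000, 0x401100)
print(code.hex())  # 90e8fa000000
```

When the opcode check fails, or foo is not locally defined, the sketch leaves the bytes untouched, which corresponds to rule 3: the relocation is processed like R_X86_64_GOTPCREL.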