On Sat, May 16, 2015 at 11:59:56AM -0700, H.J. Lu wrote:
> On Sat, May 16, 2015 at 7:19 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
> > On Fri, May 15, 2015 at 4:49 PM, Rich Felker <dal...@libc.org> wrote:
> >> On Fri, May 15, 2015 at 04:34:57PM -0700, H.J. Lu wrote:
> >>> On Fri, May 15, 2015 at 4:30 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> >>> > On Fri, May 15, 2015 at 4:14 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> >>> >> My relax branch proposal works even without LTO.
> >>> >>
> >>> >
> >>> > I will borrow GOTPCREL from x86-64 and do
> >>> >
> >>> > [hjl@gnu-6 relax-4]$ cat b.S
> >>> > call *foo@GOTPCREL(%eax)
> >>>
> >>> call *foo@GOTPLT(%eax)
> >>>
> >>> is a better choice.
> >>
> >> foo@GOTPCREL is preferable (but does not yet exist for ia32, so the
> >> reloc type would have to be added) since it saves a useless add.
> >> Instead of:
> >>
> >> call __x86.get_pc_thunk.ax
> >> addl $_GLOBAL_OFFSET_TABLE_, %eax
> >> call *foo@GOTPLT(%eax)
> >>
> >> you can just do:
> >>
> >> call __x86.get_pc_thunk.ax
> >> call *foo@GOTPCREL(%eax)
> >>
> >> Note that it also works to have extra instructions between:
> >>
> >> call __x86.get_pc_thunk.ax
> >> 1: ...
> >> call *foo@GOTPCREL+(1b-.)(%eax)
> >>
> >> I may not have gotten the syntax quite right, but hopefully you get
> >> the idea. This same approach (with GOTPCREL) can be used for _all_ GOT
> >> accesses, including global data, to eliminate the useless add.
> >>
> >
> > This is a good idea. But I'd like to use something for both i386 and
> > x86-64. I am proposing
> >
> > call/jmp *foo@GOTPCRELAX+addend(%reg)
> >
> > It is similar to @GOTPCREL, but with a new relax relocation. Before
> > I can do that, I need to fix
>
> It doesn't work. REG must hold GOT base for other GOT relocations.
> We need to keep
>
> addl $_GLOBAL_OFFSET_TABLE_, %eax

Like I just said, all foo@GOT(%gotreg) can be replaced with
foo@GOTPCREL+[label-.](%labelreg) where %labelreg is a register
pointing to the referenced label (the point at which the program
counter was saved). This is a minor but useful optimization that can
be made for all GOT accesses, not just ones for [relaxable] function
calls.

Rich
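
To make the arithmetic behind this concrete, here is a rough Python model, not assembler input. It only checks that a register holding the saved label address plus a label-relative displacement reaches the same GOT slot as a register holding the GOT base plus the slot offset. All addresses, the load bias, and the function names are made up for illustration, and the model assumes the proposed relocation resolves to "slot address minus label address"; the exact addend syntax and sign convention are left open, as they are in the thread.

```python
# Toy arithmetic model; every address below is made up for illustration.
GOT_BASE = 0x3000   # hypothetical link-time address of _GLOBAL_OFFSET_TABLE_
FOO_SLOT = 0x10     # hypothetical offset of foo's slot within the GOT
LABEL = 0x1005      # hypothetical link-time address of the label the thunk returns to


def via_got_base(label_reg):
    """Thunk call, then addl $_GLOBAL_OFFSET_TABLE_, then foo@GOT(%gotreg)."""
    got_reg = label_reg + (GOT_BASE - LABEL)  # the addl: adds a label-relative constant
    return got_reg + FOO_SLOT                 # foo@GOT: slot offset from the GOT base


def via_label(label_reg):
    """Thunk call, then a label-relative GOT displacement, with no addl."""
    disp = (GOT_BASE + FOO_SLOT) - LABEL      # assumed to be resolved at link time
    return label_reg + disp


def slot_addresses(load_bias):
    """Return both computed slot addresses for a given (made-up) load bias."""
    label_reg = LABEL + load_bias             # what the thunk leaves in the register
    return via_got_base(label_reg), via_label(label_reg)
```

Both functions return the same address for any load bias, because the constant that the addl contributes at run time is folded into the displacement at link time. Under that assumption the register never needs to hold the GOT base, for data accesses as well as calls.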