On Sat, May 16, 2015 at 11:59:56AM -0700, H.J. Lu wrote:
> On Sat, May 16, 2015 at 7:19 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
> > On Fri, May 15, 2015 at 4:49 PM, Rich Felker <dal...@libc.org> wrote:
> >> On Fri, May 15, 2015 at 04:34:57PM -0700, H.J. Lu wrote:
> >>> On Fri, May 15, 2015 at 4:30 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> >>> > On Fri, May 15, 2015 at 4:14 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> >>> >> My relax branch proposal works even without LTO.
> >>> >>
> >>> >
> >>> > I will borrow GOTPCREL from x86-64 and do
> >>> >
> >>> > [hjl@gnu-6 relax-4]$ cat b.S
> >>> > call *foo@GOTPCREL(%eax)
> >>>
> >>> call *foo@GOTPLT(%eax)
> >>>
> >>> is a better choice.
> >>
> >> foo@GOTPCREL is preferable (but does not yet exist for ia32, so the
> >> reloc type would have to be added) since it saves a useless add.
> >> Instead of:
> >>
> >>         call __x86.get_pc_thunk.ax
> >>         addl $_GLOBAL_OFFSET_TABLE_, %eax
> >>         call *foo@GOTPLT(%eax)
> >>
> >> you can just do:
> >>
> >>         call __x86.get_pc_thunk.ax
> >>         call *foo@GOTPCREL(%eax)
> >>
> >> Note that it also works to have extra instructions between:
> >>
> >>         call __x86.get_pc_thunk.ax
> >> 1:      ...
> >>         call *foo@GOTPCREL+(1b-.)(%eax)
> >>
> >> I may not have gotten the syntax quite right, but hopefully yoy get
> >> the idea. This same approach (with GOTPCREL) can be used for _all_ GOT
> >> accesses, including global data, to eliminate the useless add.
> >>
> >
> > This is a good idea.  But I'd like to use something for both i386 and
> > x86-64.  I am proposing
> >
> > call/jmp *foo@GOTPCRELAX+addend(%reg)
> >
> > It is similar to @GOTPCREL, but with a new relax relocation.  Before
> > I can do that, I need to fix
> 
> It doesn't work.  REG must hold GOT base for other GOT relocations.
> We need to keep
> 
> addl $_GLOBAL_OFFSET_TABLE_, %eax

Like I just said, all foo@GOT(%gotreg) can be replaced with
foo@GOTPCREL+[label-.](%labelreg) where %labelreg is a register
pointing to the referenced label (the point at which the program
counter was saved). This is a minor but useful optimization that can
be made for all GOT accesses, not just ones for [relaxable] function
calls.

Rich

Reply via email to