Applied, thanks!

Sergey Bugaev, on Sat. 29 April 2023 23:18:21 +0300, wrote:
> Normally, in static builds, the first code that runs is _start, in e.g.
> sysdeps/x86_64/start.S, which quickly calls __libc_start_main, passing
> it the argv etc. Among the first things __libc_start_main does is
> initializing the tunables (based on env), then CPU features, and then
> calls _dl_relocate_static_pie (). Specifically, this runs ifunc
> resolvers to pick, based on the CPU features discovered earlier, the
> most suitable implementation of "string" functions such as memcpy.
>
> Before that point, calling memcpy (or other ifunc-resolved functions)
> will not work.
>
> In the Hurd port, things are more complex. In order to get argv/env for
> our process, glibc normally needs to do an RPC to the exec server,
> unless our args/env are already located on the stack (which is what
> happens to bootstrap processes spawned by GNU Mach). Fetching our
> argv/env from the exec server has to be done before the call to
> __libc_start_main, since we need to know what our argv/env are to pass
> them to __libc_start_main.
>
> On the other hand, the implementation of the RPC (and other initial
> setup needed on the Hurd before __libc_start_main can be run) is not
> very trivial. In particular, it may (and on x86_64, will) use memcpy.
> But as described above, calling memcpy before __libc_start_main can not
> work, since the GOT entry for it is not yet initialized at that point.
>
> Work around this by pre-filling the GOT entry with the baseline version
> of memcpy, __memcpy_sse2_unaligned. This makes it possible for early
> calls to memcpy to just work. The initial value of the GOT entry is
> unused on x86_64, and changing it won't interfere with the relocation
> being performed later: once _dl_relocate_static_pie () is called, the
> baseline version will get replaced with the most suitable one, and that
> is what subsequent calls of memcpy are going to call.
>
> Checked on x86_64-gnu.
>
> Signed-off-by: Sergey Bugaev <buga...@gmail.com>
> ---
> Changes since v1:
> - drop the stpncpy, since it's apparently not required during early
>   startup;
> - as a result of the above, there are no longer any changes to the
>   i386 version;
> - drop the PIC/non-PIC split, we can always use %rip-relative addressing
>   on x86_64;
> - as mentioned somewhere in the v1 thread, I have, since posting the v1,
>   actually gone and checked that the relocations do work and the proper,
>   more efficient memcpy version does get installed into the GOT slot and
>   invoked whenever anything calls memcpy;
> - convinced myself that this is not a terrible hack but rather an OK
>   solution;
> - worked out how this would be done on an architecture that (like i386,
>   unlike x86_64) does need the original value in the GOT to perform the
>   relocation, but (unlike i386, like x86_64) still uses an ifunc-selected
>   memcpy in static builds: namely, we'd simply put the original ifunc
>   address back into the GOT slot a few lines below, after the call to
>   _hurd_stack_setup.
>
>  sysdeps/mach/hurd/x86_64/static-start.S | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/sysdeps/mach/hurd/x86_64/static-start.S b/sysdeps/mach/hurd/x86_64/static-start.S
> index 982d3d52..cc8e2410 100644
> --- a/sysdeps/mach/hurd/x86_64/static-start.S
> +++ b/sysdeps/mach/hurd/x86_64/static-start.S
> @@ -19,6 +19,9 @@
>  	.text
>  	.globl _start
>  _start:
> +
> +	leaq __memcpy_sse2_unaligned(%rip), %rax
> +	movq %rax, memcpy@GOTPCREL(%rip)
>  	call _hurd_stack_setup
>  	xorq %rdx, %rdx
>  	jmp _start1
> --
> 2.40.1
>
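
For readers following the thread, the workaround in the quoted patch can be modeled in plain C as a function-pointer slot that starts out pointing at a baseline routine and is later overwritten with a better one. This is only a sketch under stated assumptions, not glibc code: copy_slot stands in for the memcpy GOT entry, baseline_copy for __memcpy_sse2_unaligned, relocate for the ifunc relocation done by _dl_relocate_static_pie (), and early_setup for _hurd_stack_setup. All of these names, and run_startup_demo, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Counters so the demo can tell which implementation was used. */
static int baseline_calls, tuned_calls;

/* Hypothetical baseline routine: works on any CPU, needs no setup. */
static void *baseline_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    baseline_calls++;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical "most suitable" routine; here it just defers to libc. */
static void *tuned_copy(void *dst, const void *src, size_t n)
{
    tuned_calls++;
    return memcpy(dst, src, n);
}

/* Models the GOT entry. Pre-filling it with the baseline is the
   workaround: calls made before relocation have a valid target. */
static copy_fn copy_slot = baseline_copy;

/* Models early setup that copies data before relocation has run. */
static void early_setup(char *dst, const char *src, size_t n)
{
    copy_slot(dst, src, n);
}

/* Models the later relocation: the slot is simply overwritten, so the
   pre-filled value does not interfere with it. */
static void relocate(void)
{
    copy_slot = tuned_copy;
}

/* Returns 0 if the early call used the baseline and the late call used
   the tuned routine, and both copies produced the expected bytes. */
int run_startup_demo(void)
{
    char early[8], late[8];

    baseline_calls = tuned_calls = 0;
    copy_slot = baseline_copy;

    early_setup(early, "argv", 5);
    relocate();
    copy_slot(late, "main", 5);

    return (strcmp(early, "argv") == 0 && strcmp(late, "main") == 0
            && baseline_calls == 1 && tuned_calls == 1) ? 0 : 1;
}
```

The point of the model is that the slot is written twice: once by hand before anything runs, once by the relocation step, and the second write wins for all later callers.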
--
Samuel
---
For an independent, transparent and rigorous evaluation!
I support the Inria Evaluation Committee.