On Tue, Feb 13, 2024 at 10:42:52AM +0100, Jakub Jelinek wrote:
> On Sat, Feb 10, 2024 at 10:05:34AM -0800, H.J. Lu wrote:
> > > I bet it probably doesn't work properly for -mx32 (which defines
> > > __x86_64__), CCing H.J. on that, but that is a preexisting issue
> > > (and I don't have any experience with it; I guess one would either
> > > need to add 4 bytes of padding after the func_ptr so that those
> > > bits remain zeros as sizeof (void *) is 4, but presumably it would be
> > > better to just use movl (but into %r10) and maybe the jmpl instead
> > > of movabsq.
> >
> > Are there any testcases to exercise this code on Linux?
>
> Here is an untested attempt to implement it for -mx32 (well, I've compiled
> it with -mx32 in libgcc by hand after stubbing
> /usr/include/gnu/stubs-x32.h).
>
> Testcase could be something like:
>
> /* { dg-do run } */
> /* { dg-options "-ftrampoline-impl=heap" } */
>
> __attribute__((noipa)) int
> bar (int (*fn) (int))
> {
>   return fn (42) + 1;
> }
>
> int
> main ()
> {
>   int a = 0;
>   int foo (int x) { if (x != 42) __builtin_abort (); return ++a; }
>   if (bar (foo) != 2 || a != 1)
>     __builtin_abort ();
>   if (bar (foo) != 3 || a != 2)
>     __builtin_abort ();
>   a = 42;
>   if (bar (foo) != 44 || a != 43)
>     __builtin_abort ();
>   return 0;
> }
> but I must say I'm also surprised we have no tests for this in the
> testsuite.  Sure, we'd also need to add some effective target whether
> -ftrampoline-impl=heap can be used for a link/runtime test or not.
>
> 2024-02-13  Jakub Jelinek  <ja...@redhat.com>
>
> 	PR target/113855
> 	* config/i386/heap-trampoline.c (trampoline_insns): Use movabsq
> 	instead of movabs in comments.  Add -mx32 variant.
>

It works on x32.  I modified your patch to add IBT support and pad the
trampoline to the multiple of 4 bytes.

Thanks.

H.J.
---
2024-02-13  Jakub Jelinek  <ja...@redhat.com>
	    H.J. Lu  <hjl.to...@gmail.com>

	PR target/113855
	* config/i386/heap-trampoline.c (trampoline_insns): Add IBT
	support and pad to the multiple of 4 bytes.  Use movabsq
	instead of movabs in comments.  Add -mx32 variant.
---
 libgcc/config/i386/heap-trampoline.c | 42 ++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/libgcc/config/i386/heap-trampoline.c b/libgcc/config/i386/heap-trampoline.c
index 1df0aa06108..a8637dc92d3 100644
--- a/libgcc/config/i386/heap-trampoline.c
+++ b/libgcc/config/i386/heap-trampoline.c
@@ -30,28 +30,64 @@ void __gcc_nested_func_ptr_created (void *chain, void *func, void *dst);
 void __gcc_nested_func_ptr_deleted (void);
 
 #if __x86_64__
+
+#ifdef __LP64__
 static const uint8_t trampoline_insns[] = {
-  /* movabs $<func>,%r11  */
+#if defined __CET__ && (__CET__ & 1) != 0
+  /* endbr64.  */
+  0xf3, 0x0f, 0x1e, 0xfa,
+#endif
+
+  /* movabsq $<func>,%r11  */
   0x49, 0xbb,
   0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00,
 
-  /* movabs $<chain>,%r10  */
+  /* movabsq $<chain>,%r10  */
   0x49, 0xba,
   0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00,
 
   /* rex.WB jmpq *%r11  */
-  0x41, 0xff, 0xe3
+  0x41, 0xff, 0xe3,
+
+  /* Pad to the multiple of 4 bytes.  */
+  0x90
 };
+#else
+static const uint8_t trampoline_insns[] = {
+#if defined __CET__ && (__CET__ & 1) != 0
+  /* endbr64.  */
+  0xf3, 0x0f, 0x1e, 0xfa,
+#endif
+
+  /* movl $<func>,%r11d  */
+  0x41, 0xbb,
+  0x00, 0x00, 0x00, 0x00,
+
+  /* movl $<chain>,%r10d  */
+  0x41, 0xba,
+  0x00, 0x00, 0x00, 0x00,
+
+  /* rex.WB jmpq *%r11  */
+  0x41, 0xff, 0xe3,
+
+  /* Pad to the multiple of 4 bytes.  */
+  0x90
+};
+#endif
 
 union ix86_trampoline {
   uint8_t insns[sizeof(trampoline_insns)];
 
   struct __attribute__((packed)) fields {
+#if defined __CET__ && (__CET__ & 1) != 0
+    uint8_t endbr64[4];
+#endif
     uint8_t insn_0[2];
     void *func_ptr;
     uint8_t insn_1[2];
     void *chain_ptr;
     uint8_t insn_2[3];
+    uint8_t pad;
   } fields;
 };
-- 
2.43.0
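
For anyone who wants to sanity-check the layout arithmetic outside libgcc, here is a rough compile-time sketch.  It is not part of the patch.  The struct names lp64_fields and x32_fields are made up for illustration, fixed-width integers stand in for void * so that both pointer sizes can be checked on any host, and the optional endbr64 prefix is left out (with __CET__ it would add 4 bytes to every size and offset).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the packed "fields" struct in the patch.
   uint64_t/uint32_t replace void * so the check is host-independent.  */
struct __attribute__((packed)) lp64_fields {
  uint8_t insn_0[2];   /* 0x49, 0xbb: movabsq opcode bytes  */
  uint64_t func_ptr;   /* 8-byte immediate  */
  uint8_t insn_1[2];   /* 0x49, 0xba  */
  uint64_t chain_ptr;  /* 8-byte immediate  */
  uint8_t insn_2[3];   /* 0x41, 0xff, 0xe3: indirect jump via %r11  */
  uint8_t pad;         /* 0x90  */
};

struct __attribute__((packed)) x32_fields {
  uint8_t insn_0[2];   /* 0x41, 0xbb: movl opcode bytes  */
  uint32_t func_ptr;   /* 4-byte immediate  */
  uint8_t insn_1[2];   /* 0x41, 0xba  */
  uint32_t chain_ptr;  /* 4-byte immediate  */
  uint8_t insn_2[3];   /* 0x41, 0xff, 0xe3  */
  uint8_t pad;         /* 0x90  */
};

/* The one-byte pad brings each variant to a multiple of 4 bytes.  */
_Static_assert (sizeof (struct lp64_fields) == 24, "LP64 trampoline size");
_Static_assert (sizeof (struct x32_fields) == 16, "x32 trampoline size");
_Static_assert (sizeof (struct lp64_fields) % 4 == 0, "LP64 padding");
_Static_assert (sizeof (struct x32_fields) % 4 == 0, "x32 padding");

/* Each immediate sits directly after its two opcode bytes.  */
_Static_assert (offsetof (struct lp64_fields, func_ptr) == 2, "LP64 func");
_Static_assert (offsetof (struct lp64_fields, chain_ptr) == 12, "LP64 chain");
_Static_assert (offsetof (struct x32_fields, func_ptr) == 2, "x32 func");
_Static_assert (offsetof (struct x32_fields, chain_ptr) == 8, "x32 chain");
```

Compiling this with something like gcc -std=c11 -c is enough; if any size or offset were off, the build would fail at the corresponding assertion.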