Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc
Rich Felker dixit:

>Is there anything weird about how these objects were declared that
>might have caused ld not to resolve them statically like it should? It
>seems odd that these data symbols, but not any other ones, would be
>left as symbolic relocations.

I don’t think so? I already posted the short version earlier; the
actual source is (mirrored):

The initcoms array is here:
https://github.com/MirBSD/mksh/blob/b0219da8e6dfc7b16e923e220dc6933c5ed9b326/main.c#L77

Tdr is defined at:
https://github.com/MirBSD/mksh/blob/b0219da8e6dfc7b16e923e220dc6933c5ed9b326/sh.h#L3055

The u_ops array is declared a few lines above that and defined at:
https://github.com/MirBSD/mksh/blob/b0219da8e6dfc7b16e923e220dc6933c5ed9b326/funcs.c#L160

initvsn is defined at…
https://github.com/MirBSD/mksh/blob/b0219da8e6dfc7b16e923e220dc6933c5ed9b326/sh.h#L713
… with the EXTERN and E_INIT macros from…
https://github.com/MirBSD/mksh/blob/b0219da8e6dfc7b16e923e220dc6933c5ed9b326/sh.h#L657
where main.c defines EXTERN, so the string is embedded into the file
using it.

Is there perhaps a misunderstanding with the gcc/binutils/glibc
developers as to what static-pie is meant to be?

bye,
//mirabilos
-- 
cool, an Ada Lovelace Google Doodle. But for the 197th birthday?
Couldn't they have waited another 3 years?
by then Google won't exist any more
yes, one might think so. Probably the end of the world announced in the
Maya calendar is the global shutdown of Google ☺ and that's why they
have to push the doodles out beforehand
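For readers who do not want to follow the links, the declaration pattern described above can be sketched roughly as follows. This is a simplified illustration under assumptions, not a copy of sh.h or main.c: the names demo_initvsn and demo_initcoms and the placeholder string are hypothetical, and the macro bodies only show the common "define EXTERN in exactly one file" idiom.

```c
#include <stddef.h>
#include <string.h>

/* Emulates main.c, which defines EXTERN before including the header. */
#define EXTERN

/* Header part: with EXTERN predefined, objects are defined and initialised
 * here; every other file would only see an extern declaration. */
#ifdef EXTERN
#define E_INIT(i) = i
#else
#define E_INIT(i)
#define EXTERN extern
#endif

/* Hypothetical stand-in for initvsn: a string object with its own symbol. */
EXTERN const char demo_initvsn[] E_INIT("DEMO_VERSION=placeholder");

/* Hypothetical stand-in for initcoms: one element stores the address of
 * another object. In position-independent output that slot cannot hold a
 * final address at link time, so it needs a load-time relocation. */
const char *const demo_initcoms[] = {
    "typeset", "-r", demo_initvsn, NULL
};
```

Nothing in this pattern is unusual C; whether the slot for the string ends up as a relative or a symbolic relocation is decided by the toolchain, not by the declaration.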
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc
On Fri, Apr 05, 2024 at 05:04:37AM +, Thorsten Glaser wrote:
> Markus Wichmann dixit:
>
> >can check with readelf -r what the relocation types are. If they are not
> >relative, they will not be processed.
>
> Gotcha! They are all R_390_RELATIVE except for:
>
> 00045ff0 00110016 R_390_64 00042c58 u_ops + 70
> 00045ff8 00110016 R_390_64 00042c58 u_ops + 0
> 00047020 00110016 R_390_64 00042c58 u_ops + 80
> 00047088 00110016 R_390_64 00042c58 u_ops + 80
> 000470a8 00110016 R_390_64 00042c58 u_ops + b8
> 00047220 00110016 R_390_64 00042c58 u_ops + 80
> 00046900 00260016 R_390_64 00015af8 c_command + 0
> 00046940 00070016 R_390_64 00017238 c_exec + 0
> 00046ab0 00200016 R_390_64 00016a80 c_trap + 0
> 00047090 00250016 R_390_64 000430ac initvsn + 0
> 00047278 00550016 R_390_64 00047438 null_string + 2
>
> That’s our missing strings.

Is there anything weird about how these objects were declared that
might have caused ld not to resolve them statically like it should? It
seems odd that these data symbols, but not any other ones, would be
left as symbolic relocations.

Rich
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc
* Thorsten Glaser [2024-04-05 05:04:37 +]:
> Markus Wichmann dixit:
>
> >can check with readelf -r what the relocation types are. If they are not
> >relative, they will not be processed.
>
> Gotcha! They are all R_390_RELATIVE except for:
>
> 00045ff0 00110016 R_390_64 00042c58 u_ops + 70
> 00045ff8 00110016 R_390_64 00042c58 u_ops + 0
> 00047020 00110016 R_390_64 00042c58 u_ops + 80
> 00047088 00110016 R_390_64 00042c58 u_ops + 80
> 000470a8 00110016 R_390_64 00042c58 u_ops + b8
> 00047220 00110016 R_390_64 00042c58 u_ops + 80
> 00046900 00260016 R_390_64 00015af8 c_command + 0
> 00046940 00070016 R_390_64 00017238 c_exec + 0
> 00046ab0 00200016 R_390_64 00016a80 c_trap + 0
> 00047090 00250016 R_390_64 000430ac initvsn + 0
> 00047278 00550016 R_390_64 00047438 null_string + 2
>
> That’s our missing strings.

this is not correct static pie. glibc handles symbolic
relocs, but there should not be any non-local symbol
in a static exe. you may want to check the symbol table.

so s390 does not support static pie.

(arguably the elf is correct, if you expect a full dynlinker
in a static pie, but even then it's bad quality linker output)
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc
On Fri, Apr 05, 2024 at 05:58:15AM +, Thorsten Glaser wrote:
> Markus Wichmann dixit:
> >In any case, the emission of non-relative relocations is the issue here,
> >and it is coming from the linker.
>
> They are present in the glibc static-pie binary as well, though.
> And tbh they look to me like “just plug the absolute address of
> the symbol here, please”, which is perfectly fine for things like
> an array of strings when the actual string has already its own symbol.
>
> (Disclaimer: I know… barely anything about Unix relocation types,
> a bit more about those on DOS and even TOS.)
>

Then glibc's static-pie startup code also processes symbolic
relocations. musl's doesn't. It only processes relative relocations.
And changing this would require some massive reworking. We'd somehow
have to put stage 2 of the dynamic linker into rcrt1.o.

A symbolic lookup doesn't really make sense for a static executable
outside of FDPIC. The only difference in address space possible is a
relative offset. In order to do a symbolic relocation, you also need the
symbol lookup stuff, which - granted - for a static PIE is probably very
simple because there can be only one symbol table, but still. I thought
the whole point of static-PIE support was to only leave relative
relocations around.

Ciao,
Markus
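The difference between the two relocation kinds discussed above can be made concrete with a small sketch. It is only an illustration under assumptions: struct demo_sym and both functions are hypothetical simplifications, not code from musl or glibc. The u_ops value 0x42c58 and addend 0x70 are taken from the readelf output quoted in this thread; the load base is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol table entry (real code: Elf64_Sym). */
struct demo_sym {
    const char *name;
    uint64_t value;   /* link-time address of the symbol */
};

/* Relative relocation: only the load base and the addend are needed. */
uint64_t demo_resolve_relative(uint64_t base, int64_t addend)
{
    return base + (uint64_t)addend;
}

/* Symbolic relocation: the startup code additionally needs a symbol
 * table and a lookup step before it can compute the value. */
uint64_t demo_resolve_symbolic(uint64_t base, const struct demo_sym *symtab,
                               size_t nsyms, size_t index, int64_t addend)
{
    assert(index < nsyms);
    return base + symtab[index].value + (uint64_t)addend;
}
```

In a static executable the symbol's link-time value is already known to the linker, so "u_ops + 70" could equally have been emitted as a relative relocation with addend 0x42c58 + 0x70 = 0x42cc8; both forms then yield the same run-time address for any load base.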
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc
Markus Wichmann dixit:

>I may not really know what I am talking about, so take this with a grain
>of salt, but isn't this missing a -Bsymbolic somewhere? Ironically, that
>switch causes ld to not emit symbolic relocations. I seem to remember
>reading long ago in Rich's initial -static-pie proposal that that was
>one of the switches added to the linker command line.

When searching for which architectures support static PIE in the
first place (sadly, there doesn’t seem to be a consistent list), I found
one source saying it’s no longer necessary after some point, so I didn’t
check it.

>In any case, the emission of non-relative relocations is the issue here,
>and it is coming from the linker.

They are present in the glibc static-pie binary as well, though.
And tbh they look to me like “just plug the absolute address of
the symbol here, please”, which is perfectly fine for things like
an array of strings when the actual string has already its own symbol.

(Disclaimer: I know… barely anything about Unix relocation types,
a bit more about those on DOS and even TOS.)

bye,
//mirabilos
-- 
When he found out that the m68k port was in a pretty bad shape, he did
not, like many before him, shrug and move on; instead, he took it upon
himself to start compiling things, just so he could compile his shell.
How's that for dedication. -- Wouter, about my Debian/m68k revival
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc
On Fri, Apr 05, 2024 at 05:04:37AM +, Thorsten Glaser wrote:
> Should be correct:
>
> /usr/libexec/gcc/s390x-linux-gnu/13/collect2 -fno-lto -dynamic-linker
> /lib/ld-musl-s390x.so.1 -nostdlib -static -static -pie --no-dynamic-linker -o
> mksh /usr/lib/s390x-linux-musl/rcrt1.o /usr/lib/s390x-linux-musl/crti.o
> /usr/lib/gcc/s390x-linux-gnu/13/crtbeginS.o -L/usr/lib/s390x-linux-musl -L
> /usr/lib/gcc/s390x-linux-gnu/13/. -z relro -z now --as-needed -z text
> --eh-frame-hdr lalloc.o edit.o eval.o exec.o expr.o funcs.o histrap.o jobs.o
> lex.o main.o misc.o shf.o syn.o tree.o var.o ulimit.o --start-group
> /usr/lib/gcc/s390x-linux-gnu/13/libgcc.a
> /usr/lib/gcc/s390x-linux-gnu/13/libgcc_eh.a -lc --end-group
> /usr/lib/gcc/s390x-linux-gnu/13/crtendS.o /usr/lib/s390x-linux-musl/crtn.o
>
> HTH & HAND,
> //mirabilos

I may not really know what I am talking about, so take this with a grain
of salt, but isn't this missing a -Bsymbolic somewhere? Ironically, that
switch causes ld to not emit symbolic relocations. I seem to remember
reading long ago in Rich's initial -static-pie proposal that that was
one of the switches added to the linker command line.

In any case, the emission of non-relative relocations is the issue here,
and it is coming from the linker.

Ciao,
Markus
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc
Markus Wichmann dixit:

>can check with readelf -r what the relocation types are. If they are not
>relative, they will not be processed.

Gotcha! They are all R_390_RELATIVE except for:

00045ff0 00110016 R_390_64 00042c58 u_ops + 70
00045ff8 00110016 R_390_64 00042c58 u_ops + 0
00047020 00110016 R_390_64 00042c58 u_ops + 80
00047088 00110016 R_390_64 00042c58 u_ops + 80
000470a8 00110016 R_390_64 00042c58 u_ops + b8
00047220 00110016 R_390_64 00042c58 u_ops + 80
00046900 00260016 R_390_64 00015af8 c_command + 0
00046940 00070016 R_390_64 00017238 c_exec + 0
00046ab0 00200016 R_390_64 00016a80 c_trap + 0
00047090 00250016 R_390_64 000430ac initvsn + 0
00047278 00550016 R_390_64 00047438 null_string + 2

That’s our missing strings.

>Is it possible you are linking in the wrong start file? gcc -v should
>output the command line it feeds to the linker.

Should be correct:

 /usr/libexec/gcc/s390x-linux-gnu/13/collect2 -fno-lto -dynamic-linker
 /lib/ld-musl-s390x.so.1 -nostdlib -static -static -pie --no-dynamic-linker -o
 mksh /usr/lib/s390x-linux-musl/rcrt1.o /usr/lib/s390x-linux-musl/crti.o
 /usr/lib/gcc/s390x-linux-gnu/13/crtbeginS.o -L/usr/lib/s390x-linux-musl -L
 /usr/lib/gcc/s390x-linux-gnu/13/. -z relro -z now --as-needed -z text
 --eh-frame-hdr lalloc.o edit.o eval.o exec.o expr.o funcs.o histrap.o jobs.o
 lex.o main.o misc.o shf.o syn.o tree.o var.o ulimit.o --start-group
 /usr/lib/gcc/s390x-linux-gnu/13/libgcc.a
 /usr/lib/gcc/s390x-linux-gnu/13/libgcc_eh.a -lc --end-group
 /usr/lib/gcc/s390x-linux-gnu/13/crtendS.o /usr/lib/s390x-linux-musl/crtn.o

HTH & HAND,
//mirabilos
-- 
“Cool, /usr/share/doc/mksh/examples/uhr.gz is a reason to install
mksh on every system.”
	-- XTaran at OpenRheinRuhr, quite enthusiastic
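The manual check done above (scan the readelf -r listing for anything that is not R_390_RELATIVE) could be automated along the following lines. This is a rough sketch under assumptions: the struct and function names are hypothetical, only the first three columns of a relocation line are parsed, and the exact-name comparison is specific to s390x.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the leading columns of one readelf -r line. */
struct demo_reloc_line {
    unsigned long long offset;
    unsigned long long info;
    char type[32];
};

/* Parse "offset info type ..." as printed in the listing above.
 * Returns 1 on success, 0 for headers and other non-relocation lines. */
int demo_parse_reloc_line(const char *line, struct demo_reloc_line *out)
{
    return sscanf(line, "%llx %llx %31s",
                  &out->offset, &out->info, out->type) == 3;
}

/* A startup file that handles only relative relocations would leave
 * every entry for which this returns nonzero unprocessed. */
int demo_is_non_relative(const struct demo_reloc_line *r)
{
    return strcmp(r->type, "R_390_RELATIVE") != 0;
}
```

Feeding each line of the readelf output through demo_parse_reloc_line and printing those where demo_is_non_relative is true would reproduce a list like the one in the message above.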
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc
Hi,

in static-pie, relocations get processed in _start, before main() is
called. In musl, this is done by linking with rcrt1.o as start file
instead of crt1.o. And that file processes all relative relocations. You
can check with readelf -r what the relocation types are. If they are not
relative, they will not be processed.

What you are seeing seems indicative of missing relocation processing.
Is it possible you are linking in the wrong start file? gcc -v should
output the command line it feeds to the linker.

Ciao,
Markus
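To illustrate why non-relative entries end up as "missing" data, here is a minimal sketch of a startup loop that applies only relative relocations. It is an assumption-laden simplification and not musl's actual rcrt1 code: struct demo_rela, demo_apply_relative and the byte-buffer "image" are hypothetical, and only the two type numbers (12 for R_390_RELATIVE, 22 for R_390_64) are taken from the s390 ELF definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Relocation type numbers as defined for s390 in <elf.h>. */
#define DEMO_R_390_RELATIVE 12
#define DEMO_R_390_64       22

/* Hypothetical simplified relocation entry (real code: Elf64_Rela). */
struct demo_rela {
    uint64_t offset;   /* slot to patch, relative to the image start */
    uint32_t type;     /* relocation type */
    uint32_t sym;      /* symbol index, ignored for relative entries */
    int64_t  addend;
};

/* Apply relative relocations only; return how many entries were skipped
 * because they would need a symbol lookup. Skipped slots keep whatever
 * the linker left in them. */
size_t demo_apply_relative(unsigned char *image, size_t image_size,
                           uint64_t base,
                           const struct demo_rela *rel, size_t count)
{
    size_t skipped = 0;

    for (size_t i = 0; i < count; i++) {
        if (rel[i].type != DEMO_R_390_RELATIVE) {
            skipped++;
            continue;
        }
        assert(image_size >= sizeof(uint64_t));
        assert(rel[i].offset <= image_size - sizeof(uint64_t));
        uint64_t value = base + (uint64_t)rel[i].addend;
        memcpy(image + rel[i].offset, &value, sizeof value);
    }
    return skipped;
}
```

With such a loop, a pointer slot covered by an R_390_64 entry is never patched, which would be consistent with the kind of runtime problem reported in this bug.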
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie
On Thu, Apr 04, 2024 at 07:50:40PM +, Thorsten Glaser wrote:
> Szabolcs Nagy dixit:
>
> >the next culprit is gcc (each target can have their own
>
> gcc-13_13.2.0-23
>
> >static pie specs) or the way you invoked gcc (not visible
>
> As I wrote earlier, though with more flags. Dropping all the -D…
> and -W… and -I… and other irrelevant ones:
>
> musl-gcc -Os -g -fPIE -fno-lto -fno-asynchronous-unwind-tables
>   -fno-strict-aliasing -fstack-protector-strong -fwrapv -c …
> musl-gcc -Os -g -fPIE -fno-lto -fno-asynchronous-unwind-tables
>   -fno-strict-aliasing -fstack-protector-strong -fwrapv
>   -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -static -static-pie
>   -fno-lto -o mksh *.o
>
> Same for both. You can see the full log by activating the
> [64]Installed and [71]Installed links respectively on
> https://buildd.debian.org/status/package.php?p=mksh and
> skipping to 'compilation of mksh in static-musl' to get to
> the beginning of the configure phase for that.
>
> >are you sure static pie works on these targets?
>
> No ;-) That’s why I reported this issue. I had just
> enabled it for the musl builds, as the security people
> like that more than normal static.

I seem to recall the musl-gcc wrapper does not handle static-pie
right. A real cross toolchain should. If there's an easy fix for the
wrapper I'd be happy to merge it.

Rich
Bug#1068350: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie
Rich Felker dixit:

>I seem to recall the musl-gcc wrapper does not handle static-pie
>right.

Hmm. In what way? And it does seem to work fine on the other
architectures.

>A real cross toolchain should.

I fear that that’s out of the question for Debian.

I’ve got a github action test setup for mksh though, which also
uses jirutka/setup-alpine to set up chroots of Alpine Linux for
various architectures and uses them to build natively under
qemu-user. I could use that to check static-pie? IIUC, these use
“a real cross toolchain”, if natively; qemu-user adds an extra
potential failure dimension though…

>If there's an easy fix for the wrapper I'd be happy to merge it.

Together with the MIPS fix?

Hmm, actually… I could… test whether that one fixes static-pie
on zelenka. Or at least the same approach. I’ll get back with
a report from that.

bye,
//mirabilos
-- 
I believe no one can invent an algorithm. One just happens to hit upon it
when God enlightens him. Or only God invents algorithms, we merely copy them.
If you don't believe in God, just consider God as Nature if you won't deny
existence. -- Coywolf Qi Hunt