Re: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc

2024-04-04 Thread Markus Wichmann
Am Fri, Apr 05, 2024 at 05:04:37AM + schrieb Thorsten Glaser:
> Should be correct:
>
>  /usr/libexec/gcc/s390x-linux-gnu/13/collect2 -fno-lto -dynamic-linker 
> /lib/ld-musl-s390x.so.1 -nostdlib -static -static -pie --no-dynamic-linker -o 
> mksh /usr/lib/s390x-linux-musl/rcrt1.o /usr/lib/s390x-linux-musl/crti.o 
> /usr/lib/gcc/s390x-linux-gnu/13/crtbeginS.o -L/usr/lib/s390x-linux-musl -L 
> /usr/lib/gcc/s390x-linux-gnu/13/. -z relro -z now --as-needed -z text 
> --eh-frame-hdr lalloc.o edit.o eval.o exec.o expr.o funcs.o histrap.o jobs.o 
> lex.o main.o misc.o shf.o syn.o tree.o var.o ulimit.o --start-group 
> /usr/lib/gcc/s390x-linux-gnu/13/libgcc.a 
> /usr/lib/gcc/s390x-linux-gnu/13/libgcc_eh.a -lc --end-group 
> /usr/lib/gcc/s390x-linux-gnu/13/crtendS.o /usr/lib/s390x-linux-musl/crtn.o
>
> HTH & HAND,
> //mirabilos

I may not really know what I am talking about, so take this with a grain
of salt, but isn't this missing a -Bsymbolic somewhere? Ironically, that
switch causes ld to not emit symbolic relocations. I seem to remember
reading long ago in Rich's initial -static-pie proposal that that was
one of the switches added to the linker command line.

In any case, the emission of non-relative relocations is the issue here,
and it is coming from the linker.

Ciao,
Markus



Re: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc

2024-04-04 Thread Thorsten Glaser
Markus Wichmann dixit:

>can check with readelf -r what the relocation types are. If they are not
>relative, they will not be processed.

Gotcha! They are all R_390_RELATIVE except for:

00045ff0  00110016 R_390_64  00042c58 u_ops + 70
00045ff8  00110016 R_390_64  00042c58 u_ops + 0
00047020  00110016 R_390_64  00042c58 u_ops + 80
00047088  00110016 R_390_64  00042c58 u_ops + 80
000470a8  00110016 R_390_64  00042c58 u_ops + b8
00047220  00110016 R_390_64  00042c58 u_ops + 80
00046900  00260016 R_390_64  00015af8 c_command + 0
00046940  00070016 R_390_64  00017238 c_exec + 0
00046ab0  00200016 R_390_64  00016a80 c_trap + 0
00047090  00250016 R_390_64  000430ac initvsn + 0
00047278  00550016 R_390_64  00047438 null_string + 2

That’s our missing strings.

>Is it possible you are linking in the wrong start file? gcc -v should
>output the command line it feeds to the linker.

Should be correct:

 /usr/libexec/gcc/s390x-linux-gnu/13/collect2 -fno-lto -dynamic-linker 
/lib/ld-musl-s390x.so.1 -nostdlib -static -static -pie --no-dynamic-linker -o 
mksh /usr/lib/s390x-linux-musl/rcrt1.o /usr/lib/s390x-linux-musl/crti.o 
/usr/lib/gcc/s390x-linux-gnu/13/crtbeginS.o -L/usr/lib/s390x-linux-musl -L 
/usr/lib/gcc/s390x-linux-gnu/13/. -z relro -z now --as-needed -z text 
--eh-frame-hdr lalloc.o edit.o eval.o exec.o expr.o funcs.o histrap.o jobs.o 
lex.o main.o misc.o shf.o syn.o tree.o var.o ulimit.o --start-group 
/usr/lib/gcc/s390x-linux-gnu/13/libgcc.a 
/usr/lib/gcc/s390x-linux-gnu/13/libgcc_eh.a -lc --end-group 
/usr/lib/gcc/s390x-linux-gnu/13/crtendS.o /usr/lib/s390x-linux-musl/crtn.o

HTH & HAND,
//mirabilos
-- 
„Cool, /usr/share/doc/mksh/examples/uhr.gz ist ja ein Grund,
mksh auf jedem System zu installieren.“
-- XTaran auf der OpenRheinRuhr, ganz begeistert
(EN: “[…]uhr.gz is a reason to install mksh on every system.”)



Re: [musl] Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc

2024-04-04 Thread Markus Wichmann
Hi,

in static-pie, relocations get processed in _start, before main() is
called. In musl, this is done by linking with rcrt1.o as start file
instead of crt1.o. And that file processes all relative relocations. You
can check with readelf -r what the relocation types are. If they are not
relative, they will not be processed.

What you are seeing seems indicative of missing relocation processing.
Is it possible you are linking in the wrong start file? gcc -v should
output the command line it feeds to the linker.

Ciao,
Markus



Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc

2024-04-04 Thread Thorsten Glaser
Dixi quod…

>Now I (or someone) is going to have to reduce that to a testcase, so

No success with that, unfortunately.

>But this does seem to be a toolchain bug: adding -static-pie to the
>glibc dynamic-pie link command and…
>
>(gdb) print initcoms
>$1 = {0xda494 "typeset", 0x0, 0x0, 0x0, 0xda494 "typeset", 0x0, 0xd942c 
>"HOME", 0xda7d8 "PATH",

Wait, what?

(gdb) b main
Breakpoint 1 at 0xd820: file ../../main.c, line 785.
(gdb) print initcoms
$1 = {0xda494 "typeset", 0x0, 0x0, 0x0, 0xda494 "typeset", 0x0, 0xd942c "HOME", 
0xda7d8 "PATH",
[…]
(gdb) r
Starting program: /home/tg/mksh-59c/builddir/full/mksh

Breakpoint 1, main (argc=1, argv=0x3ffa4d8) at ../../main.c:785
785 {
(gdb) print initcoms
$2 = {0x3fff7eda494 "typeset", 0x3fff7ee4548  "-r",
  0x3fff7ee4ae0  "KSH_VERSION=@(#)MIRBSD KSH R59 2024/02/01 +Debian", 
0x0, 0x3fff7eda494 "typeset",
[…]

While in musl:

(gdb) print initcoms
$1 = {0x414a4 "typeset", 0x0, 0x0, 0x0, 0x414a4 "typeset", 0x0, 0x40478 "HOME", 
0x41d42 "PATH",
[…]
(gdb) r
Starting program: /home/tg/mksh-59c/builddir/static-musl/mksh

Breakpoint 1, main (argc=1, argv=0x3ffa498) at ../../main.c:785
785 {
(gdb) print initcoms
$2 = {0x3fff7fc14a4 "typeset", 0x0, 0x0, 0x0, 0x3fff7fc14a4 "typeset", 0x0, 
0x3fff7fc0478 "HOME",
[…]

So the existing ones did get relocated, but the nullptrs stayed thusly.

Apparently, it *is* supported on glibc on s390x, mjt (qemu maintainer)
also said so in 2023.

bye,
//mirabilos
-- 
22:20⎜ The crazy that persists in his craziness becomes a master
22:21⎜ And the distance between the craziness and geniality is
only measured by the success 18:35⎜ "Psychotics are consistently
inconsistent. The essence of sanity is to be inconsistently inconsistent



Re: Bug#1068350: musl: miscompiles (runtime problems) on riscv64 and s390x with static-pie → seems to be a toolchain bug after all, it does too hit glibc

2024-04-04 Thread Thorsten Glaser
Dixi quod…

>Hmm, actually… I could… test whether that one fixes static-pie
>on zelenka. Or at least the same approach. I’ll get back with
>report from that.

Having looked at the spec file, the only extra things the stock
specs do that the overriding specs don’t is:

*link:
[…] %{!static|static-pie:--eh-frame-hdr} […] %{static-pie:-static -pie 
--no-dynamic-linker -z text} […]

instead of:

[…] %{static-pie:-static -pie --no-dynamic-linker} […]

The -Wl,-z,text makes TEXTRELs an error. Granted.
The -Wl,--eh-frame-hdr is added for anything that’s not a normal
static executable, however adding that to a musl build doesn’t
fix the problem either.

A bit of gdb-ing shows the problem, though: the source code has…

#define Ttypeset "typeset"
#define Tdr "-r"
//… (a variant of this is used for string sharing on ancient Unix)

static const char *initcoms[] = {
Ttypeset, Tdr, initvsn, NULL,
Ttypeset, Tdx, "HOME", TPATH, TSHELL, NULL,
  […]
};

It then iterates over these commands with:

for (wp = initcoms; *wp != NULL; wp++) {
c_builtin(wp);
while (*wp != NULL)
wp++;
}

This is where the extra output happens:

(gdb) print initcoms
$3 = {0x3fff7fc14a4 "typeset", 0x0, 0x0, 0x0, 0x3fff7fc14a4 "typeset", 0x0, 
0x3fff7fc0478 "HOME", 
[…]

Notice the nullptrs there where string pointers are expected.
It shows the same output when just loading the executable, i.e. this
isn’t a runtime issue.

Linking the exact same .o files with the exact same command minus
-static-pie gives:

(gdb) print initcoms
$1 = {0x103cb34 "typeset", 0x103e368  "-r", 
  0x103e73c  "KSH_VERSION=@(#)MIRBSD KSH R59 2024/02/01 +Debian", 0x0, 
0x103cb34 "typeset", 

But this does seem to be a toolchain bug: adding -static-pie to the
glibc dynamic-pie link command and…

(gdb) print initcoms
$1 = {0xda494 "typeset", 0x0, 0x0, 0x0, 0xda494 "typeset", 0x0, 0xd942c "HOME", 
0xda7d8 "PATH",

Now I (or someone) is going to have to reduce that to a testcase, so
we can detect static-pie viability before it’s committed to being used…

bye,
//mirabilos
-- 
Solange man keine schmutzigen Tricks macht, und ich meine *wirklich*
schmutzige Tricks, wie bei einer doppelt verketteten Liste beide
Pointer XORen und in nur einem Word speichern, funktioniert Boehm ganz
hervorragend.   -- Andreas Bogk über boehm-gc in d.a.s.r