https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106818

--- Comment #8 from palmer at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #7)
> (In reply to baoshan from comment #6)
> > > really of unknown alignment then sharing the lui might not work.
> > Can you elaborate why sharing the lui might not work?

Unless I've managed to screw up some bit arithmetic here, it's just overflow
that we're not detecting at link time:

$ cat test.c 
extern char glob[4];

int _start(void) {
        int *i = (int *)glob;
        return *i;
}
$ cat glob.s 
.section .sdata
.balign 4096
.global empty
empty:
.rep 2046
.byte 0
.endr
.global glob
glob:
.byte 1, 2, 3, 4
$ riscv64-linux-gnu-gcc test.c glob.s -O3 -o test -static -fno-PIE \
    -mcmodel=medlow -mexplicit-relocs -nostdlib
$ riscv64-linux-gnu-objdump -d test
...
000000000001010c <_start>:
   1010c:       66c9                    lui     a3,0x12
   1010e:       7ff6c703                lbu     a4,2047(a3) # 127ff <glob+0x1>
   10112:       7fe6c603                lbu     a2,2046(a3)
   10116:       8006c783                lbu     a5,-2048(a3)
   1011a:       8016c503                lbu     a0,-2047(a3)
...

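So, with a3 = 0x12000, those four byte loads are going to read from

a4 <- 0x127ff
a2 <- 0x127fe
a5 <- 0x11800
a0 <- 0x11801

Which is wrong: glob+2 and glob+3 live at 0x12800 and 0x12801, but the shared
LUI was computed for glob itself, so the low halves of the last two accesses
wrap around to negative offsets from the wrong page.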
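
For anyone who wants to check the bit arithmetic, here is a rough sketch in C
of the usual %hi/%lo split (add 0x800 before shifting, sign-extend the low 12
bits).  The function names are made up for illustration and are not anything
from binutils or GCC; the base address 0x127fe is taken from the objdump
output for glob.

```c
#include <assert.h>
#include <stdint.h>

/* %hi(addr): upper 20 bits, rounded up by 0x800 so that adding the
   sign-extended low 12 bits gets back to addr. */
uint32_t hi20(uint32_t addr) { return (addr + 0x800u) >> 12; }

/* %lo(addr): low 12 bits, sign-extended like an I-type immediate. */
int32_t lo12(uint32_t addr) {
    int32_t lo = (int32_t)(addr & 0xfffu);
    return lo >= 0x800 ? lo - 0x1000 : lo;
}

/* Address accessed when base+off gets its own matching LUI.
   (Hypothetical helper name.) */
uint32_t own_lui_addr(uint32_t base, uint32_t off) {
    uint32_t a = base + off;
    uint32_t got = (hi20(a) << 12) + (uint32_t)lo12(a);
    assert(got == a); /* a matched %hi/%lo pair always round-trips */
    return got;
}

/* Address accessed when the access to base+off reuses the LUI that was
   computed for base alone.  (Hypothetical helper name.) */
uint32_t shared_lui_addr(uint32_t base, uint32_t off) {
    return (hi20(base) << 12) + (uint32_t)lo12(base + off);
}
```

With base = 0x127fe, shared_lui_addr() gives 0x127fe and 0x127ff for offsets 0
and 1, but 0x11800 and 0x11801 for offsets 2 and 3, while own_lui_addr() gives
0x12800 for offset 2 because %hi(glob+2) is 0x13 rather than 0x12.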

We can't detect it at link time because both relocations are being processed
correctly, they just don't know about each other (and really can't, because
there's nothing coupling them together).

> Linker relaxation not coming in and relaxing it to be use gp offsets instead.
> It is one of the worst parts of the riscv toolchain ...

Though this time linker relaxation is actually biting us twice:

First, it's masking this problem for small programs: if these accesses are all
within range of GP we end up producing executables that function fine, as the
relaxation calculates the full addresses to use as GP offsets.

Second, the GP relaxations just don't work when we share LUIs for
possibly-misaligned symbols because we delete the LUI if the first low-half is
within GP range.  For example:

$ cat glob.s 
.section .sdata
.global empty
empty:
.rep 4090
.byte 0
.endr
.global glob
glob:
.byte 1, 2, 3, 4
$ riscv64-linux-gnu-gcc test.c glob.s -O3 -o test -static -fno-PIE \
    -mcmodel=medlow -mexplicit-relocs --save-temps -nostdlib
$ riscv64-linux-gnu-objdump -d test
...
000000000001010c <_start>:
   1010c:       7fb1c703                lbu     a4,2043(gp) # 12127 <glob+0x1>
   10110:       7fa1c603                lbu     a2,2042(gp) # 12126 <glob>
   10114:       1286c783                lbu     a5,296(a3)
   10118:       1296c503                lbu     a0,297(a3)
...

We had that problem with the AUIPC->GP relaxation as well, but could fix it
there because the low half points to the high half.  Here I think there's also
nothing we can do in the linker, as there's no way to tell when the result of
the LUI is completely unused -- we could deal with simple cases like this, but
with control flow there's no way to handle all of them.
