Dear GCC-list and Daniel,

I have been continuing my work on TLS for MIPS lately, and a few things
bother me.

Firstly, it seems that GCC's "-ftls-model" option is not fully
respected when compiling a position-independent executable ("-fpie").
Here is a code sample, "app.c":

__thread int a = 0;
extern __thread int b;

By default, the chosen TLS model is "initial-exec", which produces one
relocation for "b" (R_MIPS_TLS_TPREL32). If I specify "local-exec",
the behavior is as expected: I get no relocation at all. But whenever
I choose a dynamic model ("local-dynamic" or "global-dynamic"), I
still get an R_MIPS_TLS_TPREL32, which is the relocation I get for
"initial-exec" and not one that belongs to a dynamic model.
Why is that?
Because of this, I have to explicitly attach an attribute to the
variable to get the expected behavior, such as:
extern __thread int b __attribute__((tls_model("local-dynamic")));
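
For a variable defined in the same file, the same per-variable override
can be sketched as follows. This is only an illustration: "counter" and
"bump_counter" are hypothetical names that do not appear in my real
code, and I assume a GCC-compatible compiler that accepts the
tls_model attribute.

```c
/* "counter" is a hypothetical variable. The attribute requests the
   local-dynamic model for this variable only, independently of the
   -ftls-model value used for the rest of the file. */
static __thread int counter __attribute__((tls_model("local-dynamic")));

/* Increment the calling thread's copy and return the new value. */
int bump_counter(void)
{
    return ++counter;
}
```

Each thread that calls bump_counter() works on its own zero-initialized
copy of "counter".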

Secondly, I observe a very strange address calculation in the code that
GCC generates. Here is the source:

extern __thread int b __attribute__((tls_model("local-dynamic")));
int main()
{
    b++;
}

I compile this code with the following CFLAGS:

CFLAGS=-nostdinc -nostdlib -fno-builtin \
           -Wall \
           -fomit-frame-pointer \
           -mips2 -EL -mno-branch-likely \
           -mabicalls \
           -G0 \
           -ftls-model="local-dynamic" \
           -fpie \
           -Os

and the following LDFLAGS:
-pie app.o -o app.x -l:libc.so -L.

(fyi, the "libc.so" defines the variable "b")

And I get the following generated code (fyi, the jalr jumps to __get_tls_addr):

5ffe03fc <main>:
5ffe03fc:       3c1c0001        lui     gp,0x1
5ffe0400:       279c9054        addiu   gp,gp,-28588
5ffe0404:       0399e021        addu    gp,gp,t9
5ffe0408:       27bdfff0        addiu   sp,sp,-16
5ffe040c:       afbf000c        sw      ra,12(sp)
5ffe0410:       afbc0000        sw      gp,0(sp)
5ffe0414:       8f99802c        lw      t9,-32724(gp)
5ffe0418:       27848030        addiu   a0,gp,-32720
5ffe041c:       0320f809        jalr    t9
5ffe0420:       00000000        nop
5ffe0424:       3c034003        lui     v1,0x4003
5ffe0428:       00621821        addu    v1,v1,v0
5ffe042c:       8c625760        lw      v0,22368(v1)
5ffe0430:       8fbf000c        lw      ra,12(sp)
5ffe0434:       8fbc0000        lw      gp,0(sp)
5ffe0438:       24420001        addiu   v0,v0,1
5ffe043c:       ac625760        sw      v0,22368(v1)
5ffe0440:       03e00008        jr      ra
5ffe0444:       27bd0010        addiu   sp,sp,16

The suspicious instruction is "lui   v1,0x4003". I don't understand
where "0x4003" comes from.
Now, if I drop the size optimization ("-Os"), I get the following code:

5ffe040c <main>:
5ffe040c:       3c1c0001        lui     gp,0x1
5ffe0410:       279c9064        addiu   gp,gp,-28572
5ffe0414:       0399e021        addu    gp,gp,t9
5ffe0418:       27bdffe8        addiu   sp,sp,-24
5ffe041c:       afbf0014        sw      ra,20(sp)
5ffe0420:       afb00010        sw      s0,16(sp)
5ffe0424:       afbc0000        sw      gp,0(sp)
5ffe0428:       8f99802c        lw      t9,-32724(gp)
5ffe042c:       27848030        addiu   a0,gp,-32720
5ffe0430:       0320f809        jalr    t9
5ffe0434:       00000000        nop
5ffe0438:       8fbc0000        lw      gp,0(sp)
5ffe043c:       8c420000        lw      v0,0(v0)
5ffe0440:       24500001        addiu   s0,v0,1
5ffe0444:       8f99802c        lw      t9,-32724(gp)
5ffe0448:       27848030        addiu   a0,gp,-32720
5ffe044c:       0320f809        jalr    t9
5ffe0450:       00000000        nop
5ffe0454:       8fbc0000        lw      gp,0(sp)
5ffe0458:       ac500000        sw      s0,0(v0)
5ffe045c:       8fbf0014        lw      ra,20(sp)
5ffe0460:       8fb00010        lw      s0,16(sp)
5ffe0464:       27bd0018        addiu   sp,sp,24
5ffe0468:       03e00008        jr      ra
5ffe046c:       00000000        nop

This looks more correct, although I don't understand whether the
"__get_tls_addr" function should really return the address of
the DTP or DTP+0x8000. Here it seems GCC expects "__get_tls_addr" to
return DTP, since it accesses the variable directly without any
offset ("lw   v0,0(v0)").
Now, if we look at how the library (libc.so) that defines the
variable "b" is compiled, we get something else.
Here is the code:

__thread unsigned int* b;
void puts(char* str)
{
    b++;
}

The CFLAGS are the same as before (with "-Os"), except that "-fpic" is
specified instead of "-fpie", and the LDFLAGS specify "-shared" instead
of "-pie".
Here is the generated code:

5ffe03cc <puts>:
5ffe03cc:       3c1c0001        lui     gp,0x1
5ffe03d0:       279c9054        addiu   gp,gp,-28588
5ffe03d4:       0399e021        addu    gp,gp,t9
5ffe03d8:       27bdfff0        addiu   sp,sp,-16
5ffe03dc:       afbf000c        sw      ra,12(sp)
5ffe03e0:       afbc0000        sw      gp,0(sp)
5ffe03e4:       8f99802c        lw      t9,-32724(gp)
5ffe03e8:       27848030        addiu   a0,gp,-32720
5ffe03ec:       0320f809        jalr    t9
5ffe03f0:       00000000        nop
5ffe03f4:       3c030000        lui     v1,0x0
5ffe03f8:       00621821        addu    v1,v1,v0
5ffe03fc:       8c628000        lw      v0,-32768(v1)
5ffe0400:       8fbf000c        lw      ra,12(sp)
5ffe0404:       8fbc0000        lw      gp,0(sp)
5ffe0408:       24420004        addiu   v0,v0,4
5ffe040c:       ac628000        sw      v0,-32768(v1)
5ffe0410:       03e00008        jr      ra
5ffe0414:       27bd0010        addiu   sp,sp,16

This time, it seems GCC expects "__get_tls_addr" to return
DTP+0x8000. Indeed, the access to variable "b" is done with "lw
v0,-32768(v1)", and 32768==0x8000.
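
To make the arithmetic behind these lui/lw pairs explicit, here is a
small host-side C sketch. It only redoes the address computation seen in
the listings and says nothing about what the ABI requires. The helper
names "hi_lo_offset" and "implied_return" are hypothetical, and the
addresses in the usage note are made-up values.

```c
#include <stdint.h>

/* Combine the immediate of a "lui" with the signed 16-bit displacement
   of the following lw/sw, as the MIPS hi/lo addressing idiom does:
   offset = (lui_imm << 16) + sign_extend(disp). */
int32_t hi_lo_offset(uint16_t lui_imm, int16_t disp)
{
    return (int32_t)(((uint32_t)lui_imm << 16) + (uint32_t)(int32_t)disp);
}

/* If the variable lives at var_addr and the generated code adds
   "offset" to the pointer returned by the call, this is the pointer
   the call must have returned for the access to hit the variable. */
uint32_t implied_return(uint32_t var_addr, int32_t offset)
{
    return var_addr - (uint32_t)offset;
}
```

With the values from the listings, hi_lo_offset(0x4003, 22368) gives
0x40035760 for the "-fpie -Os" case, and hi_lo_offset(0x0, -32768) gives
-0x8000 for the library case. For a variable at the made-up address
0x1000, implied_return(0x1000, -32768) is 0x9000, that is, the variable
address plus 0x8000, while an offset of 0 implies the variable address
itself.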

Well, I really could use some help on this...

One last question: is there a difference between DSO and PIE objects
other than the INTERP entry in the program header?
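
To compare the two kinds of object on this one point, the INTERP check
can be sketched in C on the build host. This is a rough sketch under
assumptions: it relies on the Linux <elf.h> header, it handles only
ELF32 images whose byte order matches the host (for example an -EL MIPS
file on a little-endian machine), and "elf32_has_interp" is a
hypothetical helper name.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the ELF32 image contains a PT_INTERP program header,
   0 if it does not, and -1 if the buffer is truncated or not ELF32. */
int elf32_has_interp(const unsigned char *image, size_t size)
{
    Elf32_Ehdr eh;

    if (size < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);   /* copy to avoid unaligned access */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;

    /* Walk the program header table and look for PT_INTERP. */
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf32_Phdr ph;
        size_t off = (size_t)eh.e_phoff + (size_t)i * eh.e_phentsize;

        if (off > size || size - off < sizeof ph)
            return -1;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

Reading app.x and libc.so into memory and passing each buffer to
elf32_has_interp() reports only whether the entry is present. It does
not reveal any other difference between the two files.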

Thank you all,
Best regards,

Joel
