Dear GCC list and Daniel,

Lately I have continued working on TLS for MIPS, and a few things bother me.

Firstly, it seems that GCC's "-ftls-model" option is not fully respected when compiling a Position-Independent Executable ("-fpie"). Here is a code sample, "app.c":

    __thread int a = 0;
    extern __thread int b;

By default, the chosen TLS model is "initial-exec", which causes one relocation for "b" (R_MIPS_TLS_TPREL32). If I specify "local-exec", the behavior is as expected: I get no relocation at all. But whenever I choose a dynamic model ("local-dynamic" or "global-dynamic"), I still get an "R_MIPS_TLS_TPREL32", which is not a relocation of a dynamic model. Why is that?

Therefore, I have to explicitly attach an attribute to the variable to get the expected behavior, such as:

    extern __thread int b __attribute__((tls_model("local-dynamic")));

Secondly, I observe a very strange calculation by GCC. Here is the code:

    extern __thread int b __attribute__((tls_model("local-dynamic")));

    int main() { b++; }

I compile this code with the following CFLAGS:

    CFLAGS=-nostdinc -nostdlib -fno-builtin \
           -Wall \
           -fomit-frame-pointer \
           -mips2 -EL -mno-branch-likely \
           -mabicalls \
           -G0 \
           -ftls-model="local-dynamic" \
           -fpie -Os

and the following LDFLAGS:

    -pie app.o -o app.x -l:libc.so -L.

(FYI, "libc.so" defines the variable "b".)

And I get the following generated code (FYI, the jalr jumps to "__tls_get_addr"):

    5ffe03fc <main>:
    5ffe03fc:  3c1c0001  lui    gp,0x1
    5ffe0400:  279c9054  addiu  gp,gp,-28588
    5ffe0404:  0399e021  addu   gp,gp,t9
    5ffe0408:  27bdfff0  addiu  sp,sp,-16
    5ffe040c:  afbf000c  sw     ra,12(sp)
    5ffe0410:  afbc0000  sw     gp,0(sp)
    5ffe0414:  8f99802c  lw     t9,-32724(gp)
    5ffe0418:  27848030  addiu  a0,gp,-32720
    5ffe041c:  0320f809  jalr   t9
    5ffe0420:  00000000  nop
    5ffe0424:  3c034003  lui    v1,0x4003
    5ffe0428:  00621821  addu   v1,v1,v0
    5ffe042c:  8c625760  lw     v0,22368(v1)
    5ffe0430:  8fbf000c  lw     ra,12(sp)
    5ffe0434:  8fbc0000  lw     gp,0(sp)
    5ffe0438:  24420001  addiu  v0,v0,1
    5ffe043c:  ac625760  sw     v0,22368(v1)
    5ffe0440:  03e00008  jr     ra
    5ffe0444:  27bd0010  addiu  sp,sp,16

The suspicious instruction is "lui v1,0x4003".
I don't understand where "0x4003" comes from...

Now, if I drop the size optimization, I get the following code:

    5ffe040c <main>:
    5ffe040c:  3c1c0001  lui    gp,0x1
    5ffe0410:  279c9064  addiu  gp,gp,-28572
    5ffe0414:  0399e021  addu   gp,gp,t9
    5ffe0418:  27bdffe8  addiu  sp,sp,-24
    5ffe041c:  afbf0014  sw     ra,20(sp)
    5ffe0420:  afb00010  sw     s0,16(sp)
    5ffe0424:  afbc0000  sw     gp,0(sp)
    5ffe0428:  8f99802c  lw     t9,-32724(gp)
    5ffe042c:  27848030  addiu  a0,gp,-32720
    5ffe0430:  0320f809  jalr   t9
    5ffe0434:  00000000  nop
    5ffe0438:  8fbc0000  lw     gp,0(sp)
    5ffe043c:  8c420000  lw     v0,0(v0)
    5ffe0440:  24500001  addiu  s0,v0,1
    5ffe0444:  8f99802c  lw     t9,-32724(gp)
    5ffe0448:  27848030  addiu  a0,gp,-32720
    5ffe044c:  0320f809  jalr   t9
    5ffe0450:  00000000  nop
    5ffe0454:  8fbc0000  lw     gp,0(sp)
    5ffe0458:  ac500000  sw     s0,0(v0)
    5ffe045c:  8fbf0014  lw     ra,20(sp)
    5ffe0460:  8fb00010  lw     s0,16(sp)
    5ffe0464:  27bd0018  addiu  sp,sp,24
    5ffe0468:  03e00008  jr     ra
    5ffe046c:  00000000  nop

It seems more correct, although I don't understand whether "__tls_get_addr" should really return the address of the DTP or DTP+0x8000. Here it seems GCC expects "__tls_get_addr" to return DTP, since it accesses the variable directly, without any offset ("lw v0,0(v0)").

Now, if we look at the compilation of the library (libc.so) that defines the variable "b", we get something else. Here is the code:

    __thread unsigned int* b;

    void puts(char* str) { b++; }

The CFLAGS are the same as previously (with "-Os"), except that "-fpic" is specified instead of "-fpie". And the LDFLAGS specify "-shared" instead of "-pie".
Here is the generated code:

    5ffe03cc <puts>:
    5ffe03cc:  3c1c0001  lui    gp,0x1
    5ffe03d0:  279c9054  addiu  gp,gp,-28588
    5ffe03d4:  0399e021  addu   gp,gp,t9
    5ffe03d8:  27bdfff0  addiu  sp,sp,-16
    5ffe03dc:  afbf000c  sw     ra,12(sp)
    5ffe03e0:  afbc0000  sw     gp,0(sp)
    5ffe03e4:  8f99802c  lw     t9,-32724(gp)
    5ffe03e8:  27848030  addiu  a0,gp,-32720
    5ffe03ec:  0320f809  jalr   t9
    5ffe03f0:  00000000  nop
    5ffe03f4:  3c030000  lui    v1,0x0
    5ffe03f8:  00621821  addu   v1,v1,v0
    5ffe03fc:  8c628000  lw     v0,-32768(v1)
    5ffe0400:  8fbf000c  lw     ra,12(sp)
    5ffe0404:  8fbc0000  lw     gp,0(sp)
    5ffe0408:  24420004  addiu  v0,v0,4
    5ffe040c:  ac628000  sw     v0,-32768(v1)
    5ffe0410:  03e00008  jr     ra
    5ffe0414:  27bd0010  addiu  sp,sp,16

This time, it seems GCC expects "__tls_get_addr" to return DTP+0x8000. Indeed, the access to variable "b" is done with "lw v0,-32768(v1)", and 32768 == 0x8000.

Well, I really could use some help on this...

One last question: is there any difference between DSO and PIE objects other than the INTERP entry in the program header?

Thank you all,
Best regards,
Joel
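
P.S. In case it helps to check the arithmetic, here is a minimal C sketch of the displacement that a "lui / addu / lw" sequence adds to the pointer returned in v0. It assumes only the usual MIPS semantics: lui puts its immediate in the upper 16 bits, and the load/store offset is a sign-extended 16-bit value. The helper name "hi_lo_displacement" is made up for illustration; it does not come from GCC or any libc.

```c
#include <stdint.h>

/* Displacement added to a base pointer by the sequence
 *     lui  rX, hi
 *     addu rX, rX, base
 *     lw   rY, lo(rX)
 * The result wraps modulo 2^32, as 32-bit register arithmetic does.
 * (Hypothetical helper, for illustration only.) */
static uint32_t hi_lo_displacement(uint16_t hi, int16_t lo)
{
    uint32_t upper = (uint32_t)hi << 16;      /* effect of lui */
    uint32_t lower = (uint32_t)(int32_t)lo;   /* sign-extended 16-bit offset */
    return upper + lower;
}
```

With the values from the listings above, hi_lo_displacement(0x4003, 22368) gives 0x40035760 for the PIE, and hi_lo_displacement(0x0, -32768) gives 0xffff8000 (that is, -0x8000) for the DSO. So the two objects add very different displacements to the value returned in v0.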