Hi!

Since there seems to be some interest in ghc/haskell on #dragonflybsd, I would like to write down what my (uneducated) guess is about the main problem in porting ghc to dfly.
On dfly we have "__thread int errno", "#define errno (*__error())", and "static __inline int *__error(void) { return (&errno); }".

I believe that ghc generates correct code for this thread-local storage (TLS) access (it uses the C compiler), but the ghc linker/loader (rts/Linker.c) cannot handle the resulting relocation types:

#define R_386_TLS_IE       15  /* Absolute address of GOT for -ve static TLS */
#define R_X86_64_GOTTPOFF  22  /* PC relative offset to IE GOT entry */

To get an idea of what the ghc linker should do, I did this:

$ cat >main.c
#include <errno.h>
int main() { return errno; }
$ gcc -c -O main.c
$ gcc -o main main.o
$ objdump -d main.o

main.o:     file format elf32-i386

Disassembly of section .text:

00000000 <main>:
   0:   8d 4c 24 04             lea    0x4(%esp),%ecx
   4:   83 e4 f0                and    $0xfffffff0,%esp
   7:   ff 71 fc                pushl  0xfffffffc(%ecx)
   a:   55                      push   %ebp
   b:   89 e5                   mov    %esp,%ebp
   d:   51                      push   %ecx
   e:   65 a1 00 00 00 00       mov    %gs:0x0,%eax
  14:   8b 15 00 00 00 00       mov    0x0,%edx
  1a:   8b 04 10                mov    (%eax,%edx,1),%eax
  1d:   59                      pop    %ecx
  1e:   c9                      leave
  1f:   8d 61 fc                lea    0xfffffffc(%ecx),%esp
  22:   c3                      ret

$ objdump -d main
...
08048548 <main>:
 8048548:       8d 4c 24 04             lea    0x4(%esp),%ecx
 804854c:       83 e4 f0                and    $0xfffffff0,%esp
 804854f:       ff 71 fc                pushl  0xfffffffc(%ecx)
 8048552:       55                      push   %ebp
 8048553:       89 e5                   mov    %esp,%ebp
 8048555:       51                      push   %ecx
 8048556:       65 a1 00 00 00 00       mov    %gs:0x0,%eax
 804855c:       8b 15 6c 96 04 08       mov    0x804966c,%edx
-----------------------------------------------^^^^^^^^^
 8048562:       8b 04 10                mov    (%eax,%edx,1),%eax
 8048565:       59                      pop    %ecx
 8048566:       c9                      leave
 8048567:       8d 61 fc                lea    0xfffffffc(%ecx),%esp
 804856a:       c3                      ret
 804856b:       90                      nop
...

$ objdump -h main

main:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
...
 17 .got          00000004  0804966c  0804966c  0000066c  2**2
----------------------------^^^^^^^^^^^^^^^^^^
                  CONTENTS, ALLOC, LOAD, DATA
...
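As a rough illustration of what the i386 output above suggests for R_386_TLS_IE, here is a minimal sketch in C. The helper name apply_r_386_tls_ie and its parameters are made up for this example and are not taken from the ghc sources. It assumes that a GOT slot for errno has already been allocated and filled with the variable's offset from the thread pointer, and it only shows the final patching step: storing the absolute address of that slot in the 32-bit field of the instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not taken from the ghc sources.
 * Assumes got_slot_addr is the absolute address of a GOT slot that already
 * holds the TLS variable's offset from the thread pointer. */
void apply_r_386_tls_ie(uint8_t *field, uint32_t got_slot_addr)
{
    uint32_t addend = 0;
    uint32_t value;
    int i;

    assert(field != NULL);

    /* i386 uses REL relocations: the addend is the little-endian 32-bit
     * value already stored in the field (0 in the main.o shown above). */
    for (i = 0; i < 4; i++)
        addend |= (uint32_t)field[i] << (8 * i);

    value = got_slot_addr + addend;

    /* Write the result back in little-endian byte order. */
    for (i = 0; i < 4; i++)
        field[i] = (uint8_t)(value >> (8 * i));
}
```

Applied to the instruction bytes 8b 15 00 00 00 00 from main.o (the field starts two bytes into the instruction) with got_slot_addr = 0x0804966c, this produces 8b 15 6c 96 04 08, the bytes seen at 804855c in the linked binary.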
The same on x86_64:

$ objdump -d main.o

main.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <main>:
   0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7 <main+0x7>
   7:   64 48 8b 14 25 00 00    mov    %fs:0x0,%rdx
   e:   00 00
  10:   8b 04 02                mov    (%rdx,%rax,1),%eax
  13:   c3                      retq

$ objdump -d main
...
  40060c:       48 8b 05 8d 02 20 00    mov    0x20028d(%rip),%rax        # 6008a0 <_DYNAMIC+0x170>
-----------------------------------------------^^^^^^^^
  400613:       64 48 8b 14 25 00 00    mov    %fs:0x0,%rdx
--^^^^^^
  40061a:       00 00
  40061c:       8b 04 02                mov    (%rdx,%rax,1),%eax
  40061f:       c3                      retq
...

$ objdump -h main

main:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
...
 18 .got          00000008  00000000006008a0  00000000006008a0  000008a0  2**3
----------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                  CONTENTS, ALLOC, LOAD, DATA
...

$ python2.6 -c 'print hex(0x6008a0 - 0x20028d)'
0x400613

This gives me the idea that the ghc linker should insert a pointer to the .got section on i386, and insert the 32-bit offset to the .got section on x86_64 (relative to the address of the next instruction, 0x400613 here).

So the next point on my todo list for ghc is to recompile ghc without my errno access wrapper, take the .got value from the resulting binary, recompile ghc again with the linker inserting this value, check that the .got value hasn't changed, and see what happens.

If this goes well, the next steps would be to (a) find out how to calculate the .got value inside the linker, and (b) think about an experiment that could show me what has to be done when there is more than one TLS variable.

Hmm, maybe I should also try from the opposite side: How important is this "static __inline" on the __error function? Are there any situations where one would notice a performance impact?

--
Goetz Isenmann
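For the x86_64 case, the displacement seen in the linked binary can be reproduced with the usual PC-relative formula: GOT slot address + addend - address of the patched field. The following is only a sketch in C. The function name r_x86_64_gottpoff_value is hypothetical, and the addend of -4 used in the note below is an assumption (it is the value one would expect objdump -r to report for this instruction form, but the output above does not show it).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not taken from the ghc sources.
 * got_slot_addr: address of the GOT slot holding the TLS offset
 * addend:        explicit RELA addend from the relocation entry
 * place_addr:    address of the 32-bit field being patched */
int32_t r_x86_64_gottpoff_value(uint64_t got_slot_addr, int64_t addend,
                                uint64_t place_addr)
{
    int64_t value = (int64_t)got_slot_addr + addend - (int64_t)place_addr;

    /* The result must fit into the signed 32-bit displacement field. */
    assert(value >= INT32_MIN && value <= INT32_MAX);

    return (int32_t)value;
}
```

With got_slot_addr = 0x6008a0, addend = -4 and place_addr = 0x40060f (the instruction starts at 0x40060c and the displacement field begins three bytes in), the function returns 0x20028d, the displacement seen in the linked binary.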