On Monday, 9 November 2015 at 21:02:35 UTC, Jacob Carlborg wrote:
On 2015-11-09 18:30, bitwise wrote:
The AA is not needed. The offset of the TLS var is known at
compile time.
I was thinking instead of iterating over all loaded images.
Something that could be done without modifying the compiler.
If you look at sections_elf_shared.d you can see the signature
of __tls_get_addr, and that it takes a pointer to the struct
tls_index or something. *If* I understand correctly, one of the
two vars in that struct is the index of the image, and the other
is the offset into the image's TLS section. Not sure where/how
that struct is output though.
So you would have to figure out how to get the backend to do
the same thing for OSX. I think the image index may have to be
assigned at load time, but I'm not sure.
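To make the index-plus-offset idea concrete, here is a toy model of that lookup. It is only a sketch: ToyTlsIndex, moduleBlocks and toyTlsGetAddr are made-up names, and the real __tls_get_addr lives in the C runtime and does considerably more (lazy allocation, for one).

```d
// Toy stand-in for the (image index, offset) pair described above.
// All names here are hypothetical, for illustration only.
struct ToyTlsIndex
{
    size_t moduleIndex; // which loaded image the variable belongs to
    size_t offset;      // byte offset inside that image's TLS block
}

// One TLS block per loaded image. Module-level variables in D are
// thread-local by default, so every thread sees its own table.
ubyte[][] moduleBlocks;

// Resolve a TLS variable's address: pick the image's block for the
// current thread, then add the offset within that block.
void* toyTlsGetAddr(ToyTlsIndex* ti)
{
    return moduleBlocks[ti.moduleIndex].ptr + ti.offset;
}
```

With two registered blocks, ToyTlsIndex(1, 4) resolves to byte 4 of the second image's block for the calling thread.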
If we're going to modify the backend it's better to match the
native implementation.
Why?
I looked a bit at the implementation. For each TLS variable it
outputs two symbols (at least if the variable is initialized).
One with the same name as the variable, and one with the
variable name plus a suffix, "$tlv$init". The symbol with the
suffix contains the actual value with which the variable is
initialized in the source code.
The other symbol is a struct looking something like this:
struct TLVDescriptor
{
void* function (TLVDescriptor*) thunk;
size_t key;
size_t offset;
}
The dynamic loader will, when an image is loaded, set "thunk"
to a function implemented in the dynamic loader. "key" is set
to a key created by "pthread_key_create". It then maps the key
to the currently loading image.
I think the compiler accesses the variable as if it were a
global variable of type "TLVDescriptor". It then calls the thunk,
passing in the address of the variable itself.
So the following code:
int a = 3;
void foo() { auto b = a; }
Would be lowered to:
TLVDescriptor _a;
int _a$tlv$init = 3;
void foo()
{
// The thunk returns the address of this thread's copy of "a",
// so the result has to be dereferenced, not cast to int.
int b = *cast(int*) _a.thunk(&_a);
}
When the compiler stores the symbol in the image it would only
need to set the offset since the dynamic loader sets the other
two fields.
Although I'm not sure how the "_a$tlv$init" symbol is used:
whether the dynamic loader handles that completely, or whether
the compiler needs to do something with it.
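The descriptor-and-thunk scheme described above can be simulated in plain D on a POSIX system. This is a rough sketch under assumptions, not what dyld actually does: toyThunk, makeDescriptor, readA and initImage are hypothetical names, and the sketch simply assumes that the thunk copies the initial values into a freshly allocated per-thread block on first access.

```d
import core.stdc.stdlib : calloc;
import core.stdc.string : memcpy;
import core.sys.posix.pthread;

struct TLVDescriptor
{
    void* function(TLVDescriptor*) thunk;
    size_t key;
    size_t offset;
}

// Hypothetical initial image of the TLS section: a single int with
// value 3, standing in for the data behind the "$tlv$init" symbol.
immutable int[1] initImage = [3];

// Toy thunk: fetch (or lazily create) this thread's block via the
// pthread key, then return the address at the descriptor's offset.
void* toyThunk(TLVDescriptor* d)
{
    auto key = cast(pthread_key_t) d.key;
    void* block = pthread_getspecific(key);
    if (block is null)
    {
        block = calloc(1, initImage.sizeof);
        assert(block !is null, "out of memory");
        memcpy(block, initImage.ptr, initImage.sizeof);
        pthread_setspecific(key, block);
    }
    return cast(ubyte*) block + d.offset;
}

// Roughly what a loader would do at image load time: create the key
// and fill in "thunk" and "key"; only "offset" comes from the compiler.
TLVDescriptor makeDescriptor(size_t offset)
{
    pthread_key_t key;
    const rc = pthread_key_create(&key, null); // no destructor in this toy
    assert(rc == 0);
    return TLVDescriptor(&toyThunk, key, offset);
}

// Lowered form of `auto b = a;`: call the thunk, dereference the result.
int readA(ref TLVDescriptor a)
{
    return *cast(int*) a.thunk(&a);
}
```

Calling readA on a descriptor made with makeDescriptor(0) yields the initial value on first access from any thread. Since no destructor is registered with the key, the block is never released in this toy.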
Our current approach is already very similar - the one for
linux/bsd even more so than the one for OSX. The data layout and
exact specifics differ slightly, but the approach you're
describing sounds basically the same as what we're already doing.
We allocate the TLS block and pthread key for an entire image in
one shot, instead of one var at a time, which is a difference, if
I understand correctly... but aside from that, I think the effect
is the same.
On a slightly different note, I'm looking at our implementation
right now... and a couple of things seem wrong with it.
First of all, it allocates the TLS block for each thread that
accesses a TLS var:
https://github.com/D-Programming-Language/druntime/blob/fb127f747edb211b06b35a5a5e548f03e9b750e3/src/rt/sections_osx.d#L156
But where does it ever free it!? Does this mean it leaks memory
when you create threads and access TLS vars from them? It seems
so. Also, the memory is allocated using calloc, and the block is
never added to the GC. Doesn't this mean that the GC won't scan
it, and could potentially free objects that are referenced only
from there?
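For reference, this is the generic way to make a calloc'd block visible to the GC and to release it again. It is only a sketch with made-up helper names (allocScannedBlock, freeScannedBlock), not a claim about what sections_osx.d does or should do; the runtime may well make the block known to the GC by some other route.

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

// Allocate a zeroed block outside the GC heap and register it as a
// range to scan, so GC objects referenced only from it stay alive.
void* allocScannedBlock(size_t size)
{
    void* p = calloc(1, size);
    assert(p !is null, "out of memory");
    GC.addRange(p, size);
    return p;
}

// Unregister the range first, then release the memory.
void freeScannedBlock(void* p)
{
    GC.removeRange(p);
    free(p);
}
```

As for where the free could happen: pthread_key_create accepts a destructor callback that runs at thread exit for non-null values, so that would be one possible place to release a per-thread block.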
Bit