Jacob Carlborg Wrote:
> 
> Thread.getThis() calls pthread_getspecific which is just three 
> instructions on Mac OS X, so I guess that's not why it's so slow. The 
> only thing I can think of is first moving the if statement into the 
> assert and then trying to inline as much of the function calls.

Swapping the assert and the executable code would save you a jump, but inlining 
the call to ___tls_get_addr would be a bit trickier.  We'd probably have to 
expose Thread.sm_this as an extern (C) symbol, move the function to object.d 
and explicitly do the pthread_getspecific call there.  If that would be enough 
for the compiler to inline the calls then it shouldn't be too hard to test, but 
I'm worried that the call may be generated too late for the inliner to see it.
Looking at the asm output should settle that, though (PIC code weirdness
notwithstanding).
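
To make the idea concrete, here is a rough sketch in C rather than D, and not actual druntime code.  The names sm_this_key, sm_this_ready, thread_key_init, thread_set_this and thread_get_this are all hypothetical stand-ins.  It assumes the key is created once before any thread uses the accessors, and it only shows the shape of the change: the sanity check sits inside the assert, and the accessor calls pthread_getspecific directly so the compiler has a chance to inline it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for Thread.sm_this and a "key created" flag. */
static pthread_key_t sm_this_key;
static int sm_this_ready = 0;

/* Create the key once, before any thread calls the accessors below. */
static int thread_key_init(void)
{
    int rc = pthread_key_create(&sm_this_key, NULL);
    if (rc == 0)
        sm_this_ready = 1;
    return rc;
}

/* Record the calling thread's object pointer. */
static int thread_set_this(void *t)
{
    assert(sm_this_ready);
    return pthread_setspecific(sm_this_key, t);
}

/* The check lives inside the assert, so it compiles away under NDEBUG
   and the release path is just the pthread_getspecific call. */
static inline void *thread_get_this(void)
{
    assert(sm_this_ready);
    return pthread_getspecific(sm_this_key);
}
```

Whether the D compiler would actually inline the equivalent call is exactly the open question above; this only illustrates the layout being proposed.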
