dsimcha wrote:
== Quote from Pelle Månsson (pelle.mans...@gmail.com)'s article
dsimcha wrote:
Has D's built-in TLS been optimized in the past six months to a year? I
benchmarked it a while back when optimizing some code that I wrote and
discovered it was significantly slower than regular globals (the kind that are
now __gshared). Now, at least on Windows, it seems that there is no
discernible difference and if anything, TLS is slightly faster than __gshared.
What's changed?
I was under the impression that TLS should be faster due to the absence of
synchronization.
__gshared == old-skool cowboy sharing, i.e. plain old unsynchronized globals.
Without getting into the details of my specific case, the reason I'm interested
in this is that I have some code that I want to be as fast as possible in both
single- and multithreaded environments. Right now, it has a hack that checks
thread_needLock() and uses plain old globals for everything as long as the
program is single-threaded because that seemed faster than TLS lookups a while
ago. However, running the same benchmark again shows otherwise.
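
For anyone who wants to repeat this kind of comparison, here is a rough sketch of a micro-benchmark, not the benchmark from the post. It assumes a recent Phobos that ships std.datetime.stopwatch, and the names tlsValue, gsharedValue, timeTls, timeGshared and the iteration count are all made up for illustration. Keep in mind that an optimizing build may keep the counter in a register for the whole loop, in which case both timings come out the same and say nothing about TLS lookup cost.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

int tlsValue;                 // module-level variables are thread-local by default in D2
__gshared int gsharedValue;   // plain unsynchronized global, one copy per process

// Time n increments of the thread-local variable.
auto timeTls(int n)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. n)
        ++tlsValue;
    return sw.peek;
}

// Time n increments of the __gshared variable.
auto timeGshared(int n)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. n)
        ++gsharedValue;
    return sw.peek;
}

void main()
{
    enum n = 100_000_000;   // hypothetical iteration count, adjust to taste
    writefln("TLS:       %s", timeTls(n));
    writefln("__gshared: %s", timeGshared(n));
}
```

Running it both with and without optimization flags, and on more than one platform, gives a better picture than a single number.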
Nothing has changed. What I would do is to look at the assembler output
and verify that the TLS globals really are TLS, and the ones that are
not are really not.
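
Besides reading the disassembly, a quick runtime check can confirm which globals are per-thread. The following is only a sketch with hypothetical names (tlsCounter, sharedCounter, bump, demo): a second thread increments both variables, and the calling thread then looks at what it can see. It complements the assembler inspection rather than replacing it, since it says nothing about the instructions the compiler emits.

```d
import core.thread : Thread;
import std.stdio : writeln;

int tlsCounter;               // thread-local by default in D2: each thread has its own copy
__gshared int sharedCounter;  // a single copy shared by all threads, no synchronization

void bump()
{
    ++tlsCounter;
    ++sharedCounter;
}

// Returns [tlsCounter, sharedCounter] as seen by the calling thread.
int[2] demo()
{
    tlsCounter = 0;
    sharedCounter = 0;

    bump();                           // increments the caller's copies

    auto worker = new Thread(&bump);  // the worker increments its own tlsCounter
    worker.start();                   // and the one and only sharedCounter
    worker.join();                    // join before reading, so there is no race

    return [tlsCounter, sharedCounter];
}

void main()
{
    writeln(demo());   // prints [1, 2] if tlsCounter really is thread-local
}
```

If the first number came out as 2, the variable that was supposed to be thread-local would in fact be shared.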