There is no point in saving a thread local reference to the
global instance. The `__gshared` instance is never changed once
initialized, so if we saved a thread local reference, it would
*always* be either null or the same as the `__gshared` one -
which means that if the local reference is not null, there is
no difference between returning the local and the global
references.
`hasInstance` does not need any synchronization - it would just
slow it down. Synchronization is redundant in read-only and
write-only scenarios - and this is a read-only scenario. A single
read is atomic with or without synchronization.
At any rate, my implementation was broken - I forgot to set the
thread-local boolean instantiation indicator to true (which
meant there would always be a lock!). I fixed it. Thanks for
pointing that out!
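
For reference, here is a minimal sketch of what the corrected version might look like. The class name `MySingleton`, the field names and the private constructor are illustrative assumptions, not the code from the actual implementation:

```d
final class MySingleton
{
    private this() {}

    // Thread-local by default in D: one flag per thread.
    private static bool instantiated_;

    // Shared by all threads.
    private __gshared MySingleton instance_;

    static MySingleton get()
    {
        if (!instantiated_)
        {
            // Each thread takes the lock at most once.
            synchronized (MySingleton.classinfo)
            {
                if (instance_ is null)
                    instance_ = new MySingleton;
            }
            // The step that was missing: without it every call locks.
            instantiated_ = true;
        }
        // Read of the __gshared reference outside the lock.
        return instance_;
    }
}
```

After the first call on a given thread, `get()` only checks the thread-local flag and returns the `__gshared` reference without locking.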
With regard to using a boolean instead of storing the instance
thread locally - you're still reading from a mutable `__gshared`
variable with no synchronisation on the reader's side, and that
is always a bug. It may work in most cases but due to instruction
reordering and differences between architectures there's no
guarantee of that.
It's also less efficient as you have to read both the
thread-local boolean and the `__gshared` instance. Since the
thread-local boolean is likely going to use a word anyway you may
as well store the instance in there instead.
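
A rough sketch of that alternative, caching the instance itself in a thread-local reference so that the `__gshared` variable is only ever touched while holding the lock. `CachedSingleton` and its field names are hypothetical, chosen only for illustration:

```d
final class CachedSingleton
{
    private this() {}

    // Thread-local by default in D: each thread caches its own reference.
    private static CachedSingleton local_;

    // Shared by all threads; read and written only inside the lock.
    private __gshared CachedSingleton global_;

    static CachedSingleton get()
    {
        if (local_ is null)
        {
            synchronized (CachedSingleton.classinfo)
            {
                if (global_ is null)
                    global_ = new CachedSingleton;
                // Copy the shared reference while still holding the lock.
                local_ = global_;
            }
        }
        // Fast path: a single thread-local read, no shared access.
        return local_;
    }
}
```

The fast path here reads one thread-local word instead of a thread-local flag plus the shared reference.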
Single reads are NOT atomic in general. On x86, word-aligned
reads *happen* to be atomic, but that is not guaranteed on other
architectures. The main advantage of the low-lock singleton idea
is that it is completely independent of architecture (there are
more efficient ways if the architecture is known).
With respect to "hasInstance", what is a possible use-case where
synchronisation is not required?