There is a question on SO that describes what looks like a serious problem for atomic ops.
http://stackoverflow.com/questions/4165149/compiler-optimization-breaks-multi-threaded-code

In short:

    shared uint cnt;

    void atomicInc()
    {
        uint o;
        while (!cas(&cnt, o, o + 1))
            o = cnt;
    }

is compiled with dmd -O to something like:

    shared uint cnt;

    void atomicInc()
    {
        while (!cas(&cnt, cnt, cnt + 1)) { }
    }

See the web page for details.
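For comparison, here is a minimal sketch of the same increment with the read of cnt made explicit through core.atomic's atomicLoad, so that the expected value and the new value passed to cas both come from a single snapshot. This is only an illustration of the intended semantics. Whether it actually sidesteps the dmd -O behaviour described above is an assumption I have not verified.

```d
import core.atomic : atomicLoad, cas;

shared uint cnt;

void atomicInc()
{
    uint o;
    do
    {
        // Take one explicit atomic snapshot per attempt. Both arguments
        // to cas below are derived from this single value.
        o = atomicLoad(cnt);
    }
    while (!cas(&cnt, o, o + 1)); // retry if cnt changed since the snapshot
}

unittest
{
    // Single-threaded sanity check only; it does not exercise contention.
    immutable before = atomicLoad(cnt);
    atomicInc();
    atomicInc();
    assert(atomicLoad(cnt) == before + 2);
}
```

The unittest block runs with something like dmd -unittest -main. For a plain counter, core.atomic also provides atomicOp!"+="(cnt, 1), which avoids the hand-written loop altogether.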