A few posts deep in the discussion on std.parallelism have prompted me to double-check an assumption that I made previously. Is writing to adjacent but non-overlapping memory addresses concurrently from different threads safe on all hardware we care about supporting?
I know this isn't safe on some DS9K-like architectures that we don't care about, such as early DEC Alphas. Those had no byte-sized load/store instructions, so writing a single byte meant a read-modify-write of the enclosing word. I'm also aware of the performance implications of false sharing, but that isn't a concern here: in std.parallelism and its examples, concurrent writes to adjacent memory addresses are only a tiny fraction of all writes and would not have a significant impact on performance.
I'm also aware that the compiler could in theory generate instructions that write at a coarser granularity than the source code specifies (for example, updating one byte by rewriting the whole word that contains it), but I imagine this is a purely theoretical concern, as I can't see any reason why it would do so in practice.
IMHO, if this is already how things work in practice, it should be formally specified by D's memory model.
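To make the question concrete, here is a minimal sketch of the pattern being asked about. It is not taken from std.parallelism or its examples: the helper name `fillAdjacent`, the buffer size, and the work unit size of 1 are all made up for illustration. Each element of a `ubyte[]` is written by whichever worker thread picks it up, so neighbouring bytes are likely written by different threads.

```d
import std.parallelism : parallel;
import std.stdio : writeln;

// Hypothetical helper (name and values are illustrative only):
// every byte is written exactly once, and no two threads ever write
// the same byte, but neighbouring bytes may be written by different
// worker threads at the same time.
ubyte[] fillAdjacent(size_t n)
{
    auto buf = new ubyte[n];
    // A work unit size of 1 hands out elements one at a time, which
    // makes it likely that adjacent bytes go to different threads.
    foreach (i, ref elem; parallel(buf, 1))
    {
        elem = cast(ubyte)(i % 251);
    }
    return buf;
}

void main()
{
    auto buf = fillAdjacent(4096);
    // If byte writes really are independent, every element holds the
    // value its own writer stored, regardless of what neighbours did.
    foreach (i, b; buf)
        assert(b == cast(ubyte)(i % 251));
    writeln("checked ", buf.length, " bytes");
}
```

A clean run only shows that this particular compiler and machine behaved as hoped. It says nothing about what the language guarantees, which is the reason for wanting this written into the memory model.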