On Sat, Jan 18, 2014 at 02:01:05AM -0800, Paul E. McKenney wrote:
> OK, I will bite...  Aside from fine-grained code timing, what code could
> you write to tell the difference between a real one-byte store and an
> RMW emulating that store?

Why isn't fine-grained code timing an issue? I'm sure Alpha people will
love it when their machine magically keels over every so often.

Suppose we have two bytes in a word that get concurrent updates:

  union {
          struct {
                  u8 a;
                  u8 b;
          };
          int word;
  } ponies = { .word = 0, };

then two threads concurrently do:

  CPU0:                         CPU1:

  ponies.a = 5                  ponies.b = 10

At which point you'd expect:

  a == 5 && b == 10

However, with an RMW you could end up like:

  load r, ponies.word
                                load r, ponies.word
  and r, ~0xFF
  or  r, 5
  store ponies.word, r
                                and r, ~0xFF00
                                or  r, 10 << 8
                                store ponies.word, r

which gives:

  a == 0 && b == 10

The same can be had on a single CPU if you make the second RMW an
interrupt.

In fact, we recently had such an RMW issue on PPC64, although from a
slightly different angle, but we managed to hit it quite consistently.
See commit ba1f14fbe7096.

The thing is, if we allow the above RMW 'atomic' store, we have to be
_very_ careful that there cannot be such overlapping stores, otherwise
things will go BOOM!

However, if we already have to make sure there are no overlapping
stores, we might as well write a wide store and not allow the narrow
stores to begin with, to force people to think about the issue.
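
To make the interleaving described above concrete, here is a minimal
user-space sketch that replays it deterministically on a single thread.
It is an illustration only, not kernel code: the helper names
(rmw_set_a, rmw_set_b, interleaved_rmw, serialized_rmw,
check_interleavings) are made up for this example, and the word is
modelled as a uint32_t with a in bits 0-7 and b in bits 8-15, using
explicit shifts so host endianness does not matter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of 'ponies.word': a = bits 0-7, b = bits 8-15. */
uint8_t get_a(uint32_t w) { return (uint8_t)(w & 0xFFu); }
uint8_t get_b(uint32_t w) { return (uint8_t)((w >> 8) & 0xFFu); }

/* Emulate a byte store on a previously loaded copy of the word. */
uint32_t rmw_set_a(uint32_t loaded, uint8_t v)
{
	return (loaded & ~0xFFu) | v;
}

uint32_t rmw_set_b(uint32_t loaded, uint8_t v)
{
	return (loaded & ~0xFF00u) | ((uint32_t)v << 8);
}

/* Replay the bad ordering: both "CPUs" load before either one stores. */
uint32_t interleaved_rmw(void)
{
	uint32_t word = 0;
	uint32_t r0 = word;		/* CPU0: load r, ponies.word */
	uint32_t r1 = word;		/* CPU1: load r, ponies.word */

	r0 = rmw_set_a(r0, 5);		/* CPU0: and/or */
	word = r0;			/* CPU0: store */

	r1 = rmw_set_b(r1, 10);		/* CPU1: and/or on its stale copy */
	word = r1;			/* CPU1: store, wiping out a = 5 */

	return word;
}

/* Same two updates, but CPU1 only loads after CPU0 has stored. */
uint32_t serialized_rmw(void)
{
	uint32_t word = 0;
	uint32_t r0 = word;

	word = rmw_set_a(r0, 5);

	uint32_t r1 = word;

	word = rmw_set_b(r1, 10);

	return word;
}

/* Self-check: the interleaved case loses a, the serialized one does not. */
void check_interleavings(void)
{
	assert(get_a(interleaved_rmw()) == 0);
	assert(get_b(interleaved_rmw()) == 10);
	assert(get_a(serialized_rmw()) == 5);
	assert(get_b(serialized_rmw()) == 10);
}
```

Calling check_interleavings() from a small main() passes silently:
the interleaved replay ends with a == 0 && b == 10, while the
serialized one ends with a == 5 && b == 10. On real hardware the bad
ordering is of course timing dependent, which is exactly why it would
only show up every so often.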