On Sat, Jan 04, 2014 at 11:02:00AM -0800, Nicolas Noble <nico...@grumpycoder.net> wrote: > In a nutshell, what I want to say is that to POSIX - and Visual C's default > is that mode
Do you have a reference for that (for both posix and visual studio)? I didn't even know that POSIX had anything to say on volatile at all. Are you confusing this with the iso mode for volatiles (which isn't the default). In any case, no, I do not believe that visual C has anything like a posix mode, that would be great if it were true (finally, snprintf :). > volatile is merely an un-optimization request from the > programmer to the compiler. It in no means affects the internal state of > the CPU. And that was the crux of the problem here: internal CPU caches. Did you read the msdn reference I gave you? I am pretty sure you didn't. And neither my reply, which explicitly says that visual studio gives volatile acquire/release semantics, or at least, is clearly documented so. > will re-iter saying this several time over the course of that e-mail, > because it's really important in my opinion. Compiler-barrier and > CPU-barriers, while sounding very similar, aren't being issued the same > way. C's volatiles, or Visual C's _ReadWriteBarrier() are only > compiler-barriers. C11's _Atomics, or Visual C's MemoryBarrier() are > CPU-barriers. As is volatile on visual C, according to the official documentation. Again, please read it. > http://msdn.microsoft.com/en-us/library/12a04hfd%28v=vs.90%29.aspx says > that volatiles work as memory barriers, No, please read it. It explicitly says it has acquire/release semantics and "This allows volatile objects to be used for memory locks and releases in multithreaded applications." It doesn't say you need to disable posix mode for this, and it clearly says it is a microsoft-specific extension (volatile already has compiler memory barrier semantics in standard C). > which is true, but they work as compiler-memory barriers, not as > CPU-memory barriers. The page explicitly disagrees with you on that. Where does this confilcting information come from? > compiler to avoid optimizing around and to not re-order or cache reads and > writes to that variable, it won't affect the CPU's internal caches, and > reading or writing a volatile variable will not issue CPU synchronization > instructions. Again, microsoft explicitly disagrees with you, as a simple google search shows: http://blogs.msdn.com/b/oldnewthing/archive/2011/04/19/10155452.aspx If you are using Visual Studio 2005 or later, then you don't need the weird InterlockedReadAcquire function because Visual Studio 2005 and later automatically impose acquire semantics on reads from volatile locations http://msdn.microsoft.com/en-us/library/12a04hfd.aspx When the /volatile:ms compiler option is used—by default when architectures other than ARM are targeted—the compiler generates extra code to maintain ordering among references to volatile objects in addition to maintaining ordering to references to other global objects. In particular: However, I found that this isn't true on arm for visual studio 2013: http://msdn.microsoft.com/en-us/library/jj204392.aspx /volatile:ms Selects Microsoft extended volatile semantics, which add memory ordering guarantees beyond the ISO-standard C++ language. Acquire/release semantics are guaranteed on volatile accesses. However, this option also forces the compiler to generate hardware memory barriers, which might add significant overhead on ARM and other weak memory-ordering architectures. If the compiler targets any platform except ARM, this is default interpretation of volatile. > In the example, they implement a spin lock semaphore using a volatile > boolean. Thread1 is busy-waiting on the boolean variable to become true, > and Thread2 will eventually set it to true. The trick however is that it's > busy looping, meaning that it'll eventually get around the CPU caches in an > SMP environment. libev only relies on atomicity, it doesn't implement locks. > What I am saying is that volatiles are never enough to provide SMP > synchronization. In standard C, yes, but mcirosoft visual studio explicitly implements that. I didn't know that later versions of visual studio make this configurable (if you look at earlier documentation you cna see the volatile switch didn't exist back then and the behaviour defaulted to microsoft behaviour). > In the case of libev, here is what was happening. Thank you for all your explanations, but I do know what atomic memory accesses are. The problem is not that we didn't know what we were doing, the problem is that microsoft explicitly documents that volatile accesses are atomic in multi-cpu environments, and we relied on that. Apparently, visual studio 2013 makes this a configurable option (while still defaulting to acquire/release semantics). > about the fact ECB_MEMORY_FENCE* were compiler-barriers only when using > Visual C's _ReadWriteBarrier(). Again, the problem here is not that libev used the wrong memory fence, the problem here is that microsoft changed semantics in later compiler versions and broke it. In the visual c version I developed with at the time, _ReadWriteBarrier was an actual #define in the header file and compiled to a lock xchg instruction or equivalent. It is good to know that microsoft broke this in later releases, but, again, that just means that one should avoid microsoft at all cost, because they actively keep breaking things that were documented to work in earlier releases. > The end result for the actual race condition relies on the CPU caches Which is a compiler bug which we can work around at some extra cost. No Problem. > It's pretty visible here that the ECB_MEMORY_FENCE have emitted absolutely > no instruction whatsoever, The problem is that volatile is documented to do what is necessary on your platform, and ecb_memory_fence requires volatile. > and the writes and reads on pipe_write_skipped > and pipe_write_wanted are simple movs and cmps, without any memory bus lock. We don't actually need any bus lock in libev to make it work correctly. All that is needed is that memory accesses to volatile have acquire/release semantics. > have the same caches - since Intel CPUs are pretty scorched-earth-method > when it comes to cache-lines purging, a simple lock anywhere will do. And I still wonder why we deserve this lecture - a simple look at ecb.h at the plac where you patched shows that we are aware of that, and that somethign else must go on that you didn't get. In this case, we rely on the documented guarantee that volatile is atomic w.r.t. multi-core cpus for example. > So, again, I want to emphasis that volatiles aren't atomics. But they officially are. The fatc that they aren't is a bug in your compiler, as they are explicitly dicumented to be atomic. > They don't guarantee CPUs caches synchronization, They do. The fact that microsoft fails to deliver on this guarantee is telling, but they still guarantee that the cpu caches synchronise. > optimizations. Which is why C11 and C++11 introduced <stdatomic.h> and > <atomic> respectively. No, you gte it all backwards. The volatile semantics were introduced *because* C11 and C++11 didn't exist yet - microsoft volatile extensions came first, it's not the other way round. > And Visual C's _ReadWriteBarrier() is the same as a volatile compiler > hint. Since the 2008 release, it no longer issues CPU cache flushes, See, that's the crux. Microsoft broke another guarantee. > it simply is a broader volatile. If it were a broader volatile, it would work, as volatile on visual C is documented to result in atomic read and write accesses (but not read-and-write). > As I said in the preamble, sorry if I sound condescending, but I rather > wanted to make sure we were on the same page than leaving any doubt around. Yes, you sound very condescending, especially as a quick look just a few lines above or below your patch would have shown you that we are quite aware of memory locks, so all your explanations about memory races are just condescending, as even you should have been able to read the microsoft documentation, google a bit. Worse, you are obviously aware that microsoft broke barrier semantics, so why lecture us about race conditions that were introduced by microsoft in our code that didn't have race conditions before. Shouldn't you explain all this to your compiler vendor? Now to something constructive - does visual studio 2008, the version you refer to, already have the volatile:iso option, does it enable it by default (contrary to documentation), and/or are you runing on arm, where the option apparently is off by default? -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / schm...@schmorp.de -=====/_/_//_/\_,_/ /_/\_\ _______________________________________________ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev