On 10/21/07, Tomash Brechko <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have a question regarding the thread-safeness of a particular GCC
> optimization.  I'm sorry if this was already discussed on the list, if
> so please provide me with the reference to the previous discussion.
>
> Consider this piece of code:
>
>     extern int v;
>
>     void
>     f(int set_v)
>     {
>       if (set_v)
>         v = 1;
>     }
>
> If f() is called concurrently from several threads, then call to f(1)
> should be protected by the mutex.  But do we have to acquire the mutex
> for f(0) calls?  I'd say no, why, there's no access to global v in
> that case.  But GCC 3.3.4--4.3.0 on i686 with -O1 generates the
> following:
>
>     f:
>             pushl   %ebp
>             movl    %esp, %ebp
>             cmpl    $0, 8(%ebp)
>             movl    $1, %eax
>             cmove   v, %eax         ; load (maybe)
>             movl    %eax, v         ; store (always)
>             popl    %ebp
>             ret
>
> Note the last unconditional store to v.  Now, if some thread would
> modify v between our load and store (acquiring the mutex first), then
> we will overwrite the new value with the old one (and would do that in
> a thread-unsafe manner, not acquiring the mutex).
>
> So, do the calls to f(0) require the mutex, or it's a GCC bug?
...
> So, could someone explain me why this GCC optimization is valid, and,
> if so, where lies the boundary below which I may safely assume GCC
> won't try to store to objects that aren't stored to explicitly during
> particular execution path?  Or maybe the named bug report is valid
> after all?

Hello Tomash,

I'm not an expert in the C89/C99 standards, but I wrote my Ph.D. thesis
on the subject of memory models. What I learned while writing it is the
following:

- If you want to know which optimizations are valid and which ones are
  not, you have to look at the semantics defined in the language
  standard.
- Every language standard defines the result of executing a sequential
  program. The definition of the behavior of a multithreaded program
  written in a given programming language is called the memory model of
  that programming language.
- The memory model of C and C++ is still under discussion, as has
  already been pointed out on this mailing list.
- Although the memory model for C and C++ is still under discussion,
  the existing standard text already applies to multithreaded C and C++
  programs. The following is required by the ANSI/ISO C89 standard
  (from paragraph 5.1.2.3, Program Execution):

      Accessing a volatile object, modifying an object, modifying a
      file, or calling a function that does any of those operations are
      all side effects, which are changes in the state of the execution
      environment. Evaluation of an expression may produce side
      effects. At certain specified points in the execution sequence
      called sequence points, all side effects of previous evaluations
      shall be complete and no side effects of subsequent evaluations
      shall have taken place. (A summary of the sequence points is
      given in annex C.)

  Annex C explains that, among other things, the call to a function
  (after argument evaluation) is a sequence point. See also
  http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n843.pdf
- The above paragraph does not restrict the optimizations the compiler
  may apply to non-volatile variables. In other words, the generated
  code shown in your mail is allowed by the above paragraph.
- As I read it, the above paragraph also has the following implications
  for volatile variables:
  * There exists a total order for all accesses to all volatile
    variables.
  * It is the responsibility of the compiler to ensure cache coherency
    for volatile variables. If memory barrier instructions are needed
    to ensure cache coherency on the architecture for which the
    compiler is generating code, then it is the responsibility of the
    compiler to generate these instructions for volatile variables.
    This fact is often overlooked.
  * The compiler must generate code such that exactly one store is
    executed for each assignment to a volatile variable. Prefetching
    volatile variables is allowed as long as it does not violate
    paragraph 5.1.2.3 of the language definition.
  * As is well known, the compiler may reorder function calls and
    assignments to non-volatile variables if it can prove that the
    called function won't modify that variable. This becomes
    problematic if the variable is modified by more than one thread and
    the called function is a synchronization function, e.g.
    pthread_mutex_lock(). This kind of reordering is highly
    undesirable. This is why any variable that is shared between
    threads has to be declared volatile, even when using explicit
    locking calls.

I hope the above brings more clarity to this discussion.

Bart Van Assche.
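
P.S. To make the transformation concrete, here is a hand-written C
rendering of the generated code from your mail, next to the function as
you wrote it. This is only a sketch under a few assumptions: the names
f_as_written, f_as_compiled, f_volatile, vv and sequential_results_match
are mine and purely illustrative, and v is defined locally instead of
being extern so that the fragment is self-contained.

```c
#include <assert.h>

/* Stand-in for the "extern int v" of the original example. */
int v;

/* Hypothetical volatile counterpart of v, for comparison only. */
volatile int vv;

/* f() as written in the original mail: v is stored to only if set_v != 0. */
void f_as_written(int set_v)
{
    if (set_v)
        v = 1;
}

/* Hand-written C equivalent of the generated i686 code:
   a conditional load followed by an unconditional store. */
void f_as_compiled(int set_v)
{
    int tmp = 1;        /* movl  $1, %eax                  */
    if (set_v == 0)     /* cmpl  $0, 8(%ebp)               */
        tmp = v;        /* cmove v, %eax   ; load (maybe)  */
    v = tmp;            /* movl  %eax, v   ; store (always) */
}

/* Same source shape, but on a volatile object. Each store to vv is a
   side effect under 5.1.2.3, so the compiler may not add a store on
   the set_v == 0 path. */
void f_volatile(int set_v)
{
    if (set_v)
        vv = 1;
}

/* Single-threaded check: both non-volatile versions must leave v in the
   same state. Returns the number of cases compared. */
int sequential_results_match(void)
{
    static const int initial[] = { 0, 1, 42 };
    static const int flags[] = { 0, 1 };
    int i, j, checked = 0;

    for (i = 0; i < 3; i++) {
        for (j = 0; j < 2; j++) {
            int a, b;

            v = initial[i];
            f_as_written(flags[j]);
            a = v;

            v = initial[i];
            f_as_compiled(flags[j]);
            b = v;

            assert(a == b);
            checked++;
        }
    }
    return checked;
}
```

Within a single thread the two non-volatile versions are
indistinguishable, which is why the paragraph quoted above permits the
transformation. The difference only becomes observable when another
thread writes v between the load and the store in f_as_compiled(0).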