RE: OpenSSL use of DCLP may not be thread-safe on multiple processors

Brian Hurt Fri, 08 Apr 2005 12:55:26 -0700

On Fri, 8 Apr 2005, David Schwartz wrote:

        No. The C standard is not telling the compiler what to do. It is saying
what the system must do when it runs the particular source code. If the
compiler cannot generate code that makes the system as a whole comply with
the standard, then the compiler does not conform.

Yes, but the standard is only defined in terms of what is visible from a single thread, not in terms of what is visible from external vantage points (like other threads).

No C compiler I have ever worked with issues the memory barrier/cache flush instructions needed to enforce cache behavior for volatile references. Specifically, neither Visual Studio nor GCC for the x86 issues those sorts of instructions.

I haven't looked at the code in question, but my general experience has been that if you're relying on some precise memory specification and exacting standards adherence, you're probably screwing up.

Take this code:

int a;
int c;
void foo(int b)
{
        c = b;
        a = c;
}


You will probably get code (x86 gas format) like:
        movl    8(%ebp), %eax   # eax = b
        movl    %eax, c         # c = b
        movl    %eax, a         # a = c (reuses the register, no reload)

That is, the compiler turns it into an assembly language sequence that loads the contents of b into a register and then stores it into both c and a. Now take the following code:

int a;
volatile int c;
void foo(int b)
{
        c = b;
        a = c;
}


This will produce code like:
        movl    8(%ebp), %eax   # eax = b
        movl    %eax, c         # c = b
        movl    c, %eax         # eax = c (reload forced by volatile)
        movl    %eax, a         # a = c

Note the reload of c. Also note the utter lack of MFENCE, CLFLUSH, and similar instructions.

This is actually pretty standard behavior in the face of caches, write combining, speculative execution, and all the other tricks modern CPUs are doing. The compiler issued the write and then issued a separate read to read the value back in; the fact that the CPU short-circuited this isn't the compiler's problem. You can argue until the cows come home whether this is conformant or not, but that's the behavior on the ground.
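For comparison, here is a rough sketch of what asking for the ordering explicitly looks like. It assumes a compiler that supports C11's <stdatomic.h>, which is my assumption and not something the compilers above are required to have. The names foo_atomic and foo_atomic_check are made up for illustration. With sequentially consistent atomics the compiler has to emit whatever the target needs to order the store; on x86 that is typically an xchg, or a mov followed by mfence, which is exactly the part the volatile version lacks.

```c
#include <assert.h>
#include <stdatomic.h>

int a;
atomic_int c;   /* atomic instead of volatile: ordering is now part of the contract */

/* Same shape as foo() above, but the store and the reload are
   sequentially consistent atomic operations rather than plain
   volatile accesses. */
void foo_atomic(int b)
{
        atomic_store_explicit(&c, b, memory_order_seq_cst);
        a = atomic_load_explicit(&c, memory_order_seq_cst);
}

/* Hypothetical single-threaded self-check: after the call both hold b. */
void foo_atomic_check(void)
{
        foo_atomic(7);
        assert(a == 7);
        assert(atomic_load(&c) == 7);
}
```

Single-threaded, this behaves just like the volatile version. The difference only shows up in the generated code and in what other processors are allowed to observe.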

The compiler is not free to ignore anything. If the C standard specifies that the writes must occur in order, then the compiler must make the writes occur in order. Not generate assembly code that makes it look like the writes occur in order, but occur in order. The abstract machine is not about assembly language, it’s about what actually happens.

That's what the compilers do. And if the machine combines the writes, as most modern CPUs almost certainly would, the compilers will not issue extra instructions to overcome this, especially considering that it's non-trivial to determine whether the extra instructions are even needed. I mean, on the x86 you have the CD and NW flags in CR0, you have the MTRRs, plus bit 6 of the IA32_MISC_ENABLE MSR, all statically controlling various types of caching.
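For reference, the DCLP from the subject line, written so that it asks for the ordering explicitly instead of hoping volatile provides it, might look something like the sketch below. This is not the OpenSSL code, which I haven't looked at. It assumes C11's <stdatomic.h>, and config_t, storage, instance, and get_instance are all hypothetical names. A spin lock built on atomic_flag stands in for a real mutex to keep the example self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct { int value; } config_t;        /* hypothetical payload */

static config_t storage;
static _Atomic(config_t *) instance = NULL;
static atomic_flag lock = ATOMIC_FLAG_INIT;    /* stand-in for a mutex */

config_t *get_instance(void)
{
        /* First check, no lock: acquire pairs with the release store below. */
        config_t *p = atomic_load_explicit(&instance, memory_order_acquire);
        if (p == NULL) {
                while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
                        ;       /* spin until the lock is free */
                /* Second check, under the lock. */
                p = atomic_load_explicit(&instance, memory_order_relaxed);
                if (p == NULL) {
                        storage.value = 42;     /* initialize first ... */
                        p = &storage;
                        /* ... then publish; release keeps the init before the pointer. */
                        atomic_store_explicit(&instance, p, memory_order_release);
                }
                atomic_flag_clear_explicit(&lock, memory_order_release);
        }
        return p;
}

/* Hypothetical self-check: repeated calls return the same initialized object. */
void get_instance_check(void)
{
        config_t *p = get_instance();
        assert(p != NULL && p->value == 42);
        assert(get_instance() == p);
}
```

The point is that the acquire load and release store are what make the unlocked first check safe on multiple processors. A plain or volatile pointer gives you the reload, but none of the ordering.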

Brian
