On Sat, Jan 29, 2011 at 2:54 AM, Matthew Dillon <[email protected]> wrote:
>
> :Hi all,
> :
> :i386 version of cpu_sfence(), it is just asm volatile ("" :::"memory")
> :
> :According to the instruction set, sfence should also ensure the
> :"global visibility" (i.e. empty CPU store buffer) of the stores before
> :sfence.
> :So should we do the same as cpu_mfence(), i.e. use a locked memory access?
> :
> :Best Regards,
> :sephe
>
>     cpu_sfence() is basically a NOP, because x86 cpus already order
>     writes for global visibility.  The volatile ..."memory" macro is
The document only indicates that writes are ordered on x86, but
global visibility is not:
http://support.amd.com/us/Processor_TechDocs/24593.pdf

The second point on page 166, I think, suggests that:

processor 0                processor 1
store A <--- 1
    :
    :  later
    :..........> load r1 A

r1 still could be 0, since A is still in the store buffer, while:

processor 0                processor 1
store A <--- 1
sfence
    :
    :  later
    :..........> load r1 A

r1 must be 1.

Well, I could be wrong on this.

>     roughly equivalent to cpu_ccfence() ... prevent the compiler itself
>     from trying to optimize or reorder actual instructions around that
>     point in the code.

Best Regards,
sephe

--
Tomorrow Will Never Die
