Paul Rubin wrote:
> I dunno about x86 hardware signals but these instructions do
> read-modify-write operations. That means there has to be enough
> interlocking to prevent two cpu's from updating the same memory
> location simultaneously, which means the CPU's have to communicate.
> See <http://en.wikipedia.org/wiki/MESI_protocol> (I think this is
> how the current x86's do it):

Ah, but in the case where the LOCK# signal is used, it's known that the data is not in the cache of the CPU performing the lock operation; I believe it is also known that the data is not in the cache of any other CPU. So the CPU performing the LOCK INC sequence just has to perform two memory cycles. No cache coherency protocol runs in that case.

But even when caching is involved, I don't see why there should be more than three memory cycles. The MESI "protocol" really consists only of two pins: HIT# and HITM#; the actual implementation is just in keeping the state for each cache line, and in snooping. The CPUs don't really "communicate". Instead, if one processor tries to fill a cache line, the others snoop the read, and assert either HIT# (when they hold the line unmodified) or HITM# (when they hold it modified). Assertion of these signals is also immediate.

The worst case would be that one processor performs a LOCK INC, and another processor has the modified value in its cache line. That other processor then first needs to write back its cache line before the processor performing the LOCK INC can modify the memory. If the memory is not cached yet in another processor, the MESI protocol does not involve additional penalties.

Regards,
Martin
--
http://mail.python.org/mailman/listinfo/python-list
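
As a rough illustration of the cycle counting argued above, here is a minimal Python sketch. It assumes the simplified model from this message: every other CPU answers a snooped access with HIT# or HITM# depending on its cache-line state, and only a modified line costs an extra write-back. The names State, snoop and locked_inc_cycles are made up for this example; real hardware has more detail than this.

```python
from enum import Enum


class State(Enum):
    """The four MESI cache-line states."""
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def snoop(state):
    """Model one CPU snooping another CPU's locked access to a line.

    Returns (hit, hitm, new_state), following the simplified description
    above: HIT# for an unmodified copy, HITM# for a modified copy.
    The snooping CPU gives up its copy because the other CPU will write.
    """
    if state is State.INVALID:
        return (False, False, State.INVALID)
    hitm = state is State.MODIFIED
    return (not hitm, hitm, State.INVALID)


def locked_inc_cycles(other_cpu_states):
    """Count memory cycles for a LOCK INC under the assumptions above."""
    cycles = 2  # one read and one write for the read-modify-write
    if any(snoop(state)[1] for state in other_cpu_states):
        cycles += 1  # a modified line elsewhere is written back first
    return cycles


if __name__ == "__main__":
    print(locked_inc_cycles([State.INVALID, State.SHARED]))
    print(locked_inc_cycles([State.MODIFIED]))
```

Under this model the count never exceeds three: two cycles when no other CPU holds a modified copy, three when one does.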