What we really need for x86 is more general locking. We've added a LOCKED
flag to the mem request, and the current x86 microcode uses this for locked
operations, where the semantics are that a load with the LOCKED flag (which
should "lock" the cache block) is always followed by a store with the LO
We (Wisconsin) are working on implementing atomics in Ruby. For LL/SC, we
are going to stick with the implementation from the original GEM5 that
closely mimics how LL/SC is/was handled in the M5 cache.
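
For readers unfamiliar with LL/SC, the semantics can be sketched roughly as follows. This is only a toy model and not the Ruby or M5 cache code: the class name ToyLLSCMemory and its methods are hypothetical, and it tracks reservations per exact address, whereas a real cache would most likely track them per cache block.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

// Hypothetical toy model of LL/SC; none of these names come from gem5 or Ruby.
class ToyLLSCMemory {
  public:
    // Load-linked: return the current value and record a reservation
    // on this address for the requesting context.
    uint64_t loadLinked(int ctx, uint64_t addr) {
        reservations[addr].insert(ctx);
        return mem[addr];  // unwritten addresses read as 0
    }

    // Ordinary store: always succeeds and clears every reservation
    // held on this address.
    void store(uint64_t addr, uint64_t value) {
        mem[addr] = value;
        reservations.erase(addr);
    }

    // Store-conditional: succeeds only if the context still holds its
    // reservation. A successful SC clears all reservations on the address,
    // so a competing context's later SC fails.
    bool storeConditional(int ctx, uint64_t addr, uint64_t value) {
        auto it = reservations.find(addr);
        if (it == reservations.end() || it->second.count(ctx) == 0)
            return false;
        mem[addr] = value;
        reservations.erase(it);
        return true;
    }

  private:
    std::unordered_map<uint64_t, uint64_t> mem;
    std::unordered_map<uint64_t, std::unordered_set<int>> reservations;
};
```

In this sketch, if two contexts both load-link the same address, only the first store-conditional succeeds; the second context has to retry its load-linked.
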
For single-instruction atomics (fetch&add, compare&swap, etc.), we are thinking
about an implement