> Date: Fri, 11 Feb 2022 09:29:39 -0800
> From: Jason Thorpe <thor...@me.com>
>
> > On Feb 11, 2022, at 9:03 AM, Jason Thorpe <thor...@me.com> wrote:
> >
> > Right, v7 didn't have any barrier instructions, and v8 only had
> > STBAR (a stores-before-stores barrier), IIRC.  I did not realize
> > we are using RMO for 64-bit userland, tho.
>
> So, given the existing definition of membar_enter() in the man page,
> sparc64's userland membar_sync() really ought to be "membar
> #StoreStore | #StoreLoad", and with Taylor's newly-proposed
> semantics it would need to be "membar #LoadLoad | #LoadStore".

(Just to clarify: membar_sync is, and always has been, intended to be
load/store-before-load/store -- i.e., a full sequential consistency
barrier.  I'm only suggesting changing the membar_enter man page.)

> The current "membar #LoadLoad" is *only* maybe-OK if you're in TSO
> mode, and even then I think it's dicey.

I reviewed the SPARCv9 Reference Manual, Appendices D.4 `Specification
of Relaxed Memory Order', D.5 `Specification of Partial Store Order',
and D.6 `Specification of Total Store Order', pp. 294--296
(https://web.archive.org/web/20201127100244/http://www.sparc.org/standards/SPARCV9.pdf).

- With PSO, every load is a load-acquire -- that is, it's as if every
  load is followed by a load-before-load/store barrier.

- With TSO, every load is a load-acquire _and_ there is a total order
  on all stores -- that is, it's as if every store is followed by a
  store-before-store barrier -- _and_ every atomic r/m/w implies an
  acquire/release barrier (load-before-load/store and
  load/store-before-store, but not store-before-load).

So with TSO, the only ordering for which you ever need any explicit
memory barrier is store-before-load -- just like x86.

And...just like x86, the current sparc64 membar_enter fails to order
store-before-load!

So x86, powerpc, and sparc64 all implement what I suggest membar_enter
_should_ be (and what all current users use it for!), and _fail_ to
implement what membar_enter is documented to be in the man page.

And of all the membar_enter definitions, only the riscv one fails to
guarantee the load-before-load/store ordering I suggest it should be
documented to have.

(sparc64 membar_sync also fails to order store-before-load, and that's
clearly just an implementation bug because membar_sync has always been
unequivocally intended and used as a full load/store-before-load/store
sequential consistency barrier.  Easy fix: change membar_sync to do
`membar #StoreLoad' for kernel running in TSO.
If userland is really running in RMO, then all the membars need to be fixed on sparc64. It looks like sparc (i.e., `sparc32'/`sparcv8') should get the same treatment.)
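
To make the bookkeeping above concrete, here is a rough sketch (not
the actual NetBSD code) that computes which membar bits still have to
be issued explicitly once you subtract the orderings a given memory
model already implies.  The MB_* names, mb_implied(), mb_needed(), and
mb_selfcheck() are hypothetical, made up for this illustration; the
bit values are meant to follow the SPARCv9 mmask field.  The table
only encodes the load/store summary above and does not model the
extra ordering implied by atomic r/m/w under TSO.

```c
#include <assert.h>

/* Ordering bits, intended to match the SPARCv9 membar mmask field. */
enum {
	MB_LOADLOAD   = 0x1,	/* #LoadLoad */
	MB_STORELOAD  = 0x2,	/* #StoreLoad */
	MB_LOADSTORE  = 0x4,	/* #LoadStore */
	MB_STORESTORE = 0x8	/* #StoreStore */
};

/* Hypothetical names for the three barrier semantics discussed. */
#define MB_SYNC \
	(MB_LOADLOAD | MB_STORELOAD | MB_LOADSTORE | MB_STORESTORE)
#define MB_ENTER_DOCUMENTED	(MB_STORELOAD | MB_STORESTORE)
#define MB_ENTER_PROPOSED	(MB_LOADLOAD | MB_LOADSTORE)

enum mb_model { MB_RMO, MB_PSO, MB_TSO };

/* Orderings the model guarantees with no explicit barrier. */
static unsigned
mb_implied(enum mb_model m)
{
	switch (m) {
	case MB_PSO:	/* every load is a load-acquire */
		return MB_LOADLOAD | MB_LOADSTORE;
	case MB_TSO:	/* PSO plus a total order on stores */
		return MB_LOADLOAD | MB_LOADSTORE | MB_STORESTORE;
	case MB_RMO:	/* nothing implied */
	default:
		return 0;
	}
}

/* Bits that must still appear in an explicit membar instruction. */
static unsigned
mb_needed(enum mb_model m, unsigned wanted)
{
	return wanted & ~mb_implied(m);
}

/* Spot-check the claims made in this message. */
static void
mb_selfcheck(void)
{
	/* TSO: a full barrier only needs #StoreLoad. */
	assert(mb_needed(MB_TSO, MB_SYNC) == MB_STORELOAD);
	/* TSO: load-before-load/store needs nothing explicit. */
	assert(mb_needed(MB_TSO, MB_ENTER_PROPOSED) == 0);
	/* RMO: everything has to be spelled out. */
	assert(mb_needed(MB_RMO, MB_SYNC) == MB_SYNC);
}
```

Calling mb_selfcheck() from a test program exercises the three cases;
mb_needed(MB_TSO, MB_ENTER_DOCUMENTED) likewise comes out to just
MB_STORELOAD, which is the bit the current definitions leave out.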