> > > > Given we need both, why not actually defined an API that gives you
> > > > this?
> > >
> > > Because, I do not want to define APIs, I want to reuse an existing one.
> >
> > Except that, say you said later in your email, no API exists for doing
> > atomic accesses, so you need different code anyway.
>
> Did I say that? I take it back :) Real devices simply can not do things
> like atomic increment etc. So all we really need is storing two bytes
> in memory

No. virtio requires an atomic store of the whole 16-bit value.

> > > E.g. what is stw_barrier? atomic read followed by a barrier? read
> > > preceded by a barrier? and what kind of barrier? IMO
> >
> > stw_barrier is an ordered atomic store.
>
> _barrier implies atomic? Okay ... still, ordered with respect to what?
> Preceding stores? Following stores? loads? All of the above?

All of the above.

> We can define such an API, but it's easier to misuse than a familiar and
> documented one IMO.

You're already using nonstandard APIs in the form of stw_phys. Even worse,
you've made incorrect assumptions about the semantics of stw_phys.

> > >With KVM, even these partial barriers add small but measureable overhead
> > >(about 2%).
> >
> > Now are you measuring that overhead? How much additional overhead does a
> > full barrier incur?
>
> With my patch datapath has a single read barrier, and a write barrier,
> but write barrier is a nop. If you make write barrier a read barrier,
> and add another barrier before operations, I expect about 4 times
> the number, so that will be what 8%?

I doubt it. I'd expect once you have one barrier the extras don't make a
whole lot of difference. TBH I'm surprised they're significant at all, except
maybe in very specific micro-benchmarks.

Paul
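
For reference, a minimal sketch of what "an ordered atomic store" of a 16-bit value, ordered against all of the above, could look like in C11. The name stw_ordered is hypothetical (it is neither the stw_barrier nor the stw_phys discussed in this thread), and using <stdatomic.h> is an assumption made for illustration; nothing here says how such a helper would actually be implemented.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing API.
 *
 * Stores all 16 bits in one indivisible access, so an observer never
 * sees one new byte and one old byte. The full fences on either side
 * order the store against this thread's earlier and later loads and
 * stores alike.
 */
static inline void stw_ordered(_Atomic uint16_t *p, uint16_t val)
{
    /* Earlier loads/stores complete before the store becomes visible. */
    atomic_thread_fence(memory_order_seq_cst);

    /* The store itself: atomic, with ordering supplied by the fences. */
    atomic_store_explicit(p, val, memory_order_relaxed);

    /* Later loads/stores are not moved ahead of the store. */
    atomic_thread_fence(memory_order_seq_cst);
}

/* Plain atomic read-back, used only to check the stored value. */
static inline uint16_t ldw_atomic(_Atomic uint16_t *p)
{
    return atomic_load_explicit(p, memory_order_seq_cst);
}
```

Two full fences are the conservative choice. Whether the second one costs anything measurable once the first is in place is exactly the question raised above, and this sketch does not answer it.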