On Fri, Jun 21, 2019 at 12:09 PM Maciej W. Rozycki <ma...@linux-mips.org> wrote:
>
> On Fri, 21 Jun 2019, Arnd Bergmann wrote:
>
> > > The use of 64-bit operations to access option's packet memory, which is
> > > true SRAM, i.e. no side effects, is to improve throughput only and there's
> > > no need for atomicity here nor also any kind of barriers, except at the
> > > conclusion. Splitting 64-bit accesses into 32-bit halves in software
> > > would not be a functional error here.
> >
> > The other property of packet memory and similar things is that you
> > basically want memcpy()-behavior with no byteswaps. This is one
> > of the few cases in which __raw_readq() is actually the right accessor
> > in (mostly) portable code.
>
> Correct, but we're missing an `__raw_readq_relaxed', etc. interface and
> having additional barriers applied on every access would hit performance
> very badly;

How so? __raw_readq() by definition has the least barriers of
all, you can't make it more relaxed than it already is.

> in fact even the barriers `*_relaxed' accessors imply would
> best be removed in this use (which is why defza.c uses `readw_o' vs
> `readw_u', etc. internally), but after all the struggles over the years
> for weakly ordered internal APIs x86 people are so averse to I'm not sure
> if I want to start another one. We can get away with `readq_relaxed' in
> this use though as all the systems this device can be used with are
> little-endian as is TURBOchannel, so no byte-swapping will ever actually
> occur.

I still don't see any downside of using __raw_readq() here, while the
upsides are:

- makes the driver portable to big-endian kernels (even though we don't care)
- avoids all barriers
- fixes the build regression.

Arnd
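
For illustration only, here is a rough userspace sketch of the "memcpy()
behavior with no byteswaps" point. It is not the driver code and not the
kernel's I/O API: model_raw_readq(), model_raw_readl(), model_swab64() and
the copy_*() helpers are hypothetical stand-ins invented for this example,
and packet memory is modeled as an ordinary byte buffer. The swapped variant
models what an accessor that converts from little-endian would do on a
big-endian kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __raw_readq(): native 64-bit load, no swap,
 * no barrier.  Real MMIO accessors take a volatile __iomem pointer. */
static uint64_t model_raw_readq(const unsigned char *addr)
{
	uint64_t v;

	memcpy(&v, addr, sizeof(v));
	return v;
}

/* Hypothetical stand-in for __raw_readl(): native 32-bit load. */
static uint32_t model_raw_readl(const unsigned char *addr)
{
	uint32_t v;

	memcpy(&v, addr, sizeof(v));
	return v;
}

/* Reverse the byte order of a 64-bit value. */
static uint64_t model_swab64(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/* Copy out of modeled packet memory with raw 64-bit loads. */
static void copy_raw64(unsigned char *dst, const unsigned char *src, size_t len)
{
	assert(len % 8 == 0);	/* assumption: whole 64-bit words only */
	for (size_t off = 0; off < len; off += 8) {
		uint64_t word = model_raw_readq(src + off);

		memcpy(dst + off, &word, sizeof(word));
	}
}

/* Same copy with each 64-bit access split into two 32-bit halves. */
static void copy_raw32(unsigned char *dst, const unsigned char *src, size_t len)
{
	assert(len % 8 == 0);
	for (size_t off = 0; off < len; off += 4) {
		uint32_t half = model_raw_readl(src + off);

		memcpy(dst + off, &half, sizeof(half));
	}
}

/* Models a byte-swapping accessor on a big-endian kernel: every 64-bit
 * word is swapped after the load, which scrambles a byte stream. */
static void copy_swapped64(unsigned char *dst, const unsigned char *src,
			   size_t len)
{
	assert(len % 8 == 0);
	for (size_t off = 0; off < len; off += 8) {
		uint64_t word = model_swab64(model_raw_readq(src + off));

		memcpy(dst + off, &word, sizeof(word));
	}
}
```

With a 16-byte source holding the bytes 0 through 15, copy_raw64() and
copy_raw32() both reproduce the source exactly, while copy_swapped64()
reverses the bytes within each 8-byte group. That is why the raw accessor
fits a packet-memory copy, and why splitting it into 32-bit halves does not
change the result.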