On Mon, 25 Feb 2019, Andrew Cagney wrote: > On Mon, 25 Feb 2019 at 11:35, Eduardo Horvath <e...@netbsd.org> wrote: > > > > On Sat, 23 Feb 2019, Jason Thorpe wrote: > > > > > int ufetch_8(const uint8_t *uaddr, uint8_t *valp); > > > int ufetch_16(const uint16_t *uaddr, uint16_t *valp); > > > int ufetch_32(const uint32_t *uaddr, uint32_t *valp); > > > #ifdef _LP64 > > > int ufetch_64(const uint64_t *uaddr, uint64_t *valp); > > > #endif > > > > etc. > > > > I'd prefer to return the fetched value and have an out of band error > > check. > > > > With this API the routine does a userland load, then a store into kernel > > memory. Then the kernel needs to reload that value. On modern CPUs > > memory accesses tend to be sloooooow. > > So the out-of-band error check becomes the slooooow memory write?
Even cached accesses are slower than register accesses. And the compiler is limited in what it can do to reorder the instruction stream while maintaining C language semantics. > I don't see it as a problem as the data would presumably be written to > the write-back cached's stack (besides, if the function is short, LTO > will eliminate it). The compiler can eliminate the function but need to keep the defined C language synchronization points so the memory accesses limit the ability of the compiler to properly optimize the code. Plus, there are a number of applications where you may want to do a series of userland fetches in a row and operate on them. In this case it would be much easier to do the series and only check for success when you're done. And all the existing code expects the value to be returned by the function not in one of the parameters, so existing code would require less rewriting. I'd do something like: uint64_t ufetch_64(const uint64_t *uaddr, int *errp); where *errp needs to be initialized to zero and is set on fault so you can do: int err = 0; long hisflags = ufetch_64(flag1p, &err) | ufetch_64(flag2p, &err); if (err) return EFAULT; do_something(hisflags); Eduardo