On Wed, Sep 30, 2020 at 03:59:15PM -0400, Arvind Sankar wrote:
> On Wed, Sep 30, 2020 at 12:14:03PM -0700, Nick Desaulniers wrote:
> > On Wed, Sep 30, 2020 at 10:13 AM Peter Zijlstra <pet...@infradead.org>
> > wrote:
> > >
> > > On Wed, Sep 30, 2020 at 11:10:36AM -0500, Segher Boessenkool wrote:
> > >
> > > > Since this variable is a local register asm, on entry to the asm the
> > > > compiler guarantees that the value lives in the assigned register (the
> > > > "r8" hardware register in this case). This all works completely fine.
> > > > This is the only guaranteed behaviour for local register asm (well,
> > > > together with analogous behaviour for outputs).
>
> How strict is the guarantee? This is an inline function -- could the
> compiler decide to reorder some other code in between the r8 assignment
> and the asm statement when it gets inlined?
>
> > >
> > > Right, that's what they're trying to achieve. The hypervisor calling
> > > convention needs that variable in %r8 (which is somewhat unfortunate).
> > >
> > > AFAIK this is the first such use in the kernel, but at least the gcc-4.9
> > > (our oldest supported version) claims to support this.
> > >
> > > So now we need to know if clang will actually do this too..
> >
> > Does clang support register local storage? Let's use godbolt.org to find
> > out:
> > https://godbolt.org/z/YM45W5
> > Looks like yes. You can even check different GCC versions via the
> > dropdown in the top right.
> >
> > The -ffixed-* flags are less well supported in Clang; they need to be
> > reimplemented on a per-backend basis. aarch64 is relatively well
> > supported, but other arches not so much IME.
> >
> > Do we need register local storage here?
> >
> > static inline long bar(unsigned long hcall_id)
> > {
> >         long result;
> >         asm volatile("movl %1, %%r8d\n\t"
> >                      "vmcall\n\t"
> >                      : "=a" (result)
> >                      : "ir" (hcall_id)
> >                      : );
> >         return result;
> > }
>
> This seems more robust, though you probably need an r8 clobber in there?
> Is hcall_id actually just 32 bits or can it be >=2^32?
Also, I think you need memory clobbers for all of these in either case, no?
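
To make the clobber suggestions concrete, here is a rough, untested-in-kernel sketch of both variants with the "r8" and "memory" clobbers added. Assumptions on my side: HCALL_INSN is a hypothetical stand-in for "vmcall" that just copies %r8 into %rax so the thing can run in user space, bar_regvar is a made-up name for the local-register-variable variant, and I switched to a 64-bit movq with a plain "r" constraint in case hcall_id can exceed 32 bits. Whether the real hypercall actually needs the memory clobber depends on what the hypervisor reads or writes.

```c
/*
 * Assumption: HCALL_INSN is a hypothetical stand-in for "vmcall". It copies
 * %r8 into %rax so the sketch can run in user space and show what arrived
 * in %r8.
 */
#define HCALL_INSN "movq %%r8, %%rax\n\t"

#if defined(__x86_64__)
/* Variant 1: explicit move into %r8, with "r8" and "memory" clobbers. */
static inline long bar(unsigned long hcall_id)
{
	long result;

	/* movq rather than movl, so an id >= 2^32 is not truncated. */
	__asm__ volatile("movq %1, %%r8\n\t"
			 HCALL_INSN
			 : "=a" (result)
			 : "r" (hcall_id)
			 : "r8", "memory");
	return result;
}

/* Variant 2: local register variable; r8 is an input, so no "r8" clobber. */
static inline long bar_regvar(unsigned long hcall_id)
{
	register unsigned long r8 __asm__("r8") = hcall_id;
	long result;

	__asm__ volatile(HCALL_INSN
			 : "=a" (result)
			 : "r" (r8)
			 : "memory");
	return result;
}
#else
/* Fallback so the sketch still builds elsewhere; no inline asm involved. */
static inline long bar(unsigned long hcall_id) { return (long)hcall_id; }
static inline long bar_regvar(unsigned long hcall_id) { return (long)hcall_id; }
#endif
```

With the stand-in instruction, both functions simply hand back whatever landed in %r8, which makes it easy to check that the value reaches the register in either variant.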