On Fri, Jul 28, 2017 at 05:38:52PM -0700, Matthias Kaehlcke wrote:
> On Thu, Jul 20, 2017 at 03:56:52PM -0500, Josh Poimboeuf wrote:
> 
> > On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote:
> > > FWIW, below is my understanding of what's going on.
> > > 
> > > It seems clang treats a local named register variable almost the same as
> > > an ordinary local variable.
> > > The only difference is that before reading the register variable, clang
> > > puts the variable's value into the specified register.
> > > 
> > > So clang just assigns a stack slot for the variable __sp, where it's
> > > going to keep the variable's value.
> > > But since __sp is uninitialized (we haven't assigned anything to it), the
> > > value of __sp is some garbage from the stack.
> > > The inline asm specifies __sp as an input, so clang assumes it has to
> > > load __sp into 'rsp' because the inline asm is going to use
> > > it. And it just loads garbage from the stack into 'rsp'.
> > > 
> > > In fact, such behavior (I mean storing the value on the stack and loading
> > > it into the register before use) is very useful.
> > > Clang's behavior makes it possible to keep a value assigned to a
> > > call-clobbered register across function calls.
> > > 
> > > Unlike clang, gcc assigns the value to the register right away and doesn't
> > > store the value anywhere else. So if the reg is a
> > > call-clobbered register, you have to be absolutely sure that there is
> > > no subsequent function call that might clobber the register.
> > > 
> > > E.g. see some real examples
> > > https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm:
> > > Fix improper register assignment").
> > > These bugs shouldn't happen with clang.
> > > 
> > > But a global named register variable works slightly differently in clang.
> > > For the global, the value is just the value of the register itself,
> > > whatever it is. A read/write of a global named register is just like
> > > a direct read/write of the register.
> > 
> > Thanks, that clears up a lot of the confusion for me.
> > 
> > Still, unfortunately, I don't think that's going to work for GCC.
> > Changing the '__sp' register variable to a global in the header file
> > causes GCC to make a *bunch* of changes across the kernel, even in
> > functions which don't use inline asm.  It seems to be disabling some
> > optimizations across the board.
> > 
> > I do have another idea, which is to replace all uses of
> > 
> >   asm(" ... call foo ... " : outputs : inputs : clobbers);
> > 
> > with a new ASM_CALL macro:
> > 
> >   ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers);
> > 
> > Then the compiler differences can be abstracted out, with GCC adding
> > "sp" as an output constraint and clang doing nothing (for now).
> 
> The idea sounds interesting; however, I see two issues with ASM_CALL():
> 
> Inline assembly expects the individual elements of outputs, inputs and
> clobbers to be comma-separated, and so does the macro for its
> parameters.

I think this isn't a problem; see the (untested and unfinished) patch
below for an idea of how the arguments can be handled.
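
For reference, the trick is the variadic ARGS() helper defined in the
compiler.h hunk at the end of the patch: it packs a comma-separated
operand list into a single macro argument and expands it back in place.
A rough sketch of the frame-pointer case, with hypothetical operands:

  #define ARGS(args...) args

  #define ASM_CALL(str, outputs, inputs, clobbers...) \
          asm volatile(str : outputs : inputs : "sp", ## clobbers)

  ASM_CALL("call foo",
           ARGS("=a" (ret), "=d" (tmp)),       /* outputs */
           ARGS("D" (arg1), "S" (arg2)),       /* inputs */
           "memory", "cc");                    /* clobbers */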

> The assembler template can refer to the positions of output and input
> operands; adding "sp" as an output changes the positions of the inputs
> wrt the clang version.

True, though I think we could convert all ASM_CALL users to use named
operands instead of the (bug-prone) numbered ones.  It would be an
improvement regardless.
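
To illustrate with a hypothetical sketch (not part of the patch below):
with numbered operands the template reference shifts whenever an operand
is added, while named operands stay put:

  /* positional: "%P3" silently points elsewhere if an output is added */
  asm volatile("call __get_user_%P3"
               : "=a" (ret), "=r" (val)
               : "0" (ptr), "i" (sizeof(*(ptr))));

  /* named: the reference survives operand reordering */
  asm volatile("call __get_user_%P[size]"
               : "=a" (ret), "=r" (val)
               : "0" (ptr), [size] "i" (sizeof(*(ptr))));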

> Not sure how to move forward from here. Not even using an ugly #ifdef
> seems to be a halfway reasonable solution, since get_user() isn't the
> only place using this construct and #ifdefs would result in highly
> redundant macros in multiple places.

I still think ASM_CALL might work.  The below patch is what I came up
with last week before I got pulled into some other work.  I'll be out on
vacation next week but I hope to finish the patch after that.
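
For context, the pattern being replaced throughout is the one currently
documented in objtool's stack-validation.txt, e.g. for
__preempt_schedule:

  register void *__sp asm(_ASM_SP);
  asm volatile ("call ___preempt_schedule" : "+r"(__sp));

which with this patch becomes:

  ASM_CALL("call ___preempt_schedule",,);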


diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 1b020381ab38..5da4bcbeebab 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -216,14 +216,14 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * Otherwise, old function is used.
  */
 #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, feature2,   \
-                          output, input...)                                  \
+                          output, input, clobbers...)                        \
 {                                                                            \
-       register void *__sp asm(_ASM_SP);                                     \
-       asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
-               "call %P[new2]", feature2)                                    \
-               : output, "+r" (__sp)                                         \
-               : [old] "i" (oldfunc), [new1] "i" (newfunc1),                 \
-                 [new2] "i" (newfunc2), ## input);                           \
+       ASM_CALL(ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,     \
+                              "call %P[new2]", feature2),                    \
+                ARGS(output),                                                \
+                ARGS([old] "i" (oldfunc), [new1] "i" (newfunc1),             \
+                     [new2] "i" (newfunc2), ARGS(input)),                    \
+                ## clobbers);                                                \
 }
 
 /*
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index b4a0d43248cf..21adcc8e739f 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -45,8 +45,8 @@ static inline void clear_page(void *page)
                           clear_page_rep, X86_FEATURE_REP_GOOD,
                           clear_page_erms, X86_FEATURE_ERMS,
                           "=D" (page),
-                          "0" (page)
-                          : "memory", "rax", "rcx");
+                          "0" (page),
+                          ARGS("memory", "rax", "rcx"));
 }
 
 void copy_page(void *to, void *from);
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index cb976bab6299..2a585b799a3e 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -471,8 +471,7 @@ int paravirt_disable_iospace(void);
  */
 #ifdef CONFIG_X86_32
 #define PVOP_VCALL_ARGS                                                        \
-       unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;      \
-       register void *__sp asm("esp")
+       unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;
 #define PVOP_CALL_ARGS                 PVOP_VCALL_ARGS
 
 #define PVOP_CALL_ARG1(x)              "a" ((unsigned long)(x))
@@ -492,8 +491,7 @@ int paravirt_disable_iospace(void);
 /* [re]ax isn't an arg, but the return val */
 #define PVOP_VCALL_ARGS                                                \
        unsigned long __edi = __edi, __esi = __esi,             \
-               __edx = __edx, __ecx = __ecx, __eax = __eax;    \
-       register void *__sp asm("rsp")
+               __edx = __edx, __ecx = __ecx, __eax = __eax;
 #define PVOP_CALL_ARGS         PVOP_VCALL_ARGS
 
 #define PVOP_CALL_ARG1(x)              "D" ((unsigned long)(x))
@@ -541,24 +539,24 @@ int paravirt_disable_iospace(void);
                /* This is 32-bit specific, but is okay in 64-bit */    \
                /* since this condition will never hold */              \
                if (sizeof(rettype) > sizeof(unsigned long)) {          \
-                       asm volatile(pre                                \
-                                    paravirt_alt(PARAVIRT_CALL)        \
-                                    post                               \
-                                    : call_clbr, "+r" (__sp)           \
-                                    : paravirt_type(op),               \
-                                      paravirt_clobber(clbr),          \
-                                      ##__VA_ARGS__                    \
-                                    : "memory", "cc" extra_clbr);      \
+                       ASM_CALL(pre                                    \
+                                paravirt_alt(PARAVIRT_CALL)            \
+                                post,                                  \
+                                ARGS(call_clbr),                       \
+                                ARGS(paravirt_type(op),                \
+                                     paravirt_clobber(clbr),           \
+                                     ##__VA_ARGS__),                   \
+                                ARGS("memory", "cc" extra_clbr));      \
                        __ret = (rettype)((((u64)__edx) << 32) | __eax); \
                } else {                                                \
-                       asm volatile(pre                                \
-                                    paravirt_alt(PARAVIRT_CALL)        \
-                                    post                               \
-                                    : call_clbr, "+r" (__sp)           \
-                                    : paravirt_type(op),               \
-                                      paravirt_clobber(clbr),          \
-                                      ##__VA_ARGS__                    \
-                                    : "memory", "cc" extra_clbr);      \
+                       ASM_CALL(pre                                    \
+                                paravirt_alt(PARAVIRT_CALL)            \
+                                post,                                  \
+                                ARGS(call_clbr),                       \
+                                ARGS(paravirt_type(op),                \
+                                     paravirt_clobber(clbr),           \
+                                     ##__VA_ARGS__),                   \
+                                ARGS("memory", "cc" extra_clbr));      \
                        __ret = (rettype)(__eax & PVOP_RETMASK(rettype));       
\
                }                                                       \
                __ret;                                                  \
@@ -578,14 +576,14 @@ int paravirt_disable_iospace(void);
        ({                                                              \
                PVOP_VCALL_ARGS;                                        \
                PVOP_TEST_NULL(op);                                     \
-               asm volatile(pre                                        \
-                            paravirt_alt(PARAVIRT_CALL)                \
-                            post                                       \
-                            : call_clbr, "+r" (__sp)                   \
-                            : paravirt_type(op),                       \
-                              paravirt_clobber(clbr),                  \
-                              ##__VA_ARGS__                            \
-                            : "memory", "cc" extra_clbr);              \
+               ASM_CALL(pre                                            \
+                        paravirt_alt(PARAVIRT_CALL)                    \
+                        post,                                          \
+                        ARGS(call_clbr),                               \
+                        ARGS(paravirt_type(op),                        \
+                             paravirt_clobber(clbr),                   \
+                             ##__VA_ARGS__),                           \
+                        ARGS("memory", "cc" extra_clbr));              \
        })
 
 #define __PVOP_VCALL(op, pre, post, ...)                               \
diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index ec1f3c651150..19a034cbde37 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -102,16 +102,14 @@ static __always_inline bool should_resched(int preempt_offset)
   extern asmlinkage void ___preempt_schedule(void);
 # define __preempt_schedule()                                  \
 ({                                                             \
-       register void *__sp asm(_ASM_SP);                       \
-       asm volatile ("call ___preempt_schedule" : "+r"(__sp)); \
+       ASM_CALL("call ___preempt_schedule",,);                 \
 })
 
   extern asmlinkage void preempt_schedule(void);
   extern asmlinkage void ___preempt_schedule_notrace(void);
 # define __preempt_schedule_notrace()                                  \
 ({                                                                     \
-       register void *__sp asm(_ASM_SP);                               \
-       asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp)); \
+       ASM_CALL("call ___preempt_schedule_notrace",,);                 \
 })
   extern asmlinkage void preempt_schedule_notrace(void);
 #endif
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 550bc4ab0df4..2bbfb777c8bb 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -677,20 +677,19 @@ static inline void sync_core(void)
         * Like all of Linux's memory ordering operations, this is a
         * compiler barrier as well.
         */
-       register void *__sp asm(_ASM_SP);
 
 #ifdef CONFIG_X86_32
-       asm volatile (
+       ASM_CALL(
                "pushfl\n\t"
                "pushl %%cs\n\t"
                "pushl $1f\n\t"
                "iret\n\t"
-               "1:"
-               : "+r" (__sp) : : "memory");
+               "1:",
+               , , "memory");
 #else
        unsigned int tmp;
 
-       asm volatile (
+       ASM_CALL(
                UNWIND_HINT_SAVE
                "mov %%ss, %0\n\t"
                "pushq %q0\n\t"
@@ -702,8 +701,8 @@ static inline void sync_core(void)
                "pushq $1f\n\t"
                "iretq\n\t"
                UNWIND_HINT_RESTORE
-               "1:"
-               : "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
+               "1:",
+               "=&r" (tmp), , ARGS("cc", "memory"));
 #endif
 }
 
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index a34e0d4b957d..9d20f4fb5d7c 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -99,25 +99,24 @@ static inline bool __down_read_trylock(struct rw_semaphore *sem)
 /*
  * lock for writing
  */
-#define ____down_write(sem, slow_path)                 \
-({                                                     \
-       long tmp;                                       \
-       struct rw_semaphore* ret;                       \
-       register void *__sp asm(_ASM_SP);               \
-                                                       \
-       asm volatile("# beginning down_write\n\t"       \
-                    LOCK_PREFIX "  xadd      %1,(%4)\n\t"      \
-                    /* adds 0xffff0001, returns the old value */ \
-                    "  test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \
-                    /* was the active mask 0 before? */\
-                    "  jz        1f\n"                 \
-                    "  call " slow_path "\n"           \
-                    "1:\n"                             \
-                    "# ending down_write"              \
-                    : "+m" (sem->count), "=d" (tmp), "=a" (ret), "+r" (__sp) \
-                    : "a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS) \
-                    : "memory", "cc");                 \
-       ret;                                            \
+#define ____down_write(sem, slow_path)                                 \
+({                                                                     \
+       long tmp;                                                       \
+       struct rw_semaphore* ret;                                       \
+                                                                       \
+       ASM_CALL("# beginning down_write\n\t"                           \
+                LOCK_PREFIX "  xadd      %1,(%3)\n\t"                  \
+                /* adds 0xffff0001, returns the old value */           \
+                "  test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \
+                /* was the active mask 0 before? */                    \
+                "  jz        1f\n"                                     \
+                "  call " slow_path "\n"                               \
+                "1:\n"                                                 \
+                "# ending down_write",                                 \
+                ARGS("+m" (sem->count), "=d" (tmp), "=a" (ret)),       \
+                ARGS("a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS)),        \
+                ARGS("memory", "cc"));                                 \
+       ret;                                                            \
 })
 
 static inline void __down_write(struct rw_semaphore *sem)
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 184eb9894dae..c7be076ee703 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -166,12 +166,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 ({                                                                     \
        int __ret_gu;                                                   \
        register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);            \
-       register void *__sp asm(_ASM_SP);                               \
        __chk_user_ptr(ptr);                                            \
        might_fault();                                                  \
-       asm volatile("call __get_user_%P4"                              \
-                    : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp)    \
-                    : "0" (ptr), "i" (sizeof(*(ptr))));                \
+       ASM_CALL("call __get_user_%P3",                                 \
+                ARGS("=a" (__ret_gu), "=r" (__val_gu)),                \
+                ARGS("0" (ptr), "i" (sizeof(*(ptr)))));                \
        (x) = (__force __typeof__(*(ptr))) __val_gu;                    \
        __builtin_expect(__ret_gu, 0);                                  \
 })
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index b16f6a1d8b26..8b63e2cb1eaf 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -38,10 +38,9 @@ copy_user_generic(void *to, const void *from, unsigned len)
                         X86_FEATURE_REP_GOOD,
                         copy_user_enhanced_fast_string,
                         X86_FEATURE_ERMS,
-                        ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
-                                    "=d" (len)),
-                        "1" (to), "2" (from), "3" (len)
-                        : "memory", "rcx", "r8", "r9", "r10", "r11");
+                        ARGS("=a" (ret), "=D" (to), "=S" (from), "=d" (len)),
+                        ARGS("1" (to), "2" (from), "3" (len)),
+                        ARGS("memory", "rcx", "r8", "r9", "r10", "r11"));
        return ret;
 }
 
diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 11071fcd630e..2feddeeec1d1 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -114,9 +114,8 @@ extern struct { char _entry[32]; } hypercall_page[];
        register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \
        register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \
        register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5; \
-       register void *__sp asm(_ASM_SP);
 
-#define __HYPERCALL_0PARAM     "=r" (__res), "+r" (__sp)
+#define __HYPERCALL_0PARAM     "=r" (__res)
 #define __HYPERCALL_1PARAM     __HYPERCALL_0PARAM, "+r" (__arg1)
 #define __HYPERCALL_2PARAM     __HYPERCALL_1PARAM, "+r" (__arg2)
 #define __HYPERCALL_3PARAM     __HYPERCALL_2PARAM, "+r" (__arg3)
@@ -146,10 +145,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 ({                                                                     \
        __HYPERCALL_DECLS;                                              \
        __HYPERCALL_0ARG();                                             \
-       asm volatile (__HYPERCALL                                       \
-                     : __HYPERCALL_0PARAM                              \
-                     : __HYPERCALL_ENTRY(name)                         \
-                     : __HYPERCALL_CLOBBER0);                          \
+       ASM_CALL(__HYPERCALL,                                           \
+                __HYPERCALL_0PARAM,                                    \
+                __HYPERCALL_ENTRY(name),                               \
+                __HYPERCALL_CLOBBER0);                                 \
        (type)__res;                                                    \
 })
 
@@ -157,10 +156,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 ({                                                                     \
        __HYPERCALL_DECLS;                                              \
        __HYPERCALL_1ARG(a1);                                           \
-       asm volatile (__HYPERCALL                                       \
-                     : __HYPERCALL_1PARAM                              \
-                     : __HYPERCALL_ENTRY(name)                         \
-                     : __HYPERCALL_CLOBBER1);                          \
+       ASM_CALL(__HYPERCALL,                                           \
+                __HYPERCALL_1PARAM,                                    \
+                __HYPERCALL_ENTRY(name),                               \
+                __HYPERCALL_CLOBBER1);                                 \
        (type)__res;                                                    \
 })
 
@@ -168,10 +167,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 ({                                                                     \
        __HYPERCALL_DECLS;                                              \
        __HYPERCALL_2ARG(a1, a2);                                       \
-       asm volatile (__HYPERCALL                                       \
-                     : __HYPERCALL_2PARAM                              \
-                     : __HYPERCALL_ENTRY(name)                         \
-                     : __HYPERCALL_CLOBBER2);                          \
+       ASM_CALL(__HYPERCALL,                                           \
+                __HYPERCALL_2PARAM,                                    \
+                __HYPERCALL_ENTRY(name),                               \
+                __HYPERCALL_CLOBBER2);                                 \
        (type)__res;                                                    \
 })
 
@@ -179,10 +178,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 ({                                                                     \
        __HYPERCALL_DECLS;                                              \
        __HYPERCALL_3ARG(a1, a2, a3);                                   \
-       asm volatile (__HYPERCALL                                       \
-                     : __HYPERCALL_3PARAM                              \
-                     : __HYPERCALL_ENTRY(name)                         \
-                     : __HYPERCALL_CLOBBER3);                          \
+       ASM_CALL(__HYPERCALL,                                           \
+                __HYPERCALL_3PARAM,                                    \
+                __HYPERCALL_ENTRY(name),                               \
+                __HYPERCALL_CLOBBER3);                                 \
        (type)__res;                                                    \
 })
 
@@ -190,10 +189,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 ({                                                                     \
        __HYPERCALL_DECLS;                                              \
        __HYPERCALL_4ARG(a1, a2, a3, a4);                               \
-       asm volatile (__HYPERCALL                                       \
-                     : __HYPERCALL_4PARAM                              \
-                     : __HYPERCALL_ENTRY(name)                         \
-                     : __HYPERCALL_CLOBBER4);                          \
+       ASM_CALL(__HYPERCALL,                                           \
+                __HYPERCALL_4PARAM,                                    \
+                __HYPERCALL_ENTRY(name),                               \
+                __HYPERCALL_CLOBBER4);                                 \
        (type)__res;                                                    \
 })
 
@@ -201,10 +200,10 @@ extern struct { char _entry[32]; } hypercall_page[];
 ({                                                                     \
        __HYPERCALL_DECLS;                                              \
        __HYPERCALL_5ARG(a1, a2, a3, a4, a5);                           \
-       asm volatile (__HYPERCALL                                       \
-                     : __HYPERCALL_5PARAM                              \
-                     : __HYPERCALL_ENTRY(name)                         \
-                     : __HYPERCALL_CLOBBER5);                          \
+       ASM_CALL(__HYPERCALL,                                           \
+                __HYPERCALL_5PARAM,                                    \
+                __HYPERCALL_ENTRY(name),                               \
+                __HYPERCALL_CLOBBER5);                                 \
        (type)__res;                                                    \
 })
 
@@ -218,10 +217,10 @@ privcmd_call(unsigned call,
        __HYPERCALL_5ARG(a1, a2, a3, a4, a5);
 
        stac();
-       asm volatile("call *%[call]"
-                    : __HYPERCALL_5PARAM
-                    : [call] "a" (&hypercall_page[call])
-                    : __HYPERCALL_CLOBBER5);
+       ASM_CALL("call *%[call]",
+                __HYPERCALL_5PARAM,
+               [call] "a" (&hypercall_page[call]),
+               __HYPERCALL_CLOBBER5);
        clac();
 
        return (long)__res;
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index fb0055953fbc..4d0d326029a7 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5284,16 +5284,15 @@ static void fetch_possible_mmx_operand(struct x86_emulate_ctxt *ctxt,
 
 static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *))
 {
-       register void *__sp asm(_ASM_SP);
        ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
 
        if (!(ctxt->d & ByteOp))
                fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
 
-       asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n"
-           : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
-             [fastop]"+S"(fop), "+r"(__sp)
-           : "c"(ctxt->src2.val));
+       ASM_CALL("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n",
+                ARGS("+a"(ctxt->dst.val), "+d"(ctxt->src.val),
+                     [flags]"+D"(flags), [fastop]"+S"(fop)),
+                ARGS("c"(ctxt->src2.val)));
 
        ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
        if (!fop) /* exception is returned in fop variable */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ffd469ecad57..d42806ad3c9c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8671,7 +8671,6 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
 static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 {
        u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
-       register void *__sp asm(_ASM_SP);
 
        if ((exit_intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK))
                        == (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR)) {
@@ -8686,7 +8685,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
                vector =  exit_intr_info & INTR_INFO_VECTOR_MASK;
                desc = (gate_desc *)vmx->host_idt_base + vector;
                entry = gate_offset(*desc);
-               asm volatile(
+               ASM_CALL(
 #ifdef CONFIG_X86_64
                        "mov %%" _ASM_SP ", %[sp]\n\t"
                        "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
@@ -8695,16 +8694,14 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 #endif
                        "pushf\n\t"
                        __ASM_SIZE(push) " $%c[cs]\n\t"
-                       "call *%[entry]\n\t"
-                       :
+                       "call *%[entry]\n\t",
 #ifdef CONFIG_X86_64
-                       [sp]"=&r"(tmp),
+                       [sp]"=&r"(tmp)
 #endif
-                       "+r"(__sp)
-                       :
-                       [entry]"r"(entry),
-                       [ss]"i"(__KERNEL_DS),
-                       [cs]"i"(__KERNEL_CS)
+                       ,
+                       ARGS([entry]"r"(entry),
+                            [ss]"i"(__KERNEL_DS),
+                            [cs]"i"(__KERNEL_CS))
                        );
        }
 }
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 2a1fa10c6a98..ca642ac3a390 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -802,7 +802,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
        if (is_vmalloc_addr((void *)address) &&
            (((unsigned long)tsk->stack - 1 - address < PAGE_SIZE) ||
             address - ((unsigned long)tsk->stack + THREAD_SIZE) < PAGE_SIZE)) {
-               register void *__sp asm("rsp");
                unsigned long stack = this_cpu_read(orig_ist.ist[DOUBLEFAULT_STACK]) - sizeof(void *);
                /*
                 * We're likely to be running with very little stack space
@@ -814,13 +813,12 @@ no_context(struct pt_regs *regs, unsigned long error_code,
                 * and then double-fault, though, because we're likely to
                 * break the console driver and lose most of the stack dump.
                 */
-               asm volatile ("movq %[stack], %%rsp\n\t"
-                             "call handle_stack_overflow\n\t"
-                             "1: jmp 1b"
-                             : "+r" (__sp)
-                             : "D" ("kernel stack overflow (page fault)"),
-                               "S" (regs), "d" (address),
-                               [stack] "rm" (stack));
+               ASM_CALL("movq %[stack], %%rsp\n\t"
+                        "call handle_stack_overflow\n\t"
+                        "1: jmp 1b",
+                        ,
+                        ARGS("D" ("kernel stack overflow (page fault)"),
+                             "S" (regs), "d" (address), [stack] "rm" (stack)));
                unreachable();
        }
 #endif
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 3f8c88e29a46..3a57f32a0bfd 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -589,4 +589,18 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
        (_________p1); \
 })
 
+#define ARGS(args...) args
+
+#ifdef CONFIG_FRAME_POINTER
+
+#define ASM_CALL(str, outputs, inputs, clobbers...) \
+       asm volatile(str : outputs : inputs : "sp", ## clobbers)
+
+#else
+
+#define ASM_CALL(str, outputs, inputs, clobbers...) \
+       asm volatile(str : outputs : inputs : clobbers)
+
+#endif
+
 #endif /* __LINUX_COMPILER_H */
diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt
index 6a1af43862df..766a00ebf80c 100644
--- a/tools/objtool/Documentation/stack-validation.txt
+++ b/tools/objtool/Documentation/stack-validation.txt
@@ -193,13 +193,8 @@ they mean, and suggestions for how to fix them.
 
    If it's a GCC-compiled .c file, the error may be because the function
    uses an inline asm() statement which has a "call" instruction.  An
-   asm() statement with a call instruction must declare the use of the
-   stack pointer in its output operand.  For example, on x86_64:
-
-     register void *__sp asm("rsp");
-     asm volatile("call func" : "+r" (__sp));
-
-   Otherwise the stack frame may not get created before the call.
+   asm() statement with a call instruction must use the ASM_CALL macro,
+   which forces the frame pointer to be saved before the call.
 
 
 2. file.o: warning: objtool: .text+0x53: unreachable instruction
@@ -221,7 +216,7 @@ they mean, and suggestions for how to fix them.
    change ENDPROC to END.
 
 
-4. file.o: warning: objtool: func(): can't find starting instruction
+3. file.o: warning: objtool: func(): can't find starting instruction
    or
    file.o: warning: objtool: func()+0x11dd: can't decode instruction
 
@@ -230,7 +225,7 @@ they mean, and suggestions for how to fix them.
    section like .data or .rodata.
 
 
-5. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function
+4. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function
 
    This is a kernel entry/exit instruction like sysenter or iret.  Such
    instructions aren't allowed in a callable function, and are most
@@ -239,7 +234,7 @@ they mean, and suggestions for how to fix them.
    annotated with the unwind hint macros in asm/unwind_hints.h.
 
 
-6. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame
+5. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame
 
    This is a dynamic jump or a jump to an undefined symbol.  Objtool
    assumed it's a sibling call and detected that the frame pointer
@@ -253,7 +248,7 @@ they mean, and suggestions for how to fix them.
    the unwind hint macros in asm/unwind_hints.h.
 
 
-7. file: warning: objtool: func()+0x5c: stack state mismatch
+6. file: warning: objtool: func()+0x5c: stack state mismatch
 
    The instruction's frame pointer state is inconsistent, depending on
    which execution path was taken to reach the instruction.
@@ -270,7 +265,7 @@ they mean, and suggestions for how to fix them.
    asm/unwind_hints.h.
 
 
-8. file.o: warning: objtool: funcA() falls through to next function funcB()
+7. file.o: warning: objtool: funcA() falls through to next function funcB()
 
    This means that funcA() doesn't end with a return instruction or an
    unconditional jump, and that objtool has determined that the function
