Mark notes that gcc optimization passes have the potential to elide
necessary invocations of this instruction sequence, so mark the asm
volatile.
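For context, a rough sketch of how the mask produced by this asm is meant
to be consumed (a simplified illustration only, not the actual
array_index_nospec() macro in include/linux/nospec.h; clamp_index_nospec()
is a made-up name for this example). If the compiler elides the asm or
reuses a stale result, the index is never clamped on the speculative path:

	/*
	 * Simplified illustration only; the real consumer is the
	 * array_index_nospec() macro in include/linux/nospec.h.
	 *
	 * array_index_mask_nospec() evaluates to ~0UL when index < size
	 * and to 0UL otherwise, without a branch the CPU could
	 * mispredict.
	 */
	static inline unsigned long clamp_index_nospec(unsigned long index,
						       unsigned long size)
	{
		unsigned long mask = array_index_mask_nospec(index, size);

		/* An out-of-bounds index is forced to 0 under speculation. */
		return index & mask;
	}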
---

From Mark:

The volatile will inhibit *some* cases where the compiler could lift the
array_index_nospec() call out of a branch, e.g. where there are multiple
invocations of array_index_nospec() with the same arguments:

	if (idx < foo) {
		idx1 = array_idx_nospec(idx, foo);
		do_something(idx1);
	}

	< some other code >

	if (idx < foo) {
		idx2 = array_idx_nospec(idx, foo);
		do_something_else(idx2);
	}

... since the compiler can determine that the two invocations yield the
same result, and reuse the first result (likely the same register as idx
was in originally) for the second branch, effectively re-writing the
above as:

	if (idx < foo) {
		idx = array_idx_nospec(idx, foo);
		do_something(idx);
	}

	< some other code >

	if (idx < foo) {
		do_something_else(idx);
	}

... if we don't take the first branch, then speculatively take the
second, we lose the nospec protection.

There's more info on volatile asm in the GCC docs:

  https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile

Cc: <sta...@vger.kernel.org>
Fixes: babdde2698d4 ("x86: Implement array_index_mask_nospec")
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Ingo Molnar <mi...@kernel.org>
Reported-by: Mark Rutland <mark.rutl...@arm.com>
Acked-by: Mark Rutland <mark.rutl...@arm.com>
Signed-off-by: Dan Williams <dan.j.willi...@intel.com>
---
Changes in v2:
* drop the barrier() call (Mark)
* update the example (Mark)

 arch/x86/include/asm/barrier.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 042b5e892ed1..14de0432d288 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -38,7 +38,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 {
 	unsigned long mask;
 
-	asm ("cmp %1,%2; sbb %0,%0;"
+	asm volatile ("cmp %1,%2; sbb %0,%0;"
 			:"=r" (mask)
 			:"g"(size),"r" (index)
 			:"cc");