On Wed, Mar 24, 2021 at 10:14:52AM +0000, guo...@kernel.org wrote:

> +static inline void arch_spin_lock(arch_spinlock_t *lock)
> +{
> +	arch_spinlock_t lockval;
> +	u32 tmp;
> +
> +	asm volatile (
> +		"1:	lr.w	%0, %2		\n"
> +		"	mv	%1, %0		\n"
> +		"	addw	%0, %0, %3	\n"
> +		"	sc.w	%0, %0, %2	\n"
> +		"	bnez	%0, 1b		\n"
> +		: "=&r" (tmp), "=&r" (lockval), "+A" (lock->lock)
> +		: "r" (1 << TICKET_NEXT)
> +		: "memory");
>
> +	while (lockval.tickets.next != lockval.tickets.owner) {
> +		/*
> +		 * FIXME - we need wfi/wfe here to prevent:
> +		 *  - cache line bouncing
> +		 *  - saving cpu pipeline in multi-harts-per-core
> +		 *    processor
> +		 */
> +		lockval.tickets.owner = READ_ONCE(lock->tickets.owner);
> +	}
>
> +	__atomic_acquire_fence();
> }
> +static inline void arch_spin_unlock(arch_spinlock_t *lock)
> {
> +	smp_store_release(&lock->tickets.owner, lock->tickets.owner + 1);
> +	/* FIXME - we need ipi/sev here to notify above */
> }

Urgh, are you saying your WFE requires an explicit SEV like on ARM?

The ARM64 model is far superior IMO, and then you can use
smp_cond_load_acquire() in arch_spin_lock() and call it a day.
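
Concretely, something like the below (untested sketch, reusing the
arch_spinlock_t layout and TICKET_NEXT from your patch): keep the LR/SC
ticket-take, collapse the open-coded wait loop plus
__atomic_acquire_fence() into smp_cond_load_acquire(), and drop the
unlock FIXME because waiters wake themselves:

static inline void arch_spin_lock(arch_spinlock_t *lock)
{
	arch_spinlock_t lockval;
	u32 tmp;

	/* Atomically bump tickets.next, snapshotting the old value. */
	asm volatile (
		"1:	lr.w	%0, %2		\n"
		"	mv	%1, %0		\n"
		"	addw	%0, %0, %3	\n"
		"	sc.w	%0, %0, %2	\n"
		"	bnez	%0, 1b		\n"
		: "=&r" (tmp), "=&r" (lockval), "+A" (lock->lock)
		: "r" (1 << TICKET_NEXT)
		: "memory");

	/*
	 * Wait for owner to reach our ticket. The macro returns
	 * immediately on the uncontended fast path, and its implied
	 * acquire barrier replaces __atomic_acquire_fence().
	 */
	smp_cond_load_acquire(&lock->tickets.owner,
			      VAL == lockval.tickets.next);
}

static inline void arch_spin_unlock(arch_spinlock_t *lock)
{
	/* RELEASE store; with self-waking waiters no ipi/sev needed. */
	smp_store_release(&lock->tickets.owner, lock->tickets.owner + 1);
}

On arm64 smp_cond_load_acquire() compiles down to a WFE-based wait that
any store to the line wakes up; on hardware without that event-on-store
behaviour the generic fallback is a cpu_relax() spin, so you lose
nothing relative to the loop above.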