On 2018-05-17 19:19:43 [+0100], Dave Martin wrote: > On Thu, May 17, 2018 at 02:40:06PM +0200, Sebastian Andrzej Siewior wrote: > > In v4.16-RT I noticed a number of warnings from task_fpsimd_load(). The > > code disables BH and expects that it is not preemptible. On -RT the > > task remains preemptible but remains the same CPU. This may corrupt the > > Also, watch out for [1] which adds more of this stuff for KVM. This > not merged yet, but likely to land in v4.18.
yay. I will try to keep this in mind while moving forward. > Anyway: > > What we're really trying to achieve with the local_bh_disable/enable > stuff is exclusive access to the CPU FPSIMD registers and associated > metadata that tracks who they belong to. I assumed this. > > The preempt_disable() + local_bh_enable() combo in kernel_neon_begin() > > is not working on -RT. We don't use NEON in kernel mode on RT right now > > but this still should be addressed. > > I think we effectively have two levels of locking here. > > At the outer level, we want exclusive access to the FPSIMD registers. > This is what is needed between kernel_neon_begin() and > kernel_neon_end(), and maps onto the preempt_disable()/_enable() done > by these functions. > > In context switch critical code, that's insufficient, and we also > need exclusive access to the metadata that tracks which task or context > owns the FPSIMD registers. This is what the local_bh_disable()/ > _enable() achieves. > > > So does it make sense to have two locks (I'm assuming local locks are > implicitly percpu ?) > > static inline void local_fpsimd_context_lock(void) > { > local_bh_disable(); > local_lock(fpsimd_lock); > local_lock(fpsimd_context_lock); > } > > static inline void local_fpsimd_context_unlock(void) > { > local_unlock(fpsimd_context_lock); > local_unlock(fpsimd_lock); > local_bh_enable(); > } > > > kernel_neon_begin() could then do > > local_fpsimd_context_lock(); > > /* ... */ > > preempt_disable(); > local_unlock(fpsimd_context_lock); > > ... with the following in kernel_neon_end(): > > local_unlock(fpsimd_lock); > preempt_enable(); > > > If kernel-mode NEON was considered harmful to RT due to the context > switch overheads, then the above might be overkill. SVE will be worse > in that regard, and also needs thinking about at some point -- I've not > looked at if from the RT angle at all. > > In either case, I think abstracting the lock/unlock sequences out to > make the purpose clearer may be a good idea, even if we just have a > single local lock to manage. So the two looks you suggested make sense. However I would need to replace this preempt_disable() with one such lock. A local_lock() on -RT is a per-CPU spin_lock. RT's local_lock gets currently transformed into preempt_disable() on !RT configs. > There is one place where I mess with the FPSIMD context no lock held > because of a need to copy_from_user() straight into the context backing > store (we can't copy_from_user() with preemption disabled...) > I'm not sure whether RT will have any impact on this, but it probably > needs thinking about. if no locks are held and the task can be preempted then it also might happen on PREEMPT config. But copy_from_user() doesn't sounds like something that will go straight to HW. > One more comment below... > > > > > Signed-off-by: Sebastian Andrzej Siewior <bige...@linutronix.de> > > --- > > arch/arm64/kernel/fpsimd.c | 20 ++++++++++++++++++-- > > 1 file changed, 18 insertions(+), 2 deletions(-) > > > > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c > > index e7226c4c7493..3a5cd1908874 100644 > > --- a/arch/arm64/kernel/fpsimd.c > > +++ b/arch/arm64/kernel/fpsimd.c > > @@ -38,6 +38,7 @@ > > #include <linux/signal.h> > > #include <linux/slab.h> > > #include <linux/sysctl.h> > > +#include <linux/locallock.h> > > > > #include <asm/fpsimd.h> > > #include <asm/cputype.h> > > @@ -235,7 +236,7 @@ static void sve_user_enable(void) > > * whether TIF_SVE is clear or set, since these are not vector length > > * dependent. > > */ > > - > > +static DEFINE_LOCAL_IRQ_LOCK(fpsimd_lock); > > /* > > * Update current's FPSIMD/SVE registers from thread_struct. > > * > > @@ -594,6 +595,7 @@ int sve_set_vector_length(struct task_struct *task, > > * non-SVE thread. > > */ > > if (task == current) { > > + local_lock(fpsimd_lock); > > local_bh_disable(); > > > > task_fpsimd_save(); > > @@ -604,8 +606,10 @@ int sve_set_vector_length(struct task_struct *task, > > if (test_and_clear_tsk_thread_flag(task, TIF_SVE)) > > sve_to_fpsimd(task); > > > > - if (task == current) > > + if (task == current) { > > + local_unlock(fpsimd_lock); > > Is this misordered against local_bh_enable(), or doesn't it matter? It actually is but it does not matter. On RT local_bh_disable() is turned into migrate_disable() what in turn disables the migration of the task to another CPU (while it is still preemtible). This migrate_disable() is also part of local_lock(). Maybe I will add a local_lock_bh() and replace this with this local_bh_disable() so it will better :) > > local_bh_enable(); > > + } Sebastian