See pseudo-patch below.  That cures the reported gcc gripeage.

On Sun, 2018-07-15 at 09:22 +0200, Mike Galbraith wrote:
> On Sat, 2018-07-14 at 00:03 +0200, Mike Galbraith wrote:
> > On Fri, 2018-07-13 at 19:49 +0200, Sebastian Andrzej Siewior wrote:
> > > In v4.16-RT I noticed a number of warnings from task_fpsimd_load(). The
> > > code disables BH and expects that it is not preemptible. On -RT the
> > > task remains preemptible but remains the same CPU. This may corrupt the
> > > content of the SIMD registers if the task is preempted during
> > > saving/restoring those registers.
> > > Add a locallock around this process. This avoids that the any function
> > > within the locallock block is invoked more than once on the same CPU.
> > > 
> > > The kernel_neon_begin() can't be kept preemptible. If the task-switch 
> > > notices
> > > TIF_FOREIGN_FPSTATE then it would restore task's SIMD state and we lose 
> > > the
> > > state of registers used for in-kernel-work. We would require additional 
> > > storage
> > > for the in-kernel copy of the registers. But then the NEON-crypto checks 
> > > for
> > > the need-resched flag so it shouldn't that bad.
> > > The preempt_disable() avoids the context switch while the kernel uses the 
> > > SIMD
> > > registers. Unfortunately we have to balance out the migrate_disable() 
> > > counter
> > > because local_lock_bh() is invoked in different context compared to its 
> > > unlock
> > > counterpart.
> > > 
> > > __efi_fpsimd_begin() should not use kernel_fpu_begin() due to its
> > > preempt_disable() context and instead save the registers always in its
> > > extra spot on RT.
> > > 
> > > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> > > ---
> > > 
> > > This seems to make work (crypto chacha20-neon + cyclictest). I have no
> > > EFI so I have no clue if saving SIMD while calling to EFI works.
> > 
> > All is not well on cavium test box.  I'm seeing random errors ala...
> > 
> > ./include/linux/fs.h:3137:11: internal compiler error: Segmentation fault
> > ./include/linux/bio.h:175:1: internal compiler error: in grokdeclarator, at 
> > c/c-decl.c:7023
> > 
> > ...during make -j96 (2*cpus) kbuild.  Turns out 4.14-rt has this issue
> > as well, which is unsurprising if it's related to fpsimd woes.  Box
> > does not exhibit the issue with NONRT kernels, PREEMPT or NOPREEMPT.
> 
> Verified to be SIMD woes.  I backported your V2 to 4.14-rt, and the
> CPUS*2 kbuild still reliably reproduced the corruption issue.  I then
> did the below to both 4.14-rt and 4.16-rt, and the corruption is gone.
> 
> (this looks a bit like a patch, but is actually a functional yellow
> sticky should I need to come back for another poke at it later)
> 
> arm64: fpsimd: disable preemption for RT where that is assumed
> 
> 1. Per Sebastian's analysis, kernel_neon_begin() can't be made preemptible:
> If the task-switch notices TIF_FOREIGN_FPSTATE then it would restore task's
> SIMD state and we lose the state of registers used for in-kernel-work.  We
> would require additional storage for the in-kernel copy of the registers.
> But then the NEON-crypto checks for the need-resched flag so it shouldn't
> that bad.
> 
> 2. arch_efi_call_virt_setup/teardown() encapsulate __efi_fpsimd_begin/end()
> in preempt disabled sections via efi_virtmap_load/unload().  That could be
> fixed, but... 
> 
> 3. A local lock solution which left preempt disabled sections 1 & 2 intact
> failed, CPUS*2 parallel kbuild reliably reproduced memory corruption.
> 
> Given the two non-preemptible sections which could encapsulate something
> painful remained intact with the local lock solution, and the fact that
> the remaining BH disabled sections are all small, with empirical evidence
> at hand that at LEAST one truely does require preemption be disabled,
> the best solution for both RT and !RT is to simply disable preemption for
> RT where !RT assumes preemption has been disabled.  That adds no cycles
> to the !RT case, fewer cycles to the RT case, and requires no (ugly) work
> around for the consequences of local_unlock() under preempt_disable().
> 
> Signed-off-by: Mike Galbraith <[email protected]>
> ---
>  arch/arm64/kernel/fpsimd.c |   18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -594,6 +594,7 @@ int sve_set_vector_length(struct task_st
>        * non-SVE thread.
>        */
>       if (task == current) {
> +             preempt_disable_rt();
>               local_bh_disable();
>  
>               task_fpsimd_save();
> @@ -604,8 +605,10 @@ int sve_set_vector_length(struct task_st
>       if (test_and_clear_tsk_thread_flag(task, TIF_SVE))
>               sve_to_fpsimd(task);
>  
> -     if (task == current)
> +     if (task == current) {
>               local_bh_enable();
> +             preempt_enable_rt();
> +     }
>  
>       /*
>        * Force reallocation of task SVE state to the correct size
> @@ -837,6 +840,7 @@ asmlinkage void do_sve_acc(unsigned int
>  
>       sve_alloc(current);
>  
> +     preempt_disable_rt();
>       local_bh_disable();
>  
>       task_fpsimd_save();
> @@ -850,6 +854,7 @@ asmlinkage void do_sve_acc(unsigned int
>               WARN_ON(1); /* SVE access shouldn't have trapped */
>  
>       local_bh_enable();
> +     preempt_enable_rt();
>  }
>  
>  /*
> @@ -925,6 +930,7 @@ void fpsimd_flush_thread(void)
>       if (!system_supports_fpsimd())
>               return;
>  
> +     preempt_disable_rt();
>       local_bh_disable();
>  
>       memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
> @@ -968,6 +974,7 @@ void fpsimd_flush_thread(void)
>       set_thread_flag(TIF_FOREIGN_FPSTATE);
>  
>       local_bh_enable();
> +     preempt_enable_rt();
>  }
>  
>  /*
> @@ -979,9 +986,11 @@ void fpsimd_preserve_current_state(void)
>       if (!system_supports_fpsimd())
>               return;
>  
> +     preempt_disable_rt();
>       local_bh_disable();
>       task_fpsimd_save();
>       local_bh_enable();
> +     preempt_enable_rt();
>  }
>  
>  /*
> @@ -1021,6 +1030,7 @@ void fpsimd_restore_current_state(void)
>       if (!system_supports_fpsimd())
>               return;
>  
> +     preempt_disable_rt();
>       local_bh_disable();
>  
>       if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
> @@ -1029,6 +1039,7 @@ void fpsimd_restore_current_state(void)
>       }
>  
>       local_bh_enable();
> +     preempt_enable_rt();
>  }
>  
>  /*
> @@ -1041,6 +1052,7 @@ void fpsimd_update_current_state(struct
>       if (!system_supports_fpsimd())
>               return;
>  
> +     preempt_disable_rt();
>       local_bh_disable();
>  
>       current->thread.fpsimd_state.user_fpsimd = *state;
> @@ -1053,6 +1065,7 @@ void fpsimd_update_current_state(struct
>               fpsimd_bind_to_cpu();
>  
>       local_bh_enable();
> +     preempt_enable_rt();
>  }
>  
>  /*
> @@ -1115,6 +1128,7 @@ void kernel_neon_begin(void)
>  
>       BUG_ON(!may_use_simd());
>  
> +     preempt_disable();
>       local_bh_disable();
>  
>       __this_cpu_write(kernel_neon_busy, true);
> @@ -1128,8 +1142,6 @@ void kernel_neon_begin(void)
>       /* Invalidate any task state remaining in the fpsimd regs: */
>       fpsimd_flush_cpu_state();
>  
> -     preempt_disable();
> -
>       local_bh_enable();
>  }
>  EXPORT_SYMBOL(kernel_neon_begin);

Reply via email to