Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi Dave,

On 4/11/19 5:34 PM, Dave Martin wrote:

On Thu, Apr 11, 2019 at 04:58:41PM +0100, Julien Grall wrote:

Hi Dave,

On 4/5/19 4:07 PM, Dave Martin wrote:

On Fri, Apr 05, 2019 at 10:02:45AM +0100, Julien Grall wrote:

+#ifdef CONFIG_KERNEL_MODE_NEON
+
 /*
  * may_use_simd - whether it is allowable at this time to issue SIMD
  * instructions or access the SIMD register file

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 5ebe73b69961..b7e5dac26190 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -90,7 +90,8 @@
  * To prevent this from racing with the manipulation of the task's FPSIMD state
  * from task context and thereby corrupting the state, it is necessary to
  * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
- * flag with local_bh_disable() unless softirqs are already masked.
+ * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
+ * run but prevent them from using FPSIMD.
  *
  * For a certain task, the sequence may look something like this:
  * - the task gets scheduled in; if both the task's fpsimd_cpu field
@@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;

 #endif /* ! CONFIG_ARM64_SVE */

+static void kernel_neon_disable(void);
+static void kernel_neon_enable(void);

I find these names a bit confusing: _disable() actually enables FPSIMD/SVE context access for the current context (i.e., makes it safe).

Since these also disable/enable preemption, perhaps we can align them with the existing get_cpu()/put_cpu(), something like:

void get_cpu_fpsimd_context();
void put_cpu_fpsimd_context();

If you consider it's worth adding the checking helper I alluded to above, it could then be called something like:

bool have_cpu_fpsimd_context();

I am not sure where you suggested a checking helper above. Do you refer to exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?

Hmmm, looks like I got my reply out of order here.

I meant the helper (if any) to check !preemptible() && !__this_cpu_read(kernel_neon_busy).

I guess you are using && instead of || because some of the callers may not call get_cpu_fpsimd_context() before but still disable preemption, right?

Wouldn't it be better to have all the users call get_cpu_fpsimd_context() and put_cpu_fpsimd_context()? This has the advantage of making uniform the way FPSIMD is protected and also...

My expectation is that all users would have called get_cpu_fpsimd_context().

This is not the case today (see kvm_arch_vcpu_put_fp); I will look at protecting it with a call to get_cpu_fpsimd_context().

The reason for writing the check that way is that we can't meaningfully inspect percpu variables unless we are non-preemptible already. The && means we don't do the percpu read at all in the case where preemptible() is true.

I am not sure I understand why it would be necessary. this_cpu_read(kernel_neon_busy) should be sufficient here: if it is set, then preemption is disabled. Or are you worried about users directly setting kernel_neon_busy instead of calling get_cpu_fpsimd_context()?

Or do you think my logic is wrong somewhere? (It's possible...)

I think your logic would not return the correct value. We want have_cpu_fpsimd_context() to return true if it is not preemptible and kernel_neon_busy is true. So the implementation of have_cpu_fpsimd_context() would want:

!preemptible() && __this_cpu_read(kernel_neon_busy)

Cheers,

-- 
Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On Thu, Apr 11, 2019 at 04:58:41PM +0100, Julien Grall wrote:
> Hi Dave,
> 
> On 4/5/19 4:07 PM, Dave Martin wrote:
> >On Fri, Apr 05, 2019 at 10:02:45AM +0100, Julien Grall wrote:
> +#ifdef CONFIG_KERNEL_MODE_NEON
> +
> /*
> * may_use_simd - whether it is allowable at this time to issue SIMD
> *instructions or access the SIMD register file
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 5ebe73b69961..b7e5dac26190 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -90,7 +90,8 @@
> * To prevent this from racing with the manipulation of the task's FPSIMD state
> * from task context and thereby corrupting the state, it is necessary to
> * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
> - * flag with local_bh_disable() unless softirqs are already masked.
> + * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
> + * run but prevent them from using FPSIMD.
> *
> * For a certain task, the sequence may look something like this:
> * - the task gets scheduled in; if both the task's fpsimd_cpu field
> @@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
> #endif /* ! CONFIG_ARM64_SVE */
> +static void kernel_neon_disable(void);
> +static void kernel_neon_enable(void);
> >>>
> >>>I find these names a bit confusing: _disable() actually enables FPSIMD/SVE
> >>>context access for the current context (i.e., makes it safe).
> >>>
> >>>Since these also disable/enable preemption, perhaps we can align them
> >>>with the existing get_cpu()/put_cpu(), something like:
> >>>
> >>>void get_cpu_fpsimd_context();
> >>>void put_cpu_fpsimd_context();
> >>>
> >>>If you consider it's worth adding the checking helper I alluded to
> >>>above, it could then be called something like:
> >>>
> >>>bool have_cpu_fpsimd_context();
> >>
> >>I am not sure where you suggested a checking helper above. Do you refer to
> >>exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?
> >
> >Hmmm, looks like I got my reply out of order here.
> >
> >I meant the helper (if any) to check
> >!preemptible() && !__this_cpu_read(kernel_neon_busy).
> 
> I guess you are using && instead of || because some of the callers may not
> call get_cpu_fpsimd_context() before but still disable preemption, right?
> 
> Wouldn't it be better to have all the users call get_cpu_fpsimd_context()
> and put_cpu_fpsimd_context()?
> 
> This has the advantage of making uniform the way FPSIMD is protected and
> also...

My expectation is that all users would have called get_cpu_fpsimd_context().

The reason for writing the check that way is that we can't meaningfully inspect percpu variables unless we are non-preemptible already. The && means we don't do the percpu read at all in the case where preemptible() is true.

Or do you think my logic is wrong somewhere? (It's possible...)

> >Looks like you inferred what I meant later on anyway.
> >
> +
> /*
> * Call __sve_free() directly only if you know task can't be scheduled
> * or preempted.
> @@ -213,11 +217,11 @@ static void sve_free(struct task_struct *task)
> * thread_struct is known to be up to date, when preparing to enter
> * userspace.
> *
> - * Softirqs (and preemption) must be disabled.
> + * Preemption must be disabled.
> >>>
> >>>[*] That's not enough: we need to be in kernel_neon_disable()...
> >>>_enable() (i.e., kernel_neon_busy needs to be true to prevent softirqs
> >>>messing with the FPSIMD state).
> >>
> >>How about not mentioning preemption at all and just say:
> >>
> >>"The fpsimd context should be acquired beforehand."
> >>
> >>This would help if we ever decide to protect the critical section differently.
> >
> >Yes, or even better, name the function used to do this (i.e.,
> >kernel_neon_disable() or get_cpu_fpsimd_context() or whatever it's
> >called).
> 
> ... would make the comments simpler because we would have only one possible
> case to care about.

Agreed

---Dave
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi Dave,

On 4/5/19 4:07 PM, Dave Martin wrote:

On Fri, Apr 05, 2019 at 10:02:45AM +0100, Julien Grall wrote:

+#ifdef CONFIG_KERNEL_MODE_NEON
+
 /*
  * may_use_simd - whether it is allowable at this time to issue SIMD
  * instructions or access the SIMD register file

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 5ebe73b69961..b7e5dac26190 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -90,7 +90,8 @@
  * To prevent this from racing with the manipulation of the task's FPSIMD state
  * from task context and thereby corrupting the state, it is necessary to
  * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
- * flag with local_bh_disable() unless softirqs are already masked.
+ * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
+ * run but prevent them from using FPSIMD.
  *
  * For a certain task, the sequence may look something like this:
  * - the task gets scheduled in; if both the task's fpsimd_cpu field
@@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;

 #endif /* ! CONFIG_ARM64_SVE */

+static void kernel_neon_disable(void);
+static void kernel_neon_enable(void);

I find these names a bit confusing: _disable() actually enables FPSIMD/SVE context access for the current context (i.e., makes it safe).

Since these also disable/enable preemption, perhaps we can align them with the existing get_cpu()/put_cpu(), something like:

void get_cpu_fpsimd_context();
void put_cpu_fpsimd_context();

If you consider it's worth adding the checking helper I alluded to above, it could then be called something like:

bool have_cpu_fpsimd_context();

I am not sure where you suggested a checking helper above. Do you refer to exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?

Hmmm, looks like I got my reply out of order here.

I meant the helper (if any) to check !preemptible() && !__this_cpu_read(kernel_neon_busy).

I guess you are using && instead of || because some of the callers may not call get_cpu_fpsimd_context() before but still disable preemption, right?

Wouldn't it be better to have all the users call get_cpu_fpsimd_context() and put_cpu_fpsimd_context()?

This has the advantage of making uniform the way FPSIMD is protected and also...

Looks like you inferred what I meant later on anyway.

+
 /*
  * Call __sve_free() directly only if you know task can't be scheduled
  * or preempted.
@@ -213,11 +217,11 @@ static void sve_free(struct task_struct *task)
  * thread_struct is known to be up to date, when preparing to enter
  * userspace.
  *
- * Softirqs (and preemption) must be disabled.
+ * Preemption must be disabled.

[*] That's not enough: we need to be in kernel_neon_disable()... _enable() (i.e., kernel_neon_busy needs to be true to prevent softirqs messing with the FPSIMD state).

How about not mentioning preemption at all and just say:

"The fpsimd context should be acquired beforehand."

This would help if we ever decide to protect the critical section differently.

Yes, or even better, name the function used to do this (i.e., kernel_neon_disable() or get_cpu_fpsimd_context() or whatever it's called).

... would make the comments simpler because we would have only one possible case to care about.

Cheers,

-- 
Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi Sebastian,

On 4/5/19 4:42 PM, Sebastian Andrzej Siewior wrote:

On 2019-04-05 16:17:50 [+0100], Julien Grall wrote:

Hi,

Hi,

A per-CPU lock? It has to be a raw_spinlock_t because a normal spin_lock() / local_lock() would allow scheduling and might be taken as part of the context switch or soon after.

raw_spinlock_t would not work here without disabling preemption. Otherwise you may end up recursing on the lock and therefore deadlock. But then it raises the question of the usefulness of the lock here.

However, I don't really understand why allowing the scheduling would be a problem here. Is it a concern because we will waste cycles trying to restore/save a context that will be scratched as soon as we release the lock?

If you hold the lock within the kernel thread and every kernel thread acquires it before doing any SIMD operations then you are good. It could be a sleeping lock. What happens if you hold the lock, are scheduled out and a user task is about to be scheduled? How do you force the kernel thread out / give up the FPU registers?

You would need the thread that was scheduled out to finish running the critical section before the next thread runs. I was under the impression that, with a sleeping lock, the priority of the scheduled-out thread would get bumped so that it finishes the critical section quickly.

I agree it means that you will waste time restoring registers that will get trashed right after you leave the critical section. This is probably not really efficient. Anyway, that was only a suggestion I haven't fully thought through. I have no plan to do more work than this patch in the fpsimd context switch.

That preempt_disable() + local_bh_disable() might not be the prettiest thing but how bad is it actually?

From the measurements I did on non-RT, it is beneficial to keep softirqs enabled (see [1]). I don't have a platform where FPSIMD is used in softirqs (or at least not by default).

So, for testing purposes, I wrote a tasklet (based on hrtimer) using the FPSIMD registers every 1ms if it is allowed (i.e. may_use_simd() returns true). I let it run for a while and noticed that the tasklet would be executed only 0.15% of the time when !may_use_simd().

Furthermore, from what I understood in this thread, there are only a few limited use cases where FPSIMD will be used in softirqs. So it seems better to me to avoid disabling softirqs, at least on non-RT. For RT, aren't all the softirqs handled in a thread? So what would be the benefit of disabling softirqs if we already disable preemption?

In any case, this patch introduces new helpers (get_cpu_fpsimd_context and put_cpu_fpsimd_context) to delimit regions using the HW FPSIMD. So it would be easy to modify the behavior if we wanted to.

Latency wise you can't schedule(). From the RT point of view you need to enable preemption while going from page to page because of the possible kmap() or kmalloc() (on badly aligned src/dst) with the crypto's page-walk code.

Makes sense. However, I don't think we can keep preemption enabled with the current implementation of the FPSIMD context switch. I noticed you disabled crypto for Arm64 because of allocations; I have a TODO to look at what we can do.

Cheers,

[1] https://marc.info/?l=linux-rt-users&m=155499183812211&w=2

Sebastian

-- 
Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi Dave,

On 4/4/19 11:52 AM, Dave Martin wrote:

On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:

I'm not sure how this patch will affect context switch overhead, so it would be good to see hackbench numbers (or similar).

I finally have some numbers for this patch. The benchmark was run using Linux 5.1-rc4 and defconfig.

On Juno2:
* hackbench 100 process 1000 (10 times)
* 0.7% quicker

On ThunderX 2:
* hackbench 1000 process 1000 (20 times)
* 3.4% quicker

I am happy to try other benchmarks if you think it is useful. Anyway, I will resend the series with the comments addressed.

Cheers,

-- 
Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On 2019-04-05 16:17:50 [+0100], Julien Grall wrote:
> Hi,

Hi,

> > A per-CPU lock? It has to be a raw_spinlock_t because a normal
> > spin_lock() / local_lock() would allow scheduling and might be taken as
> > part of the context switch or soon after.
> raw_spinlock_t would not work here without disabling preemption. Otherwise
> you may end up recursing on the lock and therefore deadlock. But then it
> raises the question of the usefulness of the lock here.
> 
> However, I don't really understand why allowing the scheduling would be a
> problem here. Is it a concern because we will waste cycles trying to
> restore/save a context that will be scratched as soon as we release the
> lock?

If you hold the lock within the kernel thread and every kernel thread acquires it before doing any SIMD operations then you are good. It could be a sleeping lock. What happens if you hold the lock, are scheduled out and a user task is about to be scheduled? How do you force the kernel thread out / give up the FPU registers?

That preempt_disable() + local_bh_disable() might not be the prettiest thing but how bad is it actually? Latency wise you can't schedule(). From the RT point of view you need to enable preemption while going from page to page because of the possible kmap() or kmalloc() (on badly aligned src/dst) with the crypto's page-walk code.

If that is not good enough latency wise you could do kernel_fpu_resched() after a few iterations. Currently I'm trying to get kernel_fpu_begin()/end() cheap on x86 so it doesn't always store/restore the FPU context. Then kernel_fpu_resched() shouldn't be that bad.

> Cheers,

Sebastian
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi,

On 05/04/2019 15:39, Sebastian Andrzej Siewior wrote:

On 2019-04-05 10:02:45 [+0100], Julien Grall wrote:

RT folks already saw this corruption because local_bh_disable() does not preempt on RT. They are carrying a patch (see "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()") to disable preemption along with local_bh_disable(). Alternatively, Julia suggested introducing a per-cpu lock to protect the state. I am thinking of deferring this to a follow-up patch. The changes in this patch should make it easier because we now have helpers to mark the critical section.

A per-CPU lock? It has to be a raw_spinlock_t because a normal spin_lock() / local_lock() would allow scheduling and might be taken as part of the context switch or soon after.

raw_spinlock_t would not work here without disabling preemption. Otherwise you may end up recursing on the lock and therefore deadlock. But then it raises the question of the usefulness of the lock here.

However, I don't really understand why allowing the scheduling would be a problem here. Is it a concern because we will waste cycles trying to restore/save a context that will be scratched as soon as we release the lock?

Cheers,

-- 
Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On Fri, Apr 05, 2019 at 10:02:45AM +0100, Julien Grall wrote:
> Hi Dave,
> 
> Thank you for the review.
> 
> On 4/4/19 11:52 AM, Dave Martin wrote:
> >On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:
> >>For RT-linux, it might be possible to use migrate_{enable, disable}. I
> >>am quite new to RT and have some trouble understanding the semantics
> >>of migrate_{enable, disable}. So far, I am still unsure if it is possible
> >>to run another userspace task on the same CPU while getting preempted
> >>when migration is disabled.
> >
> >(Leaving aside the RT aspects for now, but if migrate_{enable,disable}
> >is costly it might not be the best thing to use deep in context switch
> >paths, even if it is technically correct...)
> 
> Based on the discussion in this thread, migrate_enable/migrate_disable is
> not suitable in this context.
> 
> The goal of those helpers is to pin the task to the current CPU. On RT, it
> will not disable preemption. So the current task can be preempted by a
> task with higher priority.
> 
> The next task may require the use of FPSIMD, which could potentially
> result in corrupting the state.
> 
> RT folks already saw this corruption because local_bh_disable() does not
> preempt on RT. They are carrying a patch (see "arm64: fpsimd: use
> preemp_disable in addition to local_bh_disable()") to disable preemption
> along with local_bh_disable().
> 
> Alternatively, Julia suggested introducing a per-cpu lock to protect the
> state. I am thinking of deferring this to a follow-up patch. The changes in
> this patch should make it easier because we now have helpers to mark the
> critical section.

I'll leave it for the RT folks to comment on this. (I see Sebastian already did.)

> >
> >>---
> >> arch/arm64/include/asm/simd.h | 4 +--
> >> arch/arm64/kernel/fpsimd.c | 76 +--
> >> 2 files changed, 46 insertions(+), 34 deletions(-)
> >>
> >>diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
> >>index 6495cc51246f..94c0dac508aa 100644
> >>--- a/arch/arm64/include/asm/simd.h
> >>+++ b/arch/arm64/include/asm/simd.h
> >>@@ -15,10 +15,10 @@
> >> #include
> >> #include
> >>-#ifdef CONFIG_KERNEL_MODE_NEON
> >>-
> >> DECLARE_PER_CPU(bool, kernel_neon_busy);
> >
> >Why make this unconditional? This declaration is only here for
> >may_use_simd() to use. The stub version of may_use_simd() for the
> >!CONFIG_KERNEL_MODE_NEON case doesn't touch it.
> 
> kernel_neon_busy will be used in fpsimd.c even with
> !CONFIG_KERNEL_MODE_NEON. So it makes sense to also declare it in the
> header even if not used.

Ah yes, I missed that.

We don't need all this logic in the !CONFIG_KERNEL_MODE_NEON case of course, but I'm not sure it's worth optimising that special case. Especially so if we don't see any significant impact in ctxsw-heavy benchmarks.

> Another solution is to avoid exposing kernel_neon_busy outside of
> fpsimd.c and use a helper.

Probably not worth it for now.

> >>+#ifdef CONFIG_KERNEL_MODE_NEON
> >>+
> >> /*
> >> * may_use_simd - whether it is allowable at this time to issue SIMD
> >> *instructions or access the SIMD register file
> >>diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> >>index 5ebe73b69961..b7e5dac26190 100644
> >>--- a/arch/arm64/kernel/fpsimd.c
> >>+++ b/arch/arm64/kernel/fpsimd.c
> >>@@ -90,7 +90,8 @@
> >> * To prevent this from racing with the manipulation of the task's FPSIMD state
> >> * from task context and thereby corrupting the state, it is necessary to
> >> * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
> >>- * flag with local_bh_disable() unless softirqs are already masked.
> >>+ * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
> >>+ * run but prevent them from using FPSIMD.
> >> *
> >> * For a certain task, the sequence may look something like this:
> >> * - the task gets scheduled in; if both the task's fpsimd_cpu field
> >>@@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
> >> #endif /* ! CONFIG_ARM64_SVE */
> >>+static void kernel_neon_disable(void);
> >>+static void kernel_neon_enable(void);
> >
> >I find these names a bit confusing: _disable() actually enables FPSIMD/SVE
> >context access for the current context (i.e., makes it safe).
> >
> >Since these also disable/enable preemption, perhaps we can align them
> >with the existing get_cpu()/put_cpu(), something like:
> >
> >void get_cpu_fpsimd_context();
> >void put_cpu_fpsimd_context();
> >
> >If you consider it's worth adding the checking helper I alluded to
> >above, it could then be called something like:
> >
> >bool have_cpu_fpsimd_context();
> 
> I am not sure where you suggested a checking helper above. Do you refer to
> exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?

Hmmm, looks like I got my reply out of order here.

I meant the helper (if any) to check !preemptible() &&
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On 2019-04-05 10:02:45 [+0100], Julien Grall wrote:
> RT folks already saw this corruption because local_bh_disable() does not
> preempt on RT. They are carrying a patch (see "arm64: fpsimd: use
> preemp_disable in addition to local_bh_disable()") to disable preemption
> along with local_bh_disable().
> 
> Alternatively, Julia suggested introducing a per-cpu lock to protect the
> state. I am thinking of deferring this to a follow-up patch. The changes in
> this patch should make it easier because we now have helpers to mark the
> critical section.

A per-CPU lock? It has to be a raw_spinlock_t because a normal spin_lock() / local_lock() would allow scheduling and might be taken as part of the context switch or soon after.

Sebastian
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi Dave,

Thank you for the review.

On 4/4/19 11:52 AM, Dave Martin wrote:

On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:

For RT-linux, it might be possible to use migrate_{enable, disable}. I am quite new to RT and have some trouble understanding the semantics of migrate_{enable, disable}. So far, I am still unsure if it is possible to run another userspace task on the same CPU while getting preempted when migration is disabled.

(Leaving aside the RT aspects for now, but if migrate_{enable,disable} is costly it might not be the best thing to use deep in context switch paths, even if it is technically correct...)

Based on the discussion in this thread, migrate_enable/migrate_disable is not suitable in this context.

The goal of those helpers is to pin the task to the current CPU. On RT, it will not disable preemption. So the current task can be preempted by a task with higher priority.

The next task may require the use of FPSIMD, which could potentially result in corrupting the state.

RT folks already saw this corruption because local_bh_disable() does not preempt on RT. They are carrying a patch (see "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()") to disable preemption along with local_bh_disable().

Alternatively, Julia suggested introducing a per-cpu lock to protect the state. I am thinking of deferring this to a follow-up patch. The changes in this patch should make it easier because we now have helpers to mark the critical section.

---
 arch/arm64/include/asm/simd.h | 4 +--
 arch/arm64/kernel/fpsimd.c | 76 +--
 2 files changed, 46 insertions(+), 34 deletions(-)

diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
index 6495cc51246f..94c0dac508aa 100644
--- a/arch/arm64/include/asm/simd.h
+++ b/arch/arm64/include/asm/simd.h
@@ -15,10 +15,10 @@
 #include
 #include
-#ifdef CONFIG_KERNEL_MODE_NEON
-
 DECLARE_PER_CPU(bool, kernel_neon_busy);

Why make this unconditional? This declaration is only here for may_use_simd() to use. The stub version of may_use_simd() for the !CONFIG_KERNEL_MODE_NEON case doesn't touch it.

kernel_neon_busy will be used in fpsimd.c even with !CONFIG_KERNEL_MODE_NEON. So it makes sense to also declare it in the header even if not used. Another solution is to avoid exposing kernel_neon_busy outside of fpsimd.c and use a helper.

+#ifdef CONFIG_KERNEL_MODE_NEON
+
 /*
  * may_use_simd - whether it is allowable at this time to issue SIMD
  * instructions or access the SIMD register file

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 5ebe73b69961..b7e5dac26190 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -90,7 +90,8 @@
 * To prevent this from racing with the manipulation of the task's FPSIMD state
 * from task context and thereby corrupting the state, it is necessary to
 * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
- * flag with local_bh_disable() unless softirqs are already masked.
+ * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
+ * run but prevent them from using FPSIMD.
 *
 * For a certain task, the sequence may look something like this:
 * - the task gets scheduled in; if both the task's fpsimd_cpu field
@@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
 #endif /* ! CONFIG_ARM64_SVE */
+static void kernel_neon_disable(void);
+static void kernel_neon_enable(void);

I find these names a bit confusing: _disable() actually enables FPSIMD/SVE context access for the current context (i.e., makes it safe).

Since these also disable/enable preemption, perhaps we can align them with the existing get_cpu()/put_cpu(), something like:

void get_cpu_fpsimd_context();
void put_cpu_fpsimd_context();

If you consider it's worth adding the checking helper I alluded to above, it could then be called something like:

bool have_cpu_fpsimd_context();

I am not sure where you suggested a checking helper above. Do you refer to exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?

+
 /*
  * Call __sve_free() directly only if you know task can't be scheduled
  * or preempted.
@@ -213,11 +217,11 @@ static void sve_free(struct task_struct *task)
 * thread_struct is known to be up to date, when preparing to enter
 * userspace.
 *
- * Softirqs (and preemption) must be disabled.
+ * Preemption must be disabled.

[*] That's not enough: we need to be in kernel_neon_disable()... _enable() (i.e., kernel_neon_busy needs to be true to prevent softirqs messing with the FPSIMD state).

How about not mentioning preemption at all and just say:

"The fpsimd context should be acquired beforehand."

This would help if we ever decide to protect the critical section differently.

 */
 static void task_fpsimd_load(void)
 {
- WARN_ON(!in_softirq() && !irqs_disabled());
+
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote: > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of > the kernel may be able to use FPSIMD/SVE. This is for instance the case > for crypto code. > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the > function kernel_neon_{begin, end}. Furthermore, this can only be used > when may_use_simd() returns true. > > The current implementation of may_use_simd() allows softirq to use > FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true). > When in used, softirqs usually fallback to a software method. > > At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled > when touching the FPSIMD/SVE context. This has the drawback to disable > all softirqs even if they are not using FPSIMD/SVE. > > As a softirq should not rely on been able to use simd at a given time, > there are limited reason to keep softirq disabled when touching the > FPSIMD/SVE context. Instead, we can only disable preemption and tell > the NEON unit is currently in use. > > This patch introduces two new helpers kernel_neon_{disable, enable} to > mark the area using FPSIMD/SVE context and use them in replacement of > local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are > also re-implemented to use the new helpers. > > Signed-off-by: Julien Grall > > --- > > I have been exploring this solution as an alternative approach to the RT > patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()". > > So far, the patch has only been lightly tested. > > For RT-linux, it might be possible to use migrate_{enable, disable}. I > am quite new with RT and have some trouble to understand the semantics > of migrate_{enable, disable}. So far, I am still unsure if it is possible > to run another userspace task on the same CPU while getting preempted > when the migration is disabled. 
(Leaving aside the RT aspects for now, but if migrate_{enable,disable} is costly it might not be the best thing to use deep in context switch paths, even if is technically correct...) > --- > arch/arm64/include/asm/simd.h | 4 +-- > arch/arm64/kernel/fpsimd.c| 76 > +-- > 2 files changed, 46 insertions(+), 34 deletions(-) > > diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h > index 6495cc51246f..94c0dac508aa 100644 > --- a/arch/arm64/include/asm/simd.h > +++ b/arch/arm64/include/asm/simd.h > @@ -15,10 +15,10 @@ > #include > #include > > -#ifdef CONFIG_KERNEL_MODE_NEON > - > DECLARE_PER_CPU(bool, kernel_neon_busy); Why make this unconditional? This declaration is only here for may_use_simd() to use. The stub version of may_use_simd() for the !CONFIG_KERNEL_MODE_NEON case doesn't touch it. > > +#ifdef CONFIG_KERNEL_MODE_NEON > + > /* > * may_use_simd - whether it is allowable at this time to issue SIMD > *instructions or access the SIMD register file > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c > index 5ebe73b69961..b7e5dac26190 100644 > --- a/arch/arm64/kernel/fpsimd.c > +++ b/arch/arm64/kernel/fpsimd.c > @@ -90,7 +90,8 @@ > * To prevent this from racing with the manipulation of the task's FPSIMD > state > * from task context and thereby corrupting the state, it is necessary to > * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE > - * flag with local_bh_disable() unless softirqs are already masked. > + * flag with kernel_neon_{disable, enable}. This will still allow softirqs to > + * run but prevent them to use FPSIMD. > * > * For a certain task, the sequence may look something like this: > * - the task gets scheduled in; if both the task's fpsimd_cpu field > @@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state; > > #endif /* ! 
CONFIG_ARM64_SVE */ > > +static void kernel_neon_disable(void); > +static void kernel_neon_enable(void); I find these names a bit confusing: _disable() actually enables FPSIMD/SVE context access for the current context (i.e., makes it safe). Since these also disable/enable preemption, perhaps we can align them with the existing get_cpu()/put_cpu(), something like: void get_cpu_fpsimd_context(); void put_cpu_fpsimd_context(); If you consider it's worth adding the checking helper I alluded to above, it could then be called something like: bool have_cpu_fpsimd_context(); > + > /* > * Call __sve_free() directly only if you know task can't be scheduled > * or preempted. > @@ -213,11 +217,11 @@ static void sve_free(struct task_struct *task) > * thread_struct is known to be up to date, when preparing to enter > * userspace. > * > - * Softirqs (and preemption) must be disabled. > + * Preemption must be disabled. [*] That's not enough: we need to be in kernel_neon_disable()... _enable() (i.e., kernel_neon_busy needs to be true to prevent softirqs messing with the FPSIMD state). > */ > static void task_f
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On 15/03/2019 10:06, Dave Martin wrote: On Thu, Mar 14, 2019 at 06:07:19PM +, Julien Grall wrote: Hi Sebastian, On 3/4/19 5:25 PM, Sebastian Andrzej Siewior wrote: [...] It would increase the softirq latency but the question is how bad would it be. It would continue once the SIMD section is done. On Arm, the kernel may use either FPSIMD or SVE (if supported by the platform). While the FPSIMD context is fairly small (~4K), the SVE context can be up to ~64KB. It's not quite as bad as that. The FPSIMD state is ~0.5K, with the SVE state size being up to ~8.5K (though for today's implementations ~2K may be considered typical). Whoops, I forgot to divide by 8. Thank you for spotting it! Cheers, -- Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On Thu, Mar 14, 2019 at 06:07:19PM +, Julien Grall wrote: > Hi Sebastian, > > On 3/4/19 5:25 PM, Sebastian Andrzej Siewior wrote: [...] > >It would increase the softirq latency but the question is how bad would > >it be. It would continue once the SIMD section is done. > > On Arm, the kernel may use either FPSIMD or SVE (if supported by the > platform). While the FPSIMD context is fairly small (~4K), the SVE context > can be up to ~64KB. It's not quite as bad as that. The FPSIMD state is ~0.5K, with the SVE state size being up to ~8.5K (though for today's implementations ~2K may be considered typical). For comparison, I believe AVX-512 has ~2K of state. Cheers ---Dave
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi Sebastian, On 3/4/19 5:25 PM, Sebastian Andrzej Siewior wrote: On 2019-02-18 15:07:51 [+], Julien Grall wrote: Hi, Hi, Wouldn't this arbitrarily increase softirq latency? Unconditionally forbidding SIMD in softirq might make more sense. It depends on how important the use cases are... It would increase the softirq latency but the question is how bad would it be. It would continue once the SIMD section is done. On Arm, the kernel may use either FPSIMD or SVE (if supported by the platform). While the FPSIMD context is fairly small (~4K), the SVE context can be up to ~64KB. If you allow it but made it impossible to use (and use the software fallback) then it would slow down the processing. So… This is a fair point. However, the use of crypto in softirqs seems to be limited, so I am wondering whether disabling softirqs in all cases is worth it. Would it be possible to consider forbidding/warning about using crypto in softirqs? Looking at the commit message from cb84d11e1625 "arm64: neon: Remove support for nested or hardirq kernel-mode NEON", one of the use cases for crypto in softirq is certain mac80211 drivers. Is there any other use case for using crypto in softirqs? mac80211 does it for some wifi drivers. There used to be IPsec but I *think* this moved to the "parallel processing kthread". I was able to find my way through mac80211 and confirm the use of a tasklet and therefore softirqs. However, I got lost in the ipsec code. During my FPU rework on x86 I didn't find anything that does the processing in softirq (on my machine) so I hacked something so that I could test that I didn't break anything… This is the same on the platform I have been using for testing. Cheers, -- Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On 2019-02-18 15:07:51 [+], Julien Grall wrote: > Hi, Hi, > > Wouldn't this arbitrarily increase softirq latency? Unconditionally > > forbidding SIMD in softirq might make more sense. It depends on how > > important the use cases are... It would increase the softirq latency but the question is how bad would it be. It would continue once the SIMD section is done. If you allow it but made it impossible to use (and use the software fallback) then it would slow down the processing. So… > Looking at the commit message from cb84d11e1625 "arm64: neon: Remove support > for nested or hardirq kernel-mode NEON", one of the use case for crypto in > softirq is certain mac80211 drivers. > > Is there any other use case for use crypto in softirqs? mac80211 does it for some wifi drivers. There used to be IPsec but I *think* this moved to the "parallel processing kthread". During my FPU rework on x86 I didn't find anything that does the processing in softirq (on my machine) so I hacked something so that I could test that I didn't break anything… > Cheers, > Sebastian
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi all, On 08/02/2019 16:55, Julien Grall wrote: @@ -1094,9 +1113,7 @@ void kernel_neon_begin(void) /* Invalidate any task state remaining in the fpsimd regs: */ fpsimd_flush_cpu_state(); - preempt_disable(); - - local_bh_enable(); + kernel_neon_enable(); I found one error in the patch. We should not re-enable NEON from kernel_neon_begin(). I will resend a patch once the thread settles down. Cheers, -- Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hi, On 14/02/2019 10:34, Dave Martin wrote: On Wed, Feb 13, 2019 at 05:52:27PM +0100, Sebastian Andrzej Siewior wrote: On 2019-02-13 16:40:00 [+0100], Ard Biesheuvel wrote: This is equal to what x86 is currently doing. The naming is slightly different, there is irq_fpu_usable(). Yes, I think it's basically the same idea. It's been evolving a bit on both sides, but is quite similar now. may_use_simd() only exists because we have a generic crypto SIMD helper, and so we needed something arch agnostic to wrap around irq_fpu_usable() My question was more whether this is helpful and we want to keep it, or whether it would be better to remove it and always disable BH as part of SIMD operations. Wouldn't this arbitrarily increase softirq latency? Unconditionally forbidding SIMD in softirq might make more sense. It depends on how important the use cases are... Looking at the commit message from cb84d11e1625 "arm64: neon: Remove support for nested or hardirq kernel-mode NEON", one of the use cases for crypto in softirq is certain mac80211 drivers. Is there any other use case for using crypto in softirqs? Cheers, -- Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hello Sebastian, On 13/02/2019 14:30, Sebastian Andrzej Siewior wrote: On 2019-02-08 16:55:13 [+], Julien Grall wrote: When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of the kernel may be able to use FPSIMD/SVE. This is for instance the case for crypto code. Any use of FPSIMD/SVE in the kernel is clearly marked by using the function kernel_neon_{begin, end}. Furthermore, this can only be used when may_use_simd() returns true. This is equal to what x86 is currently doing. The naming is slightly different, there is irq_fpu_usable(). The idea behind the patch was taken from x86. On x86, softirq does not seem to be disabled when context switching the FPU registers. The current implementation of may_use_simd() allows softirq to use FPSIMD/SVE unless it is currently in use (i.e. kernel_neon_busy is true). When in use, softirqs usually fall back to a software method. At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled when touching the FPSIMD/SVE context. This has the drawback of disabling all softirqs even if they are not using FPSIMD/SVE. Is this bad? This means also that your crypto code will not be interrupted by a softirq. Also if you would get rid of it, you could avoid the software fallback in case may_use_simd() says false. There seems to be some misunderstanding about the purpose of this patch. Any use of crypto in the kernel is only protected by preempt_disable(). softirqs are only disabled when context switching the FPU registers between two userspace tasks. From the commit log cb84d11e1625 "arm64: neon: Remove support for nested or hardirq kernel-mode NEON" this was done to protect against the rare softirqs that use crypto. It seems to me a bit overkill to increase softirq latency when softirqs barely use FPSIMD/SVE. Indeed, the SVE context can be quite large, therefore it can take some time to save/restore it. [...] For RT-linux, it might be possible to use migrate_{enable, disable}. 
I am quite new to RT and have some trouble understanding the semantics of migrate_{enable, disable}. So far, I am still unsure if it is possible to run another userspace task on the same CPU while getting preempted when the migration is disabled. In RT: - preempt_disable() is the same as !RT. A thread cannot be suspended. An interrupt may interrupt. However on RT we have threaded interrupts so the interrupt is limited to the first-level handler (not the threaded handler). - migrate_disable() means that the thread cannot be moved to another CPU. It can be suspended. - local_bh_disable() disables the BH: No softirq can run. In RT local_bh_disable() does not inherit preempt_disable(). Two different softirqs can be executed in parallel. The BH is usually invoked at the end of the threaded interrupt (because the threaded interrupt handler raises the softirq). It can also run in the ksoftirqd. Usually you should not get preempted in a migrate_disable() section. A SCHED_OTHER task should not interrupt another SCHED_OTHER task in a migrate_disable() section. A task with a higher priority (a RT/DL task) will. Since threaded interrupts run with a RT priority of 50, they will interrupt your task in a migrate_disable() section. Thank you for the explanation! Cheers, -- Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On 12/02/2019 17:13, Julia Cartwright wrote: Hello Julien- Hello Julien, On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote: When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of the kernel may be able to use FPSIMD/SVE. This is for instance the case for crypto code. Any use of FPSIMD/SVE in the kernel are clearly marked by using the function kernel_neon_{begin, end}. Furthermore, this can only be used when may_use_simd() returns true. The current implementation of may_use_simd() allows softirq to use FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true). When in used, softirqs usually fallback to a software method. At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled when touching the FPSIMD/SVE context. This has the drawback to disable all softirqs even if they are not using FPSIMD/SVE. As a softirq should not rely on been able to use simd at a given time, there are limited reason to keep softirq disabled when touching the FPSIMD/SVE context. Instead, we can only disable preemption and tell the NEON unit is currently in use. This patch introduces two new helpers kernel_neon_{disable, enable} to mark the area using FPSIMD/SVE context and use them in replacement of local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are also re-implemented to use the new helpers. Signed-off-by: Julien Grall --- I have been exploring this solution as an alternative approach to the RT patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()". So far, the patch has only been lightly tested. For RT-linux, it might be possible to use migrate_{enable, disable}. I am quite new with RT and have some trouble to understand the semantics of migrate_{enable, disable}. So far, I am still unsure if it is possible to run another userspace task on the same CPU while getting preempted when the migration is disabled. 
Yes, a thread in a migrate_disable()-protected critical section can be preempted, and another thread on the same CPU may enter the critical section. If it's necessary to prevent this from happening, then you need to also make use of a per-CPU mutex. On RT, this would do the right thing w.r.t. priority inheritance. Thank you for the explanation, I now understand better the concept of migrate_disable. Best regards, -- Julien Grall
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On Wed, Feb 13, 2019 at 05:52:27PM +0100, Sebastian Andrzej Siewior wrote: > On 2019-02-13 16:40:00 [+0100], Ard Biesheuvel wrote: > > > > This is equal what x86 is currently doing. The naming is slightly > > > > different, there is irq_fpu_usable(). > > > > > > Yes, I think it's basically the same idea. > > > > > > It's been evolving a bit on both sides, but is quite similar now. > > > > > > > may_use_simd() only exists because we have a generic crypto SIMD > > helper, and so we needed something arch agnostic to wrap around > > irq_fpu_usable() > > My question was more if this is helpful and we want to keep or if > it would be better to remove it and always disable BH as part of SIMD > operations. Wouldn't this arbitrarily increase softirq latency? Unconditionally forbidding SIMD in softirq might make more sense. It depends on how important the use cases are... Cheers ---Dave
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On 2019-02-13 16:40:00 [+0100], Ard Biesheuvel wrote: > > > This is equal what x86 is currently doing. The naming is slightly > > > different, there is irq_fpu_usable(). > > > > Yes, I think it's basically the same idea. > > > > It's been evolving a bit on both sides, but is quite similar now. > > > > may_use_simd() only exists because we have a generic crypto SIMD > helper, and so we needed something arch agnostic to wrap around > irq_fpu_usable() My question was more if this is helpful and we want to keep or if it would be better to remove it and always disable BH as part of SIMD operations. Sebastian
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On 2019-02-13 15:36:30 [+], Dave Martin wrote: > On Wed, Feb 13, 2019 at 03:30:29PM +0100, Sebastian Andrzej Siewior wrote: > > On 2019-02-08 16:55:13 [+], Julien Grall wrote: > > > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of > > > the kernel may be able to use FPSIMD/SVE. This is for instance the case > > > for crypto code. > > > > > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the > > > function kernel_neon_{begin, end}. Furthermore, this can only be used > > > when may_use_simd() returns true. > > > > This is equal what x86 is currently doing. The naming is slightly > > different, there is irq_fpu_usable(). > > Yes, I think it's basically the same idea. > > It's been evolving a bit on both sides, but is quite similar now. I thought that this is complicated and wanted to submit a patch to remove irq_fpu_usable() and disable BH as part of kernel_fpu_begin() (but I currently have another FPU series ongoing which I want to finish first). … > "Usually" is probably not good enough if another task can run: if the > preempting task enters userspace then the vector registers are needed > for its use, which is tricky to arrange if the registers are currently > in use by context switch logic running in the first task. Yes. > My current feeling is that we probably have to stick with > preempt_disable() here, but hopefully we can get rid of > local_bh_disable() (as proposed) with no ill effects... > > Does that sound sensible? If you want to stick with may_use_simd() then yes. > Cheers > ---Dave Sebastian
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On Wed, Feb 13, 2019 at 04:40:00PM +0100, Ard Biesheuvel wrote: > On Wed, 13 Feb 2019 at 16:36, Dave Martin wrote: > > > > On Wed, Feb 13, 2019 at 03:30:29PM +0100, Sebastian Andrzej Siewior wrote: > > > On 2019-02-08 16:55:13 [+], Julien Grall wrote: > > > > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of > > > > the kernel may be able to use FPSIMD/SVE. This is for instance the case > > > > for crypto code. > > > > > > > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the > > > > function kernel_neon_{begin, end}. Furthermore, this can only be used > > > > when may_use_simd() returns true. > > > > > > This is equal what x86 is currently doing. The naming is slightly > > > different, there is irq_fpu_usable(). > > > > Yes, I think it's basically the same idea. > > > > It's been evolving a bit on both sides, but is quite similar now. > > > > may_use_simd() only exists because we have a generic crypto SIMD > helper, and so we needed something arch agnostic to wrap around > irq_fpu_usable() [...] Sounds plausible. Cheers ---Dave
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On Wed, 13 Feb 2019 at 16:36, Dave Martin wrote: > > On Wed, Feb 13, 2019 at 03:30:29PM +0100, Sebastian Andrzej Siewior wrote: > > On 2019-02-08 16:55:13 [+], Julien Grall wrote: > > > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of > > > the kernel may be able to use FPSIMD/SVE. This is for instance the case > > > for crypto code. > > > > > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the > > > function kernel_neon_{begin, end}. Furthermore, this can only be used > > > when may_use_simd() returns true. > > > > This is equal what x86 is currently doing. The naming is slightly > > different, there is irq_fpu_usable(). > > Yes, I think it's basically the same idea. > > It's been evolving a bit on both sides, but is quite similar now. > may_use_simd() only exists because we have a generic crypto SIMD helper, and so we needed something arch agnostic to wrap around irq_fpu_usable() > > > The current implementation of may_use_simd() allows softirq to use > > > FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true). > > > When in used, softirqs usually fallback to a software method. > > > > > > At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled > > > when touching the FPSIMD/SVE context. This has the drawback to disable > > > all softirqs even if they are not using FPSIMD/SVE. > > > > Is this bad? This means also that your crypto code will not be > > interrupted by a softirq. Also if you would get rid of it, you could > > avoid the software fallback in case may_use_simd() says false. > > Masking softirqs during kernel_neon_begin()..._end() is unlikely to be a > huge problem, but currently we block softirq during all context switch > operations that act on the CPU vector registers. > > The reasons for this are somewhat historical, and IIRC predated the > requirement for softirq users of kernel-mode NEON to include the > may_use_simd() check and implement a fallback path on arm64. 
> > Now that softirq code is required to work around kernel-mode NEON being > temporarily unusable, masking softirqs completely during context switch > etc. should no longer be necessary. > > > > As a softirq should not rely on been able to use simd at a given time, > > > there are limited reason to keep softirq disabled when touching the > > > FPSIMD/SVE context. Instead, we can only disable preemption and tell > > > the NEON unit is currently in use. > > > > > > This patch introduces two new helpers kernel_neon_{disable, enable} to > > > mark the area using FPSIMD/SVE context and use them in replacement of > > > local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are > > > also re-implemented to use the new helpers. > > > > > > Signed-off-by: Julien Grall > > > > > > --- > > > > > > I have been exploring this solution as an alternative approach to the RT > > > patch "arm64: fpsimd: use preemp_disable in addition to > > > local_bh_disable()". > > > > > > So far, the patch has only been lightly tested. > > > > > > For RT-linux, it might be possible to use migrate_{enable, disable}. I > > > am quite new with RT and have some trouble to understand the semantics > > > of migrate_{enable, disable}. So far, I am still unsure if it is possible > > > to run another userspace task on the same CPU while getting preempted > > > when the migration is disabled. > > > > In RT: > > - preemt_disable() is the same as !RT. A thread can not be suspend. An > > interrupt may interrupt. However on RT we have threaded interrupts so > > the interrupt is limited to the first-level handler (not the threaded > > handler). > > > > - migrate_disable() means that the thread can not be moved to another > > CPU. It can be suspended. > > > > - local_bh_disable() disables the BH: No softirq can run. In RT > > local_bh_disable() does not inherit preempt_disable(). Two different > > softirqs can be executed in parallel. 
> > The BH is usually invoked at the end of the threaded interrupt > > (because the threaded interrupt handler raises the softirq). It can > > also run in the ksoftirqd. > > > > Usually you should not get preempted in a migrate_disable() section. A > > SCHED_OTHER task should not interrupt another SCHED_OTHER task in a > > migrate_disable() section. A task with a higher priority (a RT/DL task) > > will. Since threaded interrupts run with a RT priority of 50, they will > > interrupt your task in a migrate_disable() section. > > "Usually" is probably not good enough if another task can run: if the > preempting task enters userspace then the vector registers are needed > for its use, which is tricky to arrange if the registers are currently > in use by context switch logic running in the first task. > > My current feeling is that we probably have to stick with > preempt_disable() here, but hopefully we can get rid of > local_bh_disable() (as proposed) with no ill effects... > > Does that sound sensible? > > Cheers > ---Dave
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On Wed, Feb 13, 2019 at 03:30:29PM +0100, Sebastian Andrzej Siewior wrote: > On 2019-02-08 16:55:13 [+], Julien Grall wrote: > > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of > > the kernel may be able to use FPSIMD/SVE. This is for instance the case > > for crypto code. > > > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the > > function kernel_neon_{begin, end}. Furthermore, this can only be used > > when may_use_simd() returns true. > > This is equal what x86 is currently doing. The naming is slightly > different, there is irq_fpu_usable(). Yes, I think it's basically the same idea. It's been evolving a bit on both sides, but is quite similar now. > > The current implementation of may_use_simd() allows softirq to use > > FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true). > > When in used, softirqs usually fallback to a software method. > > > > At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled > > when touching the FPSIMD/SVE context. This has the drawback to disable > > all softirqs even if they are not using FPSIMD/SVE. > > Is this bad? This means also that your crypto code will not be > interrupted by a softirq. Also if you would get rid of it, you could > avoid the software fallback in case may_use_simd() says false. Masking softirqs during kernel_neon_begin()..._end() is unlikely to be a huge problem, but currently we block softirq during all context switch operations that act on the CPU vector registers. The reasons for this are somewhat historical, and IIRC predated the requirement for softirq users of kernel-mode NEON to include the may_use_simd() check and implement a fallback path on arm64. Now that softirq code is required to work around kernel-mode NEON being temporarily unusable, masking softirqs completely during context switch etc. should no longer be necessary. 
> > As a softirq should not rely on been able to use simd at a given time, > > there are limited reason to keep softirq disabled when touching the > > FPSIMD/SVE context. Instead, we can only disable preemption and tell > > the NEON unit is currently in use. > > > > This patch introduces two new helpers kernel_neon_{disable, enable} to > > mark the area using FPSIMD/SVE context and use them in replacement of > > local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are > > also re-implemented to use the new helpers. > > > > Signed-off-by: Julien Grall > > > > --- > > > > I have been exploring this solution as an alternative approach to the RT > > patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()". > > > > So far, the patch has only been lightly tested. > > > > For RT-linux, it might be possible to use migrate_{enable, disable}. I > > am quite new with RT and have some trouble to understand the semantics > > of migrate_{enable, disable}. So far, I am still unsure if it is possible > > to run another userspace task on the same CPU while getting preempted > > when the migration is disabled. > > In RT: > - preemt_disable() is the same as !RT. A thread can not be suspend. An > interrupt may interrupt. However on RT we have threaded interrupts so > the interrupt is limited to the first-level handler (not the threaded > handler). > > - migrate_disable() means that the thread can not be moved to another > CPU. It can be suspended. > > - local_bh_disable() disables the BH: No softirq can run. In RT > local_bh_disable() does not inherit preempt_disable(). Two different > softirqs can be executed in parallel. > The BH is usually invoked at the end of the threaded interrupt > (because the threaded interrupt handler raises the softirq). It can > also run in the ksoftirqd. > > Usually you should not get preempted in a migrate_disable() section. 
A > SCHED_OTHER task should not interrupt another SCHED_OTHER task in a > migrate_disable() section. A task with a higher priority (a RT/DL task) > will. Since threaded interrupts run with a RT priority of 50, they will > interrupt your task in a migrate_disable() section. "Usually" is probably not good enough if another task can run: if the preempting task enters userspace then the vector registers are needed for its use, which is tricky to arrange if the registers are currently in use by context switch logic running in the first task. My current feeling is that we probably have to stick with preempt_disable() here, but hopefully we can get rid of local_bh_disable() (as proposed) with no ill effects... Does that sound sensible? Cheers ---Dave
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
On 2019-02-08 16:55:13 [+], Julien Grall wrote: > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of > the kernel may be able to use FPSIMD/SVE. This is for instance the case > for crypto code. > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the > function kernel_neon_{begin, end}. Furthermore, this can only be used > when may_use_simd() returns true. This is equal what x86 is currently doing. The naming is slightly different, there is irq_fpu_usable(). > The current implementation of may_use_simd() allows softirq to use > FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true). > When in used, softirqs usually fallback to a software method. > > At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled > when touching the FPSIMD/SVE context. This has the drawback to disable > all softirqs even if they are not using FPSIMD/SVE. Is this bad? This means also that your crypto code will not be interrupted by a softirq. Also if you would get rid of it, you could avoid the software fallback in case may_use_simd() says false. > As a softirq should not rely on been able to use simd at a given time, > there are limited reason to keep softirq disabled when touching the > FPSIMD/SVE context. Instead, we can only disable preemption and tell > the NEON unit is currently in use. > > This patch introduces two new helpers kernel_neon_{disable, enable} to > mark the area using FPSIMD/SVE context and use them in replacement of > local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are > also re-implemented to use the new helpers. > > Signed-off-by: Julien Grall > > --- > > I have been exploring this solution as an alternative approach to the RT > patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()". > > So far, the patch has only been lightly tested. > > For RT-linux, it might be possible to use migrate_{enable, disable}. 
I > am quite new with RT and have some trouble to understand the semantics > of migrate_{enable, disable}. So far, I am still unsure if it is possible > to run another userspace task on the same CPU while getting preempted > when the migration is disabled. In RT: - preempt_disable() is the same as !RT. A thread cannot be suspended. An interrupt may interrupt. However on RT we have threaded interrupts so the interrupt is limited to the first-level handler (not the threaded handler). - migrate_disable() means that the thread cannot be moved to another CPU. It can be suspended. - local_bh_disable() disables the BH: No softirq can run. In RT local_bh_disable() does not inherit preempt_disable(). Two different softirqs can be executed in parallel. The BH is usually invoked at the end of the threaded interrupt (because the threaded interrupt handler raises the softirq). It can also run in the ksoftirqd. Usually you should not get preempted in a migrate_disable() section. A SCHED_OTHER task should not interrupt another SCHED_OTHER task in a migrate_disable() section. A task with a higher priority (a RT/DL task) will. Since threaded interrupts run with a RT priority of 50, they will interrupt your task in a migrate_disable() section. Sebastian
Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
Hello Julien- On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote: > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of > the kernel may be able to use FPSIMD/SVE. This is for instance the case > for crypto code. > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the > function kernel_neon_{begin, end}. Furthermore, this can only be used > when may_use_simd() returns true. > > The current implementation of may_use_simd() allows softirq to use > FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true). > When in used, softirqs usually fallback to a software method. > > At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled > when touching the FPSIMD/SVE context. This has the drawback to disable > all softirqs even if they are not using FPSIMD/SVE. > > As a softirq should not rely on been able to use simd at a given time, > there are limited reason to keep softirq disabled when touching the > FPSIMD/SVE context. Instead, we can only disable preemption and tell > the NEON unit is currently in use. > > This patch introduces two new helpers kernel_neon_{disable, enable} to > mark the area using FPSIMD/SVE context and use them in replacement of > local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are > also re-implemented to use the new helpers. > > Signed-off-by: Julien Grall > > --- > > I have been exploring this solution as an alternative approach to the RT > patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()". > > So far, the patch has only been lightly tested. > > For RT-linux, it might be possible to use migrate_{enable, disable}. I > am quite new with RT and have some trouble to understand the semantics > of migrate_{enable, disable}. So far, I am still unsure if it is possible > to run another userspace task on the same CPU while getting preempted > when the migration is disabled. 
Yes, a thread in a migrate_disable()-protected critical section can be preempted, and another thread on the same CPU may enter the critical section. If it's necessary to prevent this from happening, then you need to also make use of a per-CPU mutex. On RT, this would do the right thing w.r.t. priority inheritance. Julia