Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-11 Thread Julien Grall

Hi Dave,

On 4/11/19 5:34 PM, Dave Martin wrote:

On Thu, Apr 11, 2019 at 04:58:41PM +0100, Julien Grall wrote:

Hi Dave,

On 4/5/19 4:07 PM, Dave Martin wrote:

On Fri, Apr 05, 2019 at 10:02:45AM +0100, Julien Grall wrote:

+#ifdef CONFIG_KERNEL_MODE_NEON
+
  /*
   * may_use_simd - whether it is allowable at this time to issue SIMD
   *instructions or access the SIMD register file
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 5ebe73b69961..b7e5dac26190 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -90,7 +90,8 @@
   * To prevent this from racing with the manipulation of the task's FPSIMD 
state
   * from task context and thereby corrupting the state, it is necessary to
   * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
- * flag with local_bh_disable() unless softirqs are already masked.
+ * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
+ * run but prevent them to use FPSIMD.
   *
   * For a certain task, the sequence may look something like this:
   * - the task gets scheduled in; if both the task's fpsimd_cpu field
@@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
  #endif /* ! CONFIG_ARM64_SVE */
+static void kernel_neon_disable(void);
+static void kernel_neon_enable(void);


I find these names a bit confusing: _disable() actualy enables FPSIMD/SVE
context access for the current context (i.e., makes it safe).

Since these also disable/enable preemption, perhaps we can align them
with the existing get_cpu()/put_cpu(), something like:

void get_cpu_fpsimd_context();
vold put_cpu_fpsimd_context();

If you consider it's worth adding the checking helper I alluded to
above, it could then be called something like:

bool have_cpu_fpsimd_context();


I am not sure where you suggested a checking helper above. Do you refer to
exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?


Hmmm, looks like I got my reply out of order here.

I meant the helper (if any) to check
!preemptible() && !__this_cpu_read(kernel_neon_busy).


I guess you are using && instead of || because some of the caller may not
call get_cpu_fpsimd_context() before but still disable preemption, right?

Wouldn't it be better to have all the user calling get_cpu_fpsimd_context()
and put_cpu_fpsimd_context()?

This has the advantage to uniformize how the way FPSIMD is protected and
also...


My expectation is that all users would have called
get_cpu_fpsimd_context().


This is not the case today (see kvm_arch_vcpu_put_fp), I will look at 
protecting it with a call to get_cpu_fpsimd_context().




The reason for writing the check that way is that we can't meaningfully
inspect percpu variables unless we are non-preemptible already.  The &&
means we don't do the percpu read at all is the case where preemptible()
is true.


I am not sure to understand why it would be necessary. 
this_cpu_read(kernel_neon_busy) should be sufficient here.


If it is set then, preemption is disabled. Or are you worried about user 
directly setting kernel_neon_busy instead of calling get_cpu_fpsimd_context?




Or do you think my logic is wrong somewhere?  (It's possible...)


I think your logic would not return the correct value. We want 
have_cpu_fpsimd_context() to return true if it is not preemptible and 
kernel_neon_busy is true. So we would want:


!preemptible() && __this_cpu_read(kernel_neon_busy)

If we speak about the implementation of have_cpu_fpsimd_context(), then 
we want:


!preemptible() && __this_cpu_read(kernel_neon_busy)

Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-11 Thread Dave Martin
On Thu, Apr 11, 2019 at 04:58:41PM +0100, Julien Grall wrote:
> Hi Dave,
> 
> On 4/5/19 4:07 PM, Dave Martin wrote:
> >On Fri, Apr 05, 2019 at 10:02:45AM +0100, Julien Grall wrote:
> +#ifdef CONFIG_KERNEL_MODE_NEON
> +
>   /*
>    * may_use_simd - whether it is allowable at this time to issue SIMD
>    *instructions or access the SIMD register file
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 5ebe73b69961..b7e5dac26190 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -90,7 +90,8 @@
>    * To prevent this from racing with the manipulation of the task's 
>  FPSIMD state
>    * from task context and thereby corrupting the state, it is necessary 
>  to
>    * protect any manipulation of a task's fpsimd_state or 
>  TIF_FOREIGN_FPSTATE
> - * flag with local_bh_disable() unless softirqs are already masked.
> + * flag with kernel_neon_{disable, enable}. This will still allow 
> softirqs to
> + * run but prevent them to use FPSIMD.
>    *
>    * For a certain task, the sequence may look something like this:
>    * - the task gets scheduled in; if both the task's fpsimd_cpu field
> @@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
>   #endif /* ! CONFIG_ARM64_SVE */
> +static void kernel_neon_disable(void);
> +static void kernel_neon_enable(void);
> >>>
> >>>I find these names a bit confusing: _disable() actualy enables FPSIMD/SVE
> >>>context access for the current context (i.e., makes it safe).
> >>>
> >>>Since these also disable/enable preemption, perhaps we can align them
> >>>with the existing get_cpu()/put_cpu(), something like:
> >>>
> >>>void get_cpu_fpsimd_context();
> >>>vold put_cpu_fpsimd_context();
> >>>
> >>>If you consider it's worth adding the checking helper I alluded to
> >>>above, it could then be called something like:
> >>>
> >>>bool have_cpu_fpsimd_context();
> >>
> >>I am not sure where you suggested a checking helper above. Do you refer to
> >>exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?
> >
> >Hmmm, looks like I got my reply out of order here.
> >
> >I meant the helper (if any) to check
> >!preemptible() && !__this_cpu_read(kernel_neon_busy).
> 
> I guess you are using && instead of || because some of the caller may not
> call get_cpu_fpsimd_context() before but still disable preemption, right?
> 
> Wouldn't it be better to have all the user calling get_cpu_fpsimd_context()
> and put_cpu_fpsimd_context()?
> 
> This has the advantage to uniformize how the way FPSIMD is protected and
> also...

My expectation is that all users would have called
get_cpu_fpsimd_context().

The reason for writing the check that way is that we can't meaningfully
inspect percpu variables unless we are non-preemptible already.  The &&
means we don't do the percpu read at all is the case where preemptible()
is true.

Or do you think my logic is wrong somewhere?  (It's possible...)

> >Looks like you inferred what I meant later on anyway.
> >
> >>
> >>>
> +
>   /*
>    * Call __sve_free() directly only if you know task can't be scheduled
>    * or preempted.
> @@ -213,11 +217,11 @@ static void sve_free(struct task_struct *task)
>    * thread_struct is known to be up to date, when preparing to enter
>    * userspace.
>    *
> - * Softirqs (and preemption) must be disabled.
> + * Preemption must be disabled.
> >>>
> >>>[*] That's not enough: we need to be in kernel_neon_disable()...
> >>>_enable() (i.e., kernel_neon_busy needs to be true to prevent softirqs
> >>>messing with the FPSIMD state).
> >>
> >>How about not mentioning preemption at all and just say:
> >>
> >>"The fpsimd context should be acquired before hand".
> >>
> >>This would help if we ever decide to protect critical section differently.
> >
> >Yes, or even better, name the function used to do this (i.e.,
> >kernel_neon_disable() or get_cpu_fpsimd_context() or whatever it's
> >called).
> 
> ... would make the comments simpler because we would have only one possible
> case to care.

Agreed

---Dave


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-11 Thread Julien Grall

Hi Dave,

On 4/5/19 4:07 PM, Dave Martin wrote:

On Fri, Apr 05, 2019 at 10:02:45AM +0100, Julien Grall wrote:

+#ifdef CONFIG_KERNEL_MODE_NEON
+
  /*
   * may_use_simd - whether it is allowable at this time to issue SIMD
   *instructions or access the SIMD register file
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 5ebe73b69961..b7e5dac26190 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -90,7 +90,8 @@
   * To prevent this from racing with the manipulation of the task's FPSIMD 
state
   * from task context and thereby corrupting the state, it is necessary to
   * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
- * flag with local_bh_disable() unless softirqs are already masked.
+ * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
+ * run but prevent them to use FPSIMD.
   *
   * For a certain task, the sequence may look something like this:
   * - the task gets scheduled in; if both the task's fpsimd_cpu field
@@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
  #endif /* ! CONFIG_ARM64_SVE */
+static void kernel_neon_disable(void);
+static void kernel_neon_enable(void);


I find these names a bit confusing: _disable() actualy enables FPSIMD/SVE
context access for the current context (i.e., makes it safe).

Since these also disable/enable preemption, perhaps we can align them
with the existing get_cpu()/put_cpu(), something like:

void get_cpu_fpsimd_context();
vold put_cpu_fpsimd_context();

If you consider it's worth adding the checking helper I alluded to
above, it could then be called something like:

bool have_cpu_fpsimd_context();


I am not sure where you suggested a checking helper above. Do you refer to
exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?


Hmmm, looks like I got my reply out of order here.

I meant the helper (if any) to check
!preemptible() && !__this_cpu_read(kernel_neon_busy).


I guess you are using && instead of || because some of the caller may 
not call get_cpu_fpsimd_context() before but still disable preemption, 
right?


Wouldn't it be better to have all the user calling 
get_cpu_fpsimd_context() and put_cpu_fpsimd_context()?


This has the advantage to uniformize how the way FPSIMD is protected and 
also...




Looks like you inferred what I meant later on anyway.






+
  /*
   * Call __sve_free() directly only if you know task can't be scheduled
   * or preempted.
@@ -213,11 +217,11 @@ static void sve_free(struct task_struct *task)
   * thread_struct is known to be up to date, when preparing to enter
   * userspace.
   *
- * Softirqs (and preemption) must be disabled.
+ * Preemption must be disabled.


[*] That's not enough: we need to be in kernel_neon_disable()...
_enable() (i.e., kernel_neon_busy needs to be true to prevent softirqs
messing with the FPSIMD state).


How about not mentioning preemption at all and just say:

"The fpsimd context should be acquired before hand".

This would help if we ever decide to protect critical section differently.


Yes, or even better, name the function used to do this (i.e.,
kernel_neon_disable() or get_cpu_fpsimd_context() or whatever it's
called).


... would make the comments simpler because we would have only one 
possible case to care.


Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-11 Thread Julien Grall

Hi Sebastian,

On 4/5/19 4:42 PM, Sebastian Andrzej Siewior wrote:

On 2019-04-05 16:17:50 [+0100], Julien Grall wrote:

Hi,

Hi,


A per-CPU lock? It has to be a raw_spinlock_t because a normal
spin_lock() / local_lock() would allow scheduling and might be taken as
part of the context switch or soon after.

raw_spinlock_t would not work here without disabling preemption. Otherwise
you may end up to recurse on the lock and therefore deadlock. But then it
raise the question of the usefulness of the lock here.

However, I don't really understand why allowing the scheduling would be a
problem here. Is it a concern because we will waste cycle trying to
restore/save a context that will be scratched as soon as we release the
lock?


If you hold the lock within the kernel thread and every kernel thread
acquires it before doing any SIMD operations then you are good. It could
be a sleeping lock. What happens if you hold the lock, are scheduled out
and a user task is about to be scheduled? How do you force the kernel
thread out / give up the FPU registers?


You would need the thread out to finish running the critical section 
before the next thread to run. I was under the impression with sleeping 
lock, the priority of the thread out would get bumped to finish quickly 
the critical section.


I agree it means that you will waste time restoring registers that will 
get trashed right after you leave the critical sections. This is 
probably not really efficient.


Anyway, that was only a suggestion I haven't fully thought through. I 
have no plan to do more work than this patch in the fpsimd context switch.




That preempt_disable() + local_bh_disable() might not be the pretties
thing but how bad is it actually?


From the measurement I did on non-RT, it is beneficial to keep the 
softirq enabled (see [1]).


I don't have platform where FPSIMD is used in softirqs (or at least not 
by default). So for testing purpose, I wrote a tasklet (based on 
hrtimer) using the FPSIMD registers every 1ms if it is allowed (i.e 
may_use_simd() returns true). I let it run for a while and notice that 
the tasklet will be executed only 0.15% of the time when !may_use_simd().


Furthermore, from what I understood in this thread, there are few 
limited use cases where FPSIMD will be used in softirqs. So it seems 
better to me to avoid disabling softirqs at least in non-RT.


For the RT, aren't all the softirqs handled in a thread? So what would 
be the benefits to disable softirqs if we already disable preemption?


In any case, this patch introduces new helpers (get_cpu_fpsimd_context 
and put_cpu_fpsimd_context) to delimit regions using the HW FPSIMD. So 
it would be easy to modify the behavior if we wanted too.



Latency wise you can't schedule(). From RT point of view you need to
enable preemption while going from page to page because of the possible
kmap() or kmalloc() (on baldy aligned src/dst) with the crypto's
page-walk code.


Make sense. However I don't think we can keep enable the preemption with 
the current implementation of FPSIMD context switch. I noticed you 
disabled crypto for Arm64 because of allocations, I have a todo to look 
at what we can do.


Cheers,

[1] https://marc.info/?l=linux-rt-users&m=155499183812211&w=2



Sebastian



--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-11 Thread Julien Grall

Hi Dave,

On 4/4/19 11:52 AM, Dave Martin wrote:

On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:
I'm not sure how this patch will affect context switch overhead, so it
would be good to see hackbench numbers (or similar).


I finally have some numbers for this patch. The benchmark was ran using 
Linux 5.1-rc4 and defconfig.


On Juno2:
   * hackbench 100 process 1000 (10 times)
   * .7% quicker

On ThunderX 2:
   * hackbench 1000 process 1000 (20 times)
   * 3.4% quicker

I am happy to try other benchmark if you think it is useful.

Anyway, I will resend the series with the comments addressed.

Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-05 Thread Sebastian Andrzej Siewior
On 2019-04-05 16:17:50 [+0100], Julien Grall wrote:
> Hi,
Hi,

> > A per-CPU lock? It has to be a raw_spinlock_t because a normal
> > spin_lock() / local_lock() would allow scheduling and might be taken as
> > part of the context switch or soon after.
> raw_spinlock_t would not work here without disabling preemption. Otherwise
> you may end up to recurse on the lock and therefore deadlock. But then it
> raise the question of the usefulness of the lock here.
> 
> However, I don't really understand why allowing the scheduling would be a
> problem here. Is it a concern because we will waste cycle trying to
> restore/save a context that will be scratched as soon as we release the
> lock?

If you hold the lock within the kernel thread and every kernel thread
acquires it before doing any SIMD operations then you are good. It could
be a sleeping lock. What happens if you hold the lock, are scheduled out
and a user task is about to be scheduled? How do you force the kernel
thread out / give up the FPU registers?

That preempt_disable() + local_bh_disable() might not be the pretties
thing but how bad is it actually?
Latency wise you can't schedule(). From RT point of view you need to
enable preemption while going from page to page because of the possible
kmap() or kmalloc() (on baldy aligned src/dst) with the crypto's
page-walk code.
If that is not good enough latency wise you could do
kernel_fpu_resched() after a few iterations. Currently I'm trying to get
kernel_fpu_begin()/end() cheap on x86 so it doesn't always store/restore
the FPU context. Then kernel_fpu_resched() shouldn't be that bad.

> Cheers,

Sebastian


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-05 Thread Julien Grall

Hi,

On 05/04/2019 15:39, Sebastian Andrzej Siewior wrote:

On 2019-04-05 10:02:45 [+0100], Julien Grall wrote:

RT folks already saw this corruption because local_bh_disable() does not
preempt on RT. They are carrying a patch (see "arm64: fpsimd: use
preemp_disable in addition to local_bh_disable()") to disable preemption
along with local_bh_disable().

Alternatively, Julia suggested to introduce a per-cpu lock to protect the
state. I am thinking to defer this for a follow-up patch. The changes in
this patch should make it easier because we now have helper to mark the
critical section.


A per-CPU lock? It has to be a raw_spinlock_t because a normal
spin_lock() / local_lock() would allow scheduling and might be taken as
part of the context switch or soon after.
raw_spinlock_t would not work here without disabling preemption. Otherwise you 
may end up to recurse on the lock and therefore deadlock. But then it raise the 
question of the usefulness of the lock here.


However, I don't really understand why allowing the scheduling would be a 
problem here. Is it a concern because we will waste cycle trying to restore/save 
a context that will be scratched as soon as we release the lock?


Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-05 Thread Dave Martin
On Fri, Apr 05, 2019 at 10:02:45AM +0100, Julien Grall wrote:
> Hi Dave,
> 
> Thank you for the review.
> 
> On 4/4/19 11:52 AM, Dave Martin wrote:
> >On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:
> >>For RT-linux, it might be possible to use migrate_{enable, disable}. I
> >>am quite new with RT and have some trouble to understand the semantics
> >>of migrate_{enable, disable}. So far, I am still unsure if it is possible
> >>to run another userspace task on the same CPU while getting preempted
> >>when the migration is disabled.
> >
> >(Leaving aside the RT aspects for now, but if migrate_{enable,disable}
> >is costly it might not be the best thing to use deep in context switch
> >paths, even if is technically correct...)
> 
> Based on the discussion in this thread, migrate_enable/migrate_disable is
> not suitable in this context.
> 
> The goal of those helpers is to pin the task to the current CPU. On RT, it
> will not disable the preemption. So the current task can be preempted by a
> task with higher priority.
> 
> The next task may require to use the FPSIMD and will potentially result to
> corrupt the state.
> 
> RT folks already saw this corruption because local_bh_disable() does not
> preempt on RT. They are carrying a patch (see "arm64: fpsimd: use
> preemp_disable in addition to local_bh_disable()") to disable preemption
> along with local_bh_disable().
> 
> Alternatively, Julia suggested to introduce a per-cpu lock to protect the
> state. I am thinking to defer this for a follow-up patch. The changes in
> this patch should make it easier because we now have helper to mark the
> critical section.

I'll leave it for the RT folks to comment on this.  (I see Sebastian
already did.)

> 
> >
> >>---
> >>  arch/arm64/include/asm/simd.h |  4 +--
> >>  arch/arm64/kernel/fpsimd.c| 76 
> >> +--
> >>  2 files changed, 46 insertions(+), 34 deletions(-)
> >>
> >>diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
> >>index 6495cc51246f..94c0dac508aa 100644
> >>--- a/arch/arm64/include/asm/simd.h
> >>+++ b/arch/arm64/include/asm/simd.h
> >>@@ -15,10 +15,10 @@
> >>  #include 
> >>  #include 
> >>-#ifdef CONFIG_KERNEL_MODE_NEON
> >>-
> >>  DECLARE_PER_CPU(bool, kernel_neon_busy);
> >
> >Why make this unconditional?  This declaration is only here for
> >may_use_simd() to use.  The stub version of may_use_simd() for the
> >!CONFIG_KERNEL_MODE_NEON case doesn't touch it.
> 
> kernel_neon_busy will be used in fpsimd.c even when with
> !CONFIG_KERNEL_MODE_NEON. So it makes sense to also declare it in the header
> even if not used.

Ah yes, I missed that.

We don't need all this logic in the !CONFIG_KERNEL_MODE_NEON case of
course, but I'm not sure it's worth optimising that special case.
Especially so if we don't see any significant impact in ctxsw-heavy
benchmarks.

> Another solution is to avoid expose kernel_neon_busy outside of fpsimd.c and
> use an helper.

Probably not worth it for now.

> >>+#ifdef CONFIG_KERNEL_MODE_NEON
> >>+
> >>  /*
> >>   * may_use_simd - whether it is allowable at this time to issue SIMD
> >>   *instructions or access the SIMD register file
> >>diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> >>index 5ebe73b69961..b7e5dac26190 100644
> >>--- a/arch/arm64/kernel/fpsimd.c
> >>+++ b/arch/arm64/kernel/fpsimd.c
> >>@@ -90,7 +90,8 @@
> >>   * To prevent this from racing with the manipulation of the task's FPSIMD 
> >> state
> >>   * from task context and thereby corrupting the state, it is necessary to
> >>   * protect any manipulation of a task's fpsimd_state or 
> >> TIF_FOREIGN_FPSTATE
> >>- * flag with local_bh_disable() unless softirqs are already masked.
> >>+ * flag with kernel_neon_{disable, enable}. This will still allow softirqs 
> >>to
> >>+ * run but prevent them to use FPSIMD.
> >>   *
> >>   * For a certain task, the sequence may look something like this:
> >>   * - the task gets scheduled in; if both the task's fpsimd_cpu field
> >>@@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
> >>  #endif /* ! CONFIG_ARM64_SVE */
> >>+static void kernel_neon_disable(void);
> >>+static void kernel_neon_enable(void);
> >
> >I find these names a bit confusing: _disable() actualy enables FPSIMD/SVE
> >context access for the current context (i.e., makes it safe).
> >
> >Since these also disable/enable preemption, perhaps we can align them
> >with the existing get_cpu()/put_cpu(), something like:
> >
> >void get_cpu_fpsimd_context();
> >vold put_cpu_fpsimd_context();
> >
> >If you consider it's worth adding the checking helper I alluded to
> >above, it could then be called something like:
> >
> >bool have_cpu_fpsimd_context();
> 
> I am not sure where you suggested a checking helper above. Do you refer to
> exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?

Hmmm, looks like I got my reply out of order here.

I meant the helper (if any) to check
!preemptible() &&

Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-05 Thread Sebastian Andrzej Siewior
On 2019-04-05 10:02:45 [+0100], Julien Grall wrote:
> RT folks already saw this corruption because local_bh_disable() does not
> preempt on RT. They are carrying a patch (see "arm64: fpsimd: use
> preemp_disable in addition to local_bh_disable()") to disable preemption
> along with local_bh_disable().
> 
> Alternatively, Julia suggested to introduce a per-cpu lock to protect the
> state. I am thinking to defer this for a follow-up patch. The changes in
> this patch should make it easier because we now have helper to mark the
> critical section.

A per-CPU lock? It has to be a raw_spinlock_t because a normal
spin_lock() / local_lock() would allow scheduling and might be taken as
part of the context switch or soon after.

Sebastian


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-05 Thread Julien Grall

Hi Dave,

Thank you for the review.

On 4/4/19 11:52 AM, Dave Martin wrote:

On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:

For RT-linux, it might be possible to use migrate_{enable, disable}. I
am quite new with RT and have some trouble to understand the semantics
of migrate_{enable, disable}. So far, I am still unsure if it is possible
to run another userspace task on the same CPU while getting preempted
when the migration is disabled.


(Leaving aside the RT aspects for now, but if migrate_{enable,disable}
is costly it might not be the best thing to use deep in context switch
paths, even if is technically correct...)


Based on the discussion in this thread, migrate_enable/migrate_disable 
is not suitable in this context.


The goal of those helpers is to pin the task to the current CPU. On RT, 
it will not disable the preemption. So the current task can be preempted 
by a task with higher priority.


The next task may require to use the FPSIMD and will potentially result 
to corrupt the state.


RT folks already saw this corruption because local_bh_disable() does not 
preempt on RT. They are carrying a patch (see "arm64: fpsimd: use 
preemp_disable in addition to local_bh_disable()") to disable preemption 
along with local_bh_disable().


Alternatively, Julia suggested to introduce a per-cpu lock to protect 
the state. I am thinking to defer this for a follow-up patch. The 
changes in this patch should make it easier because we now have helper 
to mark the critical section.





---
  arch/arm64/include/asm/simd.h |  4 +--
  arch/arm64/kernel/fpsimd.c| 76 +--
  2 files changed, 46 insertions(+), 34 deletions(-)

diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
index 6495cc51246f..94c0dac508aa 100644
--- a/arch/arm64/include/asm/simd.h
+++ b/arch/arm64/include/asm/simd.h
@@ -15,10 +15,10 @@
  #include 
  #include 
  
-#ifdef CONFIG_KERNEL_MODE_NEON

-
  DECLARE_PER_CPU(bool, kernel_neon_busy);


Why make this unconditional?  This declaration is only here for
may_use_simd() to use.  The stub version of may_use_simd() for the
!CONFIG_KERNEL_MODE_NEON case doesn't touch it.


kernel_neon_busy will be used in fpsimd.c even when with 
!CONFIG_KERNEL_MODE_NEON. So it makes sense to also declare it in the 
header even if not used.


Another solution is to avoid expose kernel_neon_busy outside of fpsimd.c 
and use an helper.




  
+#ifdef CONFIG_KERNEL_MODE_NEON

+
  /*
   * may_use_simd - whether it is allowable at this time to issue SIMD
   *instructions or access the SIMD register file
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 5ebe73b69961..b7e5dac26190 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -90,7 +90,8 @@
   * To prevent this from racing with the manipulation of the task's FPSIMD 
state
   * from task context and thereby corrupting the state, it is necessary to
   * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
- * flag with local_bh_disable() unless softirqs are already masked.
+ * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
+ * run but prevent them to use FPSIMD.
   *
   * For a certain task, the sequence may look something like this:
   * - the task gets scheduled in; if both the task's fpsimd_cpu field
@@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
  
  #endif /* ! CONFIG_ARM64_SVE */
  
+static void kernel_neon_disable(void);

+static void kernel_neon_enable(void);


I find these names a bit confusing: _disable() actualy enables FPSIMD/SVE
context access for the current context (i.e., makes it safe).

Since these also disable/enable preemption, perhaps we can align them
with the existing get_cpu()/put_cpu(), something like:

void get_cpu_fpsimd_context();
vold put_cpu_fpsimd_context();

If you consider it's worth adding the checking helper I alluded to
above, it could then be called something like:

bool have_cpu_fpsimd_context();


I am not sure where you suggested a checking helper above. Do you refer 
to exposing kernel_neon_busy even for !CONFIG_KERNEL_MODE_NEON?





+
  /*
   * Call __sve_free() directly only if you know task can't be scheduled
   * or preempted.
@@ -213,11 +217,11 @@ static void sve_free(struct task_struct *task)
   * thread_struct is known to be up to date, when preparing to enter
   * userspace.
   *
- * Softirqs (and preemption) must be disabled.
+ * Preemption must be disabled.


[*] That's not enough: we need to be in kernel_neon_disable()...
_enable() (i.e., kernel_neon_busy needs to be true to prevent softirqs
messing with the FPSIMD state).


How about not mentioning preemption at all and just say:

"The fpsimd context should be acquired before hand".

This would help if we ever decide to protect critical section differently.




   */
  static void task_fpsimd_load(void)
  {
-   WARN_ON(!in_softirq() && !irqs_disabled());
+ 

Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-04-04 Thread Dave Martin
On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:
> When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
> the kernel may be able to use FPSIMD/SVE. This is for instance the case
> for crypto code.
> 
> Any use of FPSIMD/SVE in the kernel are clearly marked by using the
> function kernel_neon_{begin, end}. Furthermore, this can only be used
> when may_use_simd() returns true.
> 
> The current implementation of may_use_simd() allows softirq to use
> FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true).
> When in used, softirqs usually fallback to a software method.
> 
> At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled
> when touching the FPSIMD/SVE context. This has the drawback to disable
> all softirqs even if they are not using FPSIMD/SVE.
> 
> As a softirq should not rely on been able to use simd at a given time,
> there are limited reason to keep softirq disabled when touching the
> FPSIMD/SVE context. Instead, we can only disable preemption and tell
> the NEON unit is currently in use.
> 
> This patch introduces two new helpers kernel_neon_{disable, enable} to
> mark the area using FPSIMD/SVE context and use them in replacement of
> local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are
> also re-implemented to use the new helpers.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> 
> I have been exploring this solution as an alternative approach to the RT
> patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()".
> 
> So far, the patch has only been lightly tested.
> 
> For RT-linux, it might be possible to use migrate_{enable, disable}. I
> am quite new with RT and have some trouble to understand the semantics
> of migrate_{enable, disable}. So far, I am still unsure if it is possible
> to run another userspace task on the same CPU while getting preempted
> when the migration is disabled.

(Leaving aside the RT aspects for now, but if migrate_{enable,disable}
is costly it might not be the best thing to use deep in context switch
paths, even if is technically correct...)

> ---
>  arch/arm64/include/asm/simd.h |  4 +--
>  arch/arm64/kernel/fpsimd.c| 76 
> +--
>  2 files changed, 46 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
> index 6495cc51246f..94c0dac508aa 100644
> --- a/arch/arm64/include/asm/simd.h
> +++ b/arch/arm64/include/asm/simd.h
> @@ -15,10 +15,10 @@
>  #include 
>  #include 
>  
> -#ifdef CONFIG_KERNEL_MODE_NEON
> -
>  DECLARE_PER_CPU(bool, kernel_neon_busy);

Why make this unconditional?  This declaration is only here for
may_use_simd() to use.  The stub version of may_use_simd() for the
!CONFIG_KERNEL_MODE_NEON case doesn't touch it.

>  
> +#ifdef CONFIG_KERNEL_MODE_NEON
> +
>  /*
>   * may_use_simd - whether it is allowable at this time to issue SIMD
>   *instructions or access the SIMD register file
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 5ebe73b69961..b7e5dac26190 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -90,7 +90,8 @@
>   * To prevent this from racing with the manipulation of the task's FPSIMD 
> state
>   * from task context and thereby corrupting the state, it is necessary to
>   * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
> - * flag with local_bh_disable() unless softirqs are already masked.
> + * flag with kernel_neon_{disable, enable}. This will still allow softirqs to
> + * run but prevent them to use FPSIMD.
>   *
>   * For a certain task, the sequence may look something like this:
>   * - the task gets scheduled in; if both the task's fpsimd_cpu field
> @@ -142,6 +143,9 @@ extern void __percpu *efi_sve_state;
>  
>  #endif /* ! CONFIG_ARM64_SVE */
>  
> +static void kernel_neon_disable(void);
> +static void kernel_neon_enable(void);

I find these names a bit confusing: _disable() actualy enables FPSIMD/SVE
context access for the current context (i.e., makes it safe).

Since these also disable/enable preemption, perhaps we can align them
with the existing get_cpu()/put_cpu(), something like:

void get_cpu_fpsimd_context();
vold put_cpu_fpsimd_context();

If you consider it's worth adding the checking helper I alluded to
above, it could then be called something like:

bool have_cpu_fpsimd_context();

> +
>  /*
>   * Call __sve_free() directly only if you know task can't be scheduled
>   * or preempted.
> @@ -213,11 +217,11 @@ static void sve_free(struct task_struct *task)
>   * thread_struct is known to be up to date, when preparing to enter
>   * userspace.
>   *
> - * Softirqs (and preemption) must be disabled.
> + * Preemption must be disabled.

[*] That's not enough: we need to be in kernel_neon_disable()...
_enable() (i.e., kernel_neon_busy needs to be true to prevent softirqs
messing with the FPSIMD state).

>   */
>  static void task_f

Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-03-15 Thread Julien Grall




On 15/03/2019 10:06, Dave Martin wrote:

On Thu, Mar 14, 2019 at 06:07:19PM +, Julien Grall wrote:

Hi Sebastian,

On 3/4/19 5:25 PM, Sebastian Andrzej Siewior wrote:


[...]


It would increase the softirq latency but the question is how bad would
it be. It would continue once the SIMD section is done.


On Arm, the kernel may use either FPSIMD or SVE (if supported by the
platform). While the FPSIMD context is fairly small (~4K), the SVE context
can be up to ~64KB.


It's not quite as bad as that.  The FPSIMD state is ~0.5K, with the SVE
state size being up to ~8.5K (though for today's implementations ~2K may
be considered typical).


Whoops, I forgot to divide by 8. Thank you for spotting it!

Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-03-15 Thread Dave Martin
On Thu, Mar 14, 2019 at 06:07:19PM +, Julien Grall wrote:
> Hi Sebastian,
> 
> On 3/4/19 5:25 PM, Sebastian Andrzej Siewior wrote:

[...]

> >It would increase the softirq latency but the question is how bad would
> >it be. It would continue once the SIMD section is done.
> 
> On Arm, the kernel may use either FPSIMD or SVE (if supported by the
> platform). While the FPSIMD context is fairly small (~4K), the SVE context
> can be up to ~64KB.

It's not quite as bad as that.  The FPSIMD state is ~0.5K, with the SVE
state size being up to ~8.5K (though for today's implementations ~2K may
be considered typical).

For comparision, I believe AVX-512 has ~2K of state.

Cheers
---Dave


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-03-14 Thread Julien Grall

Hi Sebastian,

On 3/4/19 5:25 PM, Sebastian Andrzej Siewior wrote:

On 2019-02-18 15:07:51 [+], Julien Grall wrote:

Hi,

Hi,


Wouldn't this arbitrarily increase softirq latency?  Unconditionally
forbidding SIMD in softirq might make more sense.  It depends on how
important the use cases are...


It would increase the softirq latency but the question is how bad would
it be. It would continue once the SIMD section is done.


On Arm, the kernel may use either FPSIMD or SVE (if supported by the 
platform). While the FPSIMD context is fairly small (~4K), the SVE 
context can be up to ~64KB.



If you allow it but made it impossible to use (and use the software
fallback) then it would slow down the processing. So…


This is a fair point. However, the use of crypto in softirqs seem to be 
limited. So I am wondering whether disabling softirq in all the case is 
worth it.


Would it be possible to consider to forbid/warn about using crypto in 
softirqs?





Looking at the commit message from cb84d11e1625 "arm64: neon: Remove support
for nested or hardirq kernel-mode NEON", one of the use case for crypto in
softirq is certain mac80211 drivers.

Is there any other use case for use crypto in softirqs?


mac80211 does it for some wifi drivers. There used to be IPsec but I
*think* this moved to the "parallel processing kthread".


I was able to find my way through mac80211 and confirm the use a taslket 
and therefore softirqs. However, I got lost in the ipsec code.



During my FPU rework on x86 I didn't find anything that does the
processing in softirq (on my machine) so I hacked something so that I
could test that I didn't break anything…


This is the same on the platform I have been using for testing.

Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-03-04 Thread Sebastian Andrzej Siewior
On 2019-02-18 15:07:51 [+], Julien Grall wrote:
> Hi,
Hi,

> > Wouldn't this arbitrarily increase softirq latency?  Unconditionally
> > forbidding SIMD in softirq might make more sense.  It depends on how
> > important the use cases are...

It would increase the softirq latency but the question is how bad would
it be. It would continue once the SIMD section is done.
If you allow it but made it impossible to use (and use the software
fallback) then it would slow down the processing. So…

> Looking at the commit message from cb84d11e1625 "arm64: neon: Remove support
> for nested or hardirq kernel-mode NEON", one of the use case for crypto in
> softirq is certain mac80211 drivers.
> 
> Is there any other use case for use crypto in softirqs?

mac80211 does it for some wifi drivers. There used to be IPsec but I
*think* this moved to the "parallel processing kthread".
During my FPU rework on x86 I didn't find anything that does the
processing in softirq (on my machine) so I hacked something so that I
could test that I didn't break anything…

> Cheers,
> 

Sebastian


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-18 Thread Julien Grall

Hi all,

On 08/02/2019 16:55, Julien Grall wrote:

@@ -1094,9 +1113,7 @@ void kernel_neon_begin(void)
/* Invalidate any task state remaining in the fpsimd regs: */
fpsimd_flush_cpu_state();
  
-	preempt_disable();

-
-   local_bh_enable();
+   kernel_neon_enable();


I found one error in the patch. We should not re-enable NEON from 
kernel_neon_begin(). I will resend a patch once the thread settle down.


Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-18 Thread Julien Grall

Hi,

On 14/02/2019 10:34, Dave Martin wrote:

On Wed, Feb 13, 2019 at 05:52:27PM +0100, Sebastian Andrzej Siewior wrote:

On 2019-02-13 16:40:00 [+0100], Ard Biesheuvel wrote:

This is equal what x86 is currently doing. The naming is slightly
different, there is irq_fpu_usable().


Yes, I think it's basically the same idea.

It's been evolving a bit on both sides, but is quite similar now.



may_use_simd() only exists because we have a generic crypto SIMD
helper, and so we needed something arch agnostic to wrap around
irq_fpu_usable()


My question was more if this is helpful and we want to keep or if
it would be better to remove it and always disable BH as part of SIMD
operations.


Wouldn't this arbitrarily increase softirq latency?  Unconditionally
forbidding SIMD in softirq might make more sense.  It depends on how
important the use cases are...


Looking at the commit message from cb84d11e1625 "arm64: neon: Remove support for 
nested or hardirq kernel-mode NEON", one of the use case for crypto in softirq 
is certain mac80211 drivers.


Is there any other use case for use crypto in softirqs?

Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-18 Thread Julien Grall

Hello Sebastian,

On 13/02/2019 14:30, Sebastian Andrzej Siewior wrote:

On 2019-02-08 16:55:13 [+], Julien Grall wrote:

When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
the kernel may be able to use FPSIMD/SVE. This is for instance the case
for crypto code.

Any use of FPSIMD/SVE in the kernel are clearly marked by using the
function kernel_neon_{begin, end}. Furthermore, this can only be used
when may_use_simd() returns true.


This is equal what x86 is currently doing. The naming is slightly
different, there is irq_fpu_usable().
The idea behind the patch was taken from x86. On x86, softirq does not seem to 
be disabled when context switching the FPUs registers.





The current implementation of may_use_simd() allows softirq to use
FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true).
When in used, softirqs usually fallback to a software method.

At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled
when touching the FPSIMD/SVE context. This has the drawback to disable
all softirqs even if they are not using FPSIMD/SVE.


Is this bad? This means also that your crypto code will not be
interrupted by a softirq. Also if you would get rid of it, you could
avoid the software fallback in case may_use_simd() says false.


There seem to have some misunderstanding about the purpose of this patch. Any 
use of crypto in the kernel is only protected by preempt_disable(). softirqs are 
only disabled when context switching the FPU register between two userspace tasks.


From the commit log cb84d11e1625 "arm64: neon: Remove support for nested or 
hardirq kernel-mode NEON" this was done to protect against rare softirqs use 
crypto. It seems to me this is a bit overkill to increase softirq latency if 
they barely use FPSIMD/SVE. Indeed, the SVE context can be quite large, 
therefore it can take some times to save/restore it.


[...]


For RT-linux, it might be possible to use migrate_{enable, disable}. I
am quite new with RT and have some trouble to understand the semantics
of migrate_{enable, disable}. So far, I am still unsure if it is possible
to run another userspace task on the same CPU while getting preempted
when the migration is disabled.


In RT:
- preemt_disable() is the same as !RT. A thread can not be suspend. An
   interrupt may interrupt. However on RT we have threaded interrupts so
   the interrupt is limited to the first-level handler (not the threaded
   handler).

- migrate_disable() means that the thread can not be moved to another
   CPU. It can be suspended.

- local_bh_disable() disables the BH: No softirq can run. In RT
   local_bh_disable() does not inherit preempt_disable(). Two different
   softirqs can be executed in parallel.
   The BH is usually invoked at the end of the threaded interrupt
   (because the threaded interrupt handler raises the softirq). It can
   also run in the ksoftirqd.

Usually you should not get preempted in a migrate_disable() section. A
SCHED_OTHER task should not interrupt another SCHED_OTHER task in a
migrate_disable() section. A task with a higher priority (a RT/DL task)
will. Since threaded interrupts run with a RT priority of 50, they will
interrupt your task in a migrate_disable() section.


Thank you for the explanation!

Cheers,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-18 Thread Julien Grall

On 12/02/2019 17:13, Julia Cartwright wrote:

Hello Julien-


Hello Julien,



On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:

When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
the kernel may be able to use FPSIMD/SVE. This is for instance the case
for crypto code.

Any use of FPSIMD/SVE in the kernel are clearly marked by using the
function kernel_neon_{begin, end}. Furthermore, this can only be used
when may_use_simd() returns true.

The current implementation of may_use_simd() allows softirq to use
FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true).
When in used, softirqs usually fallback to a software method.

At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled
when touching the FPSIMD/SVE context. This has the drawback to disable
all softirqs even if they are not using FPSIMD/SVE.

As a softirq should not rely on been able to use simd at a given time,
there are limited reason to keep softirq disabled when touching the
FPSIMD/SVE context. Instead, we can only disable preemption and tell
the NEON unit is currently in use.

This patch introduces two new helpers kernel_neon_{disable, enable} to
mark the area using FPSIMD/SVE context and use them in replacement of
local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are
also re-implemented to use the new helpers.

Signed-off-by: Julien Grall 

---

I have been exploring this solution as an alternative approach to the RT
patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()".

So far, the patch has only been lightly tested.

For RT-linux, it might be possible to use migrate_{enable, disable}. I
am quite new with RT and have some trouble to understand the semantics
of migrate_{enable, disable}. So far, I am still unsure if it is possible
to run another userspace task on the same CPU while getting preempted
when the migration is disabled.


Yes, a thread in a migrate_disable()-protected critical section can be
preempted, and another thread on the same CPU may enter the critical
section.

If it's necessary to prevent this from happening, then you need to also
make use of a per-CPU mutex.  On RT, this would do the right thing
w.r.t. priority inheritance.


Thank you for the explanation, I now understand better the concept of 
migrate_disable.


Best regards,

--
Julien Grall


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-14 Thread Dave Martin
On Wed, Feb 13, 2019 at 05:52:27PM +0100, Sebastian Andrzej Siewior wrote:
> On 2019-02-13 16:40:00 [+0100], Ard Biesheuvel wrote:
> > > > This is equal what x86 is currently doing. The naming is slightly
> > > > different, there is irq_fpu_usable().
> > >
> > > Yes, I think it's basically the same idea.
> > >
> > > It's been evolving a bit on both sides, but is quite similar now.
> > >
> > 
> > may_use_simd() only exists because we have a generic crypto SIMD
> > helper, and so we needed something arch agnostic to wrap around
> > irq_fpu_usable()
> 
> My question was more if this is helpful and we want to keep or if
> it would be better to remove it and always disable BH as part of SIMD
> operations.

Wouldn't this arbitrarily increase softirq latency?  Unconditionally
forbidding SIMD in softirq might make more sense.  It depends on how
important the use cases are...

Cheers
---Dave


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-13 Thread Sebastian Andrzej Siewior
On 2019-02-13 16:40:00 [+0100], Ard Biesheuvel wrote:
> > > This is equal what x86 is currently doing. The naming is slightly
> > > different, there is irq_fpu_usable().
> >
> > Yes, I think it's basically the same idea.
> >
> > It's been evolving a bit on both sides, but is quite similar now.
> >
> 
> may_use_simd() only exists because we have a generic crypto SIMD
> helper, and so we needed something arch agnostic to wrap around
> irq_fpu_usable()

My question was more if this is helpful and we want to keep or if
it would be better to remove it and always disable BH as part of SIMD
operations.

Sebastian


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-13 Thread Sebastian Andrzej Siewior
On 2019-02-13 15:36:30 [+], Dave Martin wrote:
> On Wed, Feb 13, 2019 at 03:30:29PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2019-02-08 16:55:13 [+], Julien Grall wrote:
> > > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
> > > the kernel may be able to use FPSIMD/SVE. This is for instance the case
> > > for crypto code.
> > > 
> > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the
> > > function kernel_neon_{begin, end}. Furthermore, this can only be used
> > > when may_use_simd() returns true.
> > 
> > This is equal what x86 is currently doing. The naming is slightly
> > different, there is irq_fpu_usable().
> 
> Yes, I think it's basically the same idea.
> 
> It's been evolving a bit on both sides, but is quite similar now.

I though that this is complicated and wanted to submit a patch to remove
irq_fpu_usable() and disable BH as part of kernel_fpu_begin() (but have
currently the onother FPU series ongoing which I want to finish first).

…
> "Usually" is probably not good enough if another task can run: if the
> preempting task enters userspace then the vector registers are needed
> for its use, which is tricky to arrange if the registers are currently
> in use by context switch logic running in the first task.

Yes.

> My current feeling is that we probably have to stick with
> preempt_disable() here, but hopefully we can get rid of
> local_bh_disable() (as proposed) with no ill effects...
> 
> Does that sound sensible?

If you want to stick with may_use_simd() then yes.

> Cheers
> ---Dave

Sebastian


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-13 Thread Dave Martin
On Wed, Feb 13, 2019 at 04:40:00PM +0100, Ard Biesheuvel wrote:
> On Wed, 13 Feb 2019 at 16:36, Dave Martin  wrote:
> >
> > On Wed, Feb 13, 2019 at 03:30:29PM +0100, Sebastian Andrzej Siewior wrote:
> > > On 2019-02-08 16:55:13 [+], Julien Grall wrote:
> > > > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
> > > > the kernel may be able to use FPSIMD/SVE. This is for instance the case
> > > > for crypto code.
> > > >
> > > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the
> > > > function kernel_neon_{begin, end}. Furthermore, this can only be used
> > > > when may_use_simd() returns true.
> > >
> > > This is equal what x86 is currently doing. The naming is slightly
> > > different, there is irq_fpu_usable().
> >
> > Yes, I think it's basically the same idea.
> >
> > It's been evolving a bit on both sides, but is quite similar now.
> >
> 
> may_use_simd() only exists because we have a generic crypto SIMD
> helper, and so we needed something arch agnostic to wrap around
> irq_fpu_usable()

[...]

Sounds plausible.

Cheers
---Dave


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-13 Thread Ard Biesheuvel
On Wed, 13 Feb 2019 at 16:36, Dave Martin  wrote:
>
> On Wed, Feb 13, 2019 at 03:30:29PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2019-02-08 16:55:13 [+], Julien Grall wrote:
> > > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
> > > the kernel may be able to use FPSIMD/SVE. This is for instance the case
> > > for crypto code.
> > >
> > > Any use of FPSIMD/SVE in the kernel are clearly marked by using the
> > > function kernel_neon_{begin, end}. Furthermore, this can only be used
> > > when may_use_simd() returns true.
> >
> > This is equal what x86 is currently doing. The naming is slightly
> > different, there is irq_fpu_usable().
>
> Yes, I think it's basically the same idea.
>
> It's been evolving a bit on both sides, but is quite similar now.
>

may_use_simd() only exists because we have a generic crypto SIMD
helper, and so we needed something arch agnostic to wrap around
irq_fpu_usable()

> > > The current implementation of may_use_simd() allows softirq to use
> > > FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true).
> > > When in used, softirqs usually fallback to a software method.
> > >
> > > At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled
> > > when touching the FPSIMD/SVE context. This has the drawback to disable
> > > all softirqs even if they are not using FPSIMD/SVE.
> >
> > Is this bad? This means also that your crypto code will not be
> > interrupted by a softirq. Also if you would get rid of it, you could
> > avoid the software fallback in case may_use_simd() says false.
>
> Masking softirqs during kernel_neon_begin()..._end() is unlikely to be a
> huge problem, but currently we block softirq during all context switch
> operations that act on the CPU vector registers.
>
> The reasons for this are somewhat historical, and IIRC predated the
> requirement for softirq users of kernel-mode NEON to include the
> may_use_simd() check and implement a fallback path on arm64.
>
> Now that softirq code is required to work around kernel-mode NEON being
> temporarily unusable, masking softirqs completely during context switch
> etc. should no longer be necessary.
>
> > > As a softirq should not rely on been able to use simd at a given time,
> > > there are limited reason to keep softirq disabled when touching the
> > > FPSIMD/SVE context. Instead, we can only disable preemption and tell
> > > the NEON unit is currently in use.
> > >
> > > This patch introduces two new helpers kernel_neon_{disable, enable} to
> > > mark the area using FPSIMD/SVE context and use them in replacement of
> > > local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are
> > > also re-implemented to use the new helpers.
> > >
> > > Signed-off-by: Julien Grall 
> > >
> > > ---
> > >
> > > I have been exploring this solution as an alternative approach to the RT
> > > patch "arm64: fpsimd: use preemp_disable in addition to 
> > > local_bh_disable()".
> > >
> > > So far, the patch has only been lightly tested.
> > >
> > > For RT-linux, it might be possible to use migrate_{enable, disable}. I
> > > am quite new with RT and have some trouble to understand the semantics
> > > of migrate_{enable, disable}. So far, I am still unsure if it is possible
> > > to run another userspace task on the same CPU while getting preempted
> > > when the migration is disabled.
> >
> > In RT:
> > - preemt_disable() is the same as !RT. A thread can not be suspend. An
> >   interrupt may interrupt. However on RT we have threaded interrupts so
> >   the interrupt is limited to the first-level handler (not the threaded
> >   handler).
> >
> > - migrate_disable() means that the thread can not be moved to another
> >   CPU. It can be suspended.
> >
> > - local_bh_disable() disables the BH: No softirq can run. In RT
> >   local_bh_disable() does not inherit preempt_disable(). Two different
> >   softirqs can be executed in parallel.
> >   The BH is usually invoked at the end of the threaded interrupt
> >   (because the threaded interrupt handler raises the softirq). It can
> >   also run in the ksoftirqd.
> >
> > Usually you should not get preempted in a migrate_disable() section. A
> > SCHED_OTHER task should not interrupt another SCHED_OTHER task in a
> > migrate_disable() section. A task with a higher priority (a RT/DL task)
> > will. Since threaded interrupts run with a RT priority of 50, they will
> > interrupt your task in a migrate_disable() section.
>
> "Usually" is probably not good enough if another task can run: if the
> preempting task enters userspace then the vector registers are needed
> for its use, which is tricky to arrange if the registers are currently
> in use by context switch logic running in the first task.
>
> My current feeling is that we probably have to stick with
> preempt_disable() here, but hopefully we can get rid of
> local_bh_disable() (as proposed) with no ill effects...
>
> Does that sound sensible?
>
> Cheers
> ---Dave


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-13 Thread Dave Martin
On Wed, Feb 13, 2019 at 03:30:29PM +0100, Sebastian Andrzej Siewior wrote:
> On 2019-02-08 16:55:13 [+], Julien Grall wrote:
> > When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
> > the kernel may be able to use FPSIMD/SVE. This is for instance the case
> > for crypto code.
> > 
> > Any use of FPSIMD/SVE in the kernel are clearly marked by using the
> > function kernel_neon_{begin, end}. Furthermore, this can only be used
> > when may_use_simd() returns true.
> 
> This is equal what x86 is currently doing. The naming is slightly
> different, there is irq_fpu_usable().

Yes, I think it's basically the same idea.

It's been evolving a bit on both sides, but is quite similar now.

> > The current implementation of may_use_simd() allows softirq to use
> > FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true).
> > When in used, softirqs usually fallback to a software method.
> > 
> > At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled
> > when touching the FPSIMD/SVE context. This has the drawback to disable
> > all softirqs even if they are not using FPSIMD/SVE.
> 
> Is this bad? This means also that your crypto code will not be
> interrupted by a softirq. Also if you would get rid of it, you could
> avoid the software fallback in case may_use_simd() says false.

Masking softirqs during kernel_neon_begin()..._end() is unlikely to be a
huge problem, but currently we block softirq during all context switch
operations that act on the CPU vector registers.

The reasons for this are somewhat historical, and IIRC predated the
requirement for softirq users of kernel-mode NEON to include the
may_use_simd() check and implement a fallback path on arm64.

Now that softirq code is required to work around kernel-mode NEON being
temporarily unusable, masking softirqs completely during context switch
etc. should no longer be necessary.

> > As a softirq should not rely on been able to use simd at a given time,
> > there are limited reason to keep softirq disabled when touching the
> > FPSIMD/SVE context. Instead, we can only disable preemption and tell
> > the NEON unit is currently in use.
> > 
> > This patch introduces two new helpers kernel_neon_{disable, enable} to
> > mark the area using FPSIMD/SVE context and use them in replacement of
> > local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are
> > also re-implemented to use the new helpers.
> > 
> > Signed-off-by: Julien Grall 
> > 
> > ---
> > 
> > I have been exploring this solution as an alternative approach to the RT
> > patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()".
> > 
> > So far, the patch has only been lightly tested.
> > 
> > For RT-linux, it might be possible to use migrate_{enable, disable}. I
> > am quite new with RT and have some trouble to understand the semantics
> > of migrate_{enable, disable}. So far, I am still unsure if it is possible
> > to run another userspace task on the same CPU while getting preempted
> > when the migration is disabled.
> 
> In RT:
> - preemt_disable() is the same as !RT. A thread can not be suspend. An
>   interrupt may interrupt. However on RT we have threaded interrupts so
>   the interrupt is limited to the first-level handler (not the threaded
>   handler).
> 
> - migrate_disable() means that the thread can not be moved to another
>   CPU. It can be suspended.
> 
> - local_bh_disable() disables the BH: No softirq can run. In RT
>   local_bh_disable() does not inherit preempt_disable(). Two different
>   softirqs can be executed in parallel.
>   The BH is usually invoked at the end of the threaded interrupt
>   (because the threaded interrupt handler raises the softirq). It can
>   also run in the ksoftirqd.
> 
> Usually you should not get preempted in a migrate_disable() section. A
> SCHED_OTHER task should not interrupt another SCHED_OTHER task in a
> migrate_disable() section. A task with a higher priority (a RT/DL task)
> will. Since threaded interrupts run with a RT priority of 50, they will
> interrupt your task in a migrate_disable() section.

"Usually" is probably not good enough if another task can run: if the
preempting task enters userspace then the vector registers are needed
for its use, which is tricky to arrange if the registers are currently
in use by context switch logic running in the first task.

My current feeling is that we probably have to stick with
preempt_disable() here, but hopefully we can get rid of
local_bh_disable() (as proposed) with no ill effects...

Does that sound sensible?

Cheers
---Dave


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-13 Thread Sebastian Andrzej Siewior
On 2019-02-08 16:55:13 [+], Julien Grall wrote:
> When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
> the kernel may be able to use FPSIMD/SVE. This is for instance the case
> for crypto code.
> 
> Any use of FPSIMD/SVE in the kernel are clearly marked by using the
> function kernel_neon_{begin, end}. Furthermore, this can only be used
> when may_use_simd() returns true.

This is equal what x86 is currently doing. The naming is slightly
different, there is irq_fpu_usable().

> The current implementation of may_use_simd() allows softirq to use
> FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true).
> When in used, softirqs usually fallback to a software method.
> 
> At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled
> when touching the FPSIMD/SVE context. This has the drawback to disable
> all softirqs even if they are not using FPSIMD/SVE.

Is this bad? This means also that your crypto code will not be
interrupted by a softirq. Also if you would get rid of it, you could
avoid the software fallback in case may_use_simd() says false.

> As a softirq should not rely on been able to use simd at a given time,
> there are limited reason to keep softirq disabled when touching the
> FPSIMD/SVE context. Instead, we can only disable preemption and tell
> the NEON unit is currently in use.
> 
> This patch introduces two new helpers kernel_neon_{disable, enable} to
> mark the area using FPSIMD/SVE context and use them in replacement of
> local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are
> also re-implemented to use the new helpers.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> 
> I have been exploring this solution as an alternative approach to the RT
> patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()".
> 
> So far, the patch has only been lightly tested.
> 
> For RT-linux, it might be possible to use migrate_{enable, disable}. I
> am quite new with RT and have some trouble to understand the semantics
> of migrate_{enable, disable}. So far, I am still unsure if it is possible
> to run another userspace task on the same CPU while getting preempted
> when the migration is disabled.

In RT:
- preemt_disable() is the same as !RT. A thread can not be suspend. An
  interrupt may interrupt. However on RT we have threaded interrupts so
  the interrupt is limited to the first-level handler (not the threaded
  handler).

- migrate_disable() means that the thread can not be moved to another
  CPU. It can be suspended.

- local_bh_disable() disables the BH: No softirq can run. In RT
  local_bh_disable() does not inherit preempt_disable(). Two different
  softirqs can be executed in parallel.
  The BH is usually invoked at the end of the threaded interrupt
  (because the threaded interrupt handler raises the softirq). It can
  also run in the ksoftirqd.

Usually you should not get preempted in a migrate_disable() section. A
SCHED_OTHER task should not interrupt another SCHED_OTHER task in a
migrate_disable() section. A task with a higher priority (a RT/DL task)
will. Since threaded interrupts run with a RT priority of 50, they will
interrupt your task in a migrate_disable() section.


Sebastian


Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-02-12 Thread Julia Cartwright
Hello Julien-

On Fri, Feb 08, 2019 at 04:55:13PM +, Julien Grall wrote:
> When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
> the kernel may be able to use FPSIMD/SVE. This is for instance the case
> for crypto code.
> 
> Any use of FPSIMD/SVE in the kernel are clearly marked by using the
> function kernel_neon_{begin, end}. Furthermore, this can only be used
> when may_use_simd() returns true.
> 
> The current implementation of may_use_simd() allows softirq to use
> FPSIMD/SVE unless it is currently in used (i.e kernel_neon_busy is true).
> When in used, softirqs usually fallback to a software method.
> 
> At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled
> when touching the FPSIMD/SVE context. This has the drawback to disable
> all softirqs even if they are not using FPSIMD/SVE.
> 
> As a softirq should not rely on been able to use simd at a given time,
> there are limited reason to keep softirq disabled when touching the
> FPSIMD/SVE context. Instead, we can only disable preemption and tell
> the NEON unit is currently in use.
> 
> This patch introduces two new helpers kernel_neon_{disable, enable} to
> mark the area using FPSIMD/SVE context and use them in replacement of
> local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are
> also re-implemented to use the new helpers.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> 
> I have been exploring this solution as an alternative approach to the RT
> patch "arm64: fpsimd: use preemp_disable in addition to local_bh_disable()".
> 
> So far, the patch has only been lightly tested.
> 
> For RT-linux, it might be possible to use migrate_{enable, disable}. I
> am quite new with RT and have some trouble to understand the semantics
> of migrate_{enable, disable}. So far, I am still unsure if it is possible
> to run another userspace task on the same CPU while getting preempted
> when the migration is disabled.

Yes, a thread in a migrate_disable()-protected critical section can be
preempted, and another thread on the same CPU may enter the critical
section.

If it's necessary to prevent this from happening, then you need to also
make use of a per-CPU mutex.  On RT, this would do the right thing
w.r.t. priority inheritance.

   Julia