[PATCH 0/2] efi: Allow to auto-disable it on RT

2019-08-16 Thread Sebastian Andrzej Siewior
Hi,

since we have CONFIG_PREEMPT_RT these two patches allow to auto-disable
EFI runtime services on RT while it is still possible to override it and
use it if needed.

Sebastian



[PATCH 1/2] efi: Allow efi=runtime

2019-08-16 Thread Sebastian Andrzej Siewior
In case the option "efi=noruntime" is default at built-time, the user
could overwrite its state by `efi=runtime' and allow it again.

Acked-by: Ard Biesheuvel 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/firmware/efi/efi.c |3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -114,6 +114,9 @@ static int __init parse_efi_cmdline(char
if (parse_option_str(str, "noruntime"))
disable_runtime = true;
 
+   if (parse_option_str(str, "runtime"))
+   disable_runtime = false;
+
return 0;
 }
 early_param("efi", parse_efi_cmdline);


[PATCH 2/2] efi: Disable runtime services on RT

2019-08-16 Thread Sebastian Andrzej Siewior
Based on meassurements the EFI functions get_variable /
get_next_variable take up to 2us which looks okay.
The functions get_time, set_time take around 10ms. Those 10ms are too
much. Even one ms would be too much.
Ard mentioned that SetVariable might even trigger larger latencies if
the firware will erase flash blocks on NOR.

The time-functions are used by efi-rtc and can be triggered during
runtimed (either via explicit read/write or ntp sync).

The variable write could be used by pstore.
These functions can be disabled without much of a loss. The poweroff /
reboot hooks may be provided by PSCI.

Disable EFI's runtime wrappers.

This was observed on "EFI v2.60 by SoftIron Overdrive 1000".

Acked-by: Ard Biesheuvel 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/firmware/efi/efi.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -88,7 +88,7 @@ struct mm_struct efi_mm = {
 
 struct workqueue_struct *efi_rts_wq;
 
-static bool disable_runtime;
+static bool disable_runtime = IS_ENABLED(CONFIG_PREEMPT_RT);
 static int __init setup_noefi(char *arg)
 {
disable_runtime = true;


Re: [PATCH] pstore: Convert buf_lock to semaphore

2018-12-04 Thread Sebastian Andrzej Siewior
On 2018-12-04 09:23:13 [-0800], Kees Cook wrote:
> Okay, so, if kmsg_dump() uses rcu_read_lock(), that means efi-pstore
> can _never_ sleep, and it's nothing to do with pstore internals. :( I
> guess we just hard-code it, then? And efi-pstore should probably only
> attach to pstore if it has a nonblock implementation (and warn if one
> isn't available).

I was about to suggest that. I am not aware if anything else could use
efi_pstore_write() use that but otherwise you could hardcode it.

> -Kees
> 
Sebastian


Re: [PATCH] pstore: Convert buf_lock to semaphore

2018-12-04 Thread Sebastian Andrzej Siewior
On 2018-11-30 14:47:36 [-0800], Kees Cook wrote:
> diff --git a/drivers/firmware/efi/efi-pstore.c 
> b/drivers/firmware/efi/efi-pstore.c
> index cfe87b465819..0f7d97917197 100644
> --- a/drivers/firmware/efi/efi-pstore.c
> +++ b/drivers/firmware/efi/efi-pstore.c
> @@ -259,8 +259,7 @@ static int efi_pstore_write(struct pstore_record *record)
>   efi_name[i] = name[i];
>  
>   ret = efivar_entry_set_safe(efi_name, vendor, PSTORE_EFI_ATTRIBUTES,
> -   !pstore_cannot_block_path(record->reason),
> -   record->size, record->psi->buf);
> +   preemptible(), record->size, record->psi->buf);

Well. Better I think.
might_sleep() / preempt_count_equals() checks for preemptible() + 
rcu_preempt_depth().
kmsg_dump() starts with rcu_read_lock() which means with this patch applied I
got:

| BUG: sleeping function called from invalid context at 
kernel/sched/completion.c:99
| in_atomic(): 0, irqs_disabled(): 0, pid: 2286, name: sig-xstate-bum PC: 0 
RCU: 1
| Preemption disabled at:
| [] __queue_work+0x95/0x440
| CPU: 30 PID: 2286 Comm: sig-xstate-bum Tainted: G  D   
4.20.0-rc3+ #90
| Call Trace:
|  dump_stack+0x4f/0x6a
|  ___might_sleep.cold.91+0xef/0x100
|  __might_sleep+0x50/0x90
|  wait_for_completion+0x32/0x130
|  virt_efi_query_variable_info+0x14e/0x160
|  efi_query_variable_store+0x51/0x1a0
|  efivar_entry_set_safe+0xa3/0x1b0
|  efi_pstore_write+0x110/0x140
|  pstore_dump+0x114/0x320
|  kmsg_dump+0xa4/0xd0
|  oops_exit+0x7f/0x90
|  oops_end+0x67/0xd0
|  die+0x41/0x4a
|  do_general_protection+0xc1/0x150
|  general_protection+0x1e/0x30
| RIP: 0010:__fpu__restore_sig+0x1c1/0x540

just in case you wonder why both counter are zero and it still creates
this backtrace.

>   if (record->reason == KMSG_DUMP_OOPS)
>   efivar_run_worker();
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index 2387cb74f729..afdfd3687f94 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -400,23 +401,20 @@ static void pstore_dump(struct kmsg_dumper *dumper,
>   unsigned long   total = 0;
>   const char  *why;
>   unsigned intpart = 1;
> - unsigned long   flags = 0;
> - int is_locked;
>   int ret;
>  
>   why = get_reason_str(reason);
>  
> - if (pstore_cannot_block_path(reason)) {
> - is_locked = spin_trylock_irqsave(&psinfo->buf_lock, flags);
> - if (!is_locked) {
> - pr_err("pstore dump routine blocked in %s path, may 
> corrupt error record\n"
> -, in_nmi() ? "NMI" : why);
> + if (down_trylock(&psinfo->buf_lock)) {
> + /* Failed to acquire lock: give up if we cannot wait. */
> + if (pstore_cannot_wait(reason)) {
> + pr_err("dump skipped in %s path: may corrupt error 
> record\n",
> + in_nmi() ? "NMI" : why);
>   return;
>   }
> - } else {
> - spin_lock_irqsave(&psinfo->buf_lock, flags);
> - is_locked = 1;
> + down_interruptible(&psinfo->buf_lock);

 In function ‘pstore_dump’:
fs/pstore/platform.c:393:3: warning: ignoring return value of 
‘down_interruptible’, declared with attribute warn_unused_result 
[-Wunused-result]
   down_interruptible(&psinfo->buf_lock);
   ^

>   }

Sebastian


Re: [tip:x86/fpu] x86/fpu: Don't export __kernel_fpu_{begin,end}()

2018-12-04 Thread Sebastian Andrzej Siewior
On 2018-12-03 23:08:41 [+0100], Borislav Petkov wrote:
> On Mon, Dec 03, 2018 at 10:12:19PM +0100, Ard Biesheuvel wrote:
> > > + * Using the FPU in hardirq is not allowed.
> > 
> > According to the documentation in x86/kernel/fpu/core.c, this is not
> > true. So which one is accurate?
> 
> I think you mean the irq from user mode... Yap, we do allow that.
> 
> Sebastian?

Do you refer to
| *   - by IRQ context code to potentially use the FPU
| * if it's unused.
  
? It is possible to use the FPU in IRQ context.
The FPU could be used in user-context surrounded by kernel_fpu_begin().
This only disables preemption so an IRQ could interrupt it. This IRQ
could then use the FPU or raise a SoftIRQ which would use it.
Therefore on x86 it is required to check with irq_fpu_usable() if the
FPU can be used. If the FPU can not be used, you have to implement
fallback code.

With the "restore FPU on return to userland" series we need to modify
the FPU in a few places. The softirq and preemption is disabled. I
didn't find any in-IRQ users.
Going forward I would like to remove the in-IRQ part and
irq_fpu_usable() and disable softirq as part of kernel_fpu_begin().

> Thx.

Sebastian


[tip:x86/fpu] x86/fpu: Don't export __kernel_fpu_{begin,end}()

2018-12-04 Thread tip-bot for Sebastian Andrzej Siewior
Commit-ID:  12209993e98c5fa1855c467f22a24e3d5b8be205
Gitweb: https://git.kernel.org/tip/12209993e98c5fa1855c467f22a24e3d5b8be205
Author: Sebastian Andrzej Siewior 
AuthorDate: Thu, 29 Nov 2018 16:02:10 +0100
Committer:  Borislav Petkov 
CommitDate: Tue, 4 Dec 2018 12:37:28 +0100

x86/fpu: Don't export __kernel_fpu_{begin,end}()

There is one user of __kernel_fpu_begin() and before invoking it,
it invokes preempt_disable(). So it could invoke kernel_fpu_begin()
right away. The 32bit version of arch_efi_call_virt_setup() and
arch_efi_call_virt_teardown() does this already.

The comment above *kernel_fpu*() claims that before invoking
__kernel_fpu_begin() preemption should be disabled and that KVM is a
good example of doing it. Well, KVM doesn't do that since commit

  f775b13eedee2 ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")

so it is not an example anymore.

With EFI gone as the last user of __kernel_fpu_{begin|end}(), both can
be made static and not exported anymore.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Borislav Petkov 
Reviewed-by: Rik van Riel 
Cc: "H. Peter Anvin" 
Cc: "Jason A. Donenfeld" 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Dave Hansen 
Cc: Ingo Molnar 
Cc: Nicolai Stange 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Thomas Gleixner 
Cc: kvm ML 
Cc: linux-efi 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20181129150210.2k4mawt37ow6c...@linutronix.de
---
 arch/x86/include/asm/efi.h |  6 ++
 arch/x86/include/asm/fpu/api.h | 15 +--
 arch/x86/kernel/fpu/core.c |  6 ++
 3 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index eea40d52ca78..45864898f7e5 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -82,8 +82,7 @@ struct efi_scratch {
 #define arch_efi_call_virt_setup() \
 ({ \
efi_sync_low_kernel_mappings(); \
-   preempt_disable();  \
-   __kernel_fpu_begin();   \
+   kernel_fpu_begin(); \
firmware_restrict_branch_speculation_start();   \
\
if (!efi_enabled(EFI_OLD_MEMMAP))   \
@@ -99,8 +98,7 @@ struct efi_scratch {
efi_switch_mm(efi_scratch.prev_mm); \
\
firmware_restrict_branch_speculation_end(); \
-   __kernel_fpu_end(); \
-   preempt_enable();   \
+   kernel_fpu_end();   \
 })
 
 extern void __iomem *__init efi_ioremap(unsigned long addr, unsigned long size,
diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index a9caac9d4a72..b56d504af654 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -12,17 +12,12 @@
 #define _ASM_X86_FPU_API_H
 
 /*
- * Careful: __kernel_fpu_begin/end() must be called with preempt disabled
- * and they don't touch the preempt state on their own.
- * If you enable preemption after __kernel_fpu_begin(), preempt notifier
- * should call the __kernel_fpu_end() to prevent the kernel/user FPU
- * state from getting corrupted. KVM for example uses this model.
- *
- * All other cases use kernel_fpu_begin/end() which disable preemption
- * during kernel FPU usage.
+ * Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
+ * disables preemption so be careful if you intend to use it for long periods
+ * of time.
+ * If you intend to use the FPU in softirq you need to check first with
+ * irq_fpu_usable() if it is possible.
  */
-extern void __kernel_fpu_begin(void);
-extern void __kernel_fpu_end(void);
 extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
 extern bool irq_fpu_usable(void);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2ea85b32421a..2e5003fef51a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -93,7 +93,7 @@ bool irq_fpu_usable(void)
 }
 EXPORT_SYMBOL(irq_fpu_usable);
 
-void __kernel_fpu_begin(void)
+static void __kernel_fpu_begin(void)
 {
struct fpu *fpu = ¤t->thread.fpu;
 
@@ -111,9 +111,8 @@ void __kernel_fpu_begin(void)
__cpu_invalidate_fpregs_state();
}
 }
-EXPORT_SYMBOL(__kernel_fpu_begin);
 
-void __kernel_fpu_end(void)
+static void __kernel_fpu_end(void)
 {
struct fpu *fpu = ¤t->thread.fpu;
 
@@ -122,7 +121,6 @@ void __kernel_fpu_end(void)
 
kernel_fpu_ena

[tip:x86/fpu] x86/fpu: Don't export __kernel_fpu_{begin,end}()

2018-12-03 Thread tip-bot for Sebastian Andrzej Siewior
Commit-ID:  7d79adb87fa79e4a4c59424fd5b5a922861fc5f6
Gitweb: https://git.kernel.org/tip/7d79adb87fa79e4a4c59424fd5b5a922861fc5f6
Author: Sebastian Andrzej Siewior 
AuthorDate: Thu, 29 Nov 2018 16:02:10 +0100
Committer:  Borislav Petkov 
CommitDate: Mon, 3 Dec 2018 19:37:27 +0100

x86/fpu: Don't export __kernel_fpu_{begin,end}()

There is one user of __kernel_fpu_begin() and before invoking it,
it invokes preempt_disable(). So it could invoke kernel_fpu_begin()
right away. The 32bit version of arch_efi_call_virt_setup() and
arch_efi_call_virt_teardown() does this already.

The comment above *kernel_fpu*() claims that before invoking
__kernel_fpu_begin() preemption should be disabled and that KVM is a
good example of doing it. Well, KVM doesn't do that since commit

  f775b13eedee2 ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")

so it is not an example anymore.

With EFI gone as the last user of __kernel_fpu_{begin|end}(), both can
be made static and not exported anymore.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Borislav Petkov 
Reviewed-by: Rik van Riel 
Cc: "H. Peter Anvin" 
Cc: "Jason A. Donenfeld" 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Dave Hansen 
Cc: Ingo Molnar 
Cc: Nicolai Stange 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Thomas Gleixner 
Cc: kvm ML 
Cc: linux-efi 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20181129150210.2k4mawt37ow6c...@linutronix.de
---
 arch/x86/include/asm/efi.h |  6 ++
 arch/x86/include/asm/fpu/api.h | 16 ++--
 arch/x86/kernel/fpu/core.c |  6 ++
 3 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index eea40d52ca78..45864898f7e5 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -82,8 +82,7 @@ struct efi_scratch {
 #define arch_efi_call_virt_setup() \
 ({ \
efi_sync_low_kernel_mappings(); \
-   preempt_disable();  \
-   __kernel_fpu_begin();   \
+   kernel_fpu_begin(); \
firmware_restrict_branch_speculation_start();   \
\
if (!efi_enabled(EFI_OLD_MEMMAP))   \
@@ -99,8 +98,7 @@ struct efi_scratch {
efi_switch_mm(efi_scratch.prev_mm); \
\
firmware_restrict_branch_speculation_end(); \
-   __kernel_fpu_end(); \
-   preempt_enable();   \
+   kernel_fpu_end();   \
 })
 
 extern void __iomem *__init efi_ioremap(unsigned long addr, unsigned long size,
diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index a9caac9d4a72..e368d8d94ca6 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -12,17 +12,13 @@
 #define _ASM_X86_FPU_API_H
 
 /*
- * Careful: __kernel_fpu_begin/end() must be called with preempt disabled
- * and they don't touch the preempt state on their own.
- * If you enable preemption after __kernel_fpu_begin(), preempt notifier
- * should call the __kernel_fpu_end() to prevent the kernel/user FPU
- * state from getting corrupted. KVM for example uses this model.
- *
- * All other cases use kernel_fpu_begin/end() which disable preemption
- * during kernel FPU usage.
+ * Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
+ * disables preemption so be careful if you intend to use it for long periods
+ * of time.
+ * If you intend to use the FPU in softirq you need to check first with
+ * irq_fpu_usable() if it is possible.
+ * Using the FPU in hardirq is not allowed.
  */
-extern void __kernel_fpu_begin(void);
-extern void __kernel_fpu_end(void);
 extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
 extern bool irq_fpu_usable(void);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2ea85b32421a..2e5003fef51a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -93,7 +93,7 @@ bool irq_fpu_usable(void)
 }
 EXPORT_SYMBOL(irq_fpu_usable);
 
-void __kernel_fpu_begin(void)
+static void __kernel_fpu_begin(void)
 {
struct fpu *fpu = ¤t->thread.fpu;
 
@@ -111,9 +111,8 @@ void __kernel_fpu_begin(void)
__cpu_invalidate_fpregs_state();
}
 }
-EXPORT_SYMBOL(__kernel_fpu_begin);
 
-void __kernel_fpu_end(void)
+static void __kernel_fpu_end(void)
 {
struct fpu *fpu = ¤t->thread.fpu;
 
@@ -122,7 +121,6 @@ void

Re: [PATCH] pstore: Convert buf_lock to semaphore

2018-12-03 Thread Sebastian Andrzej Siewior
On 2018-12-01 09:42:38 [+0100], Arnd Bergmann wrote:
> You are right that you can't take (or release) a mutex from interrupt
> context. However, I don't think converting a spinlock to a semaphore
> is going to help here either.

you can acquire a semaphore with a try_lock from interrupts context but
you can't do that with a mutex. You can also a acquire a semaphore in
one context and release in another. Not that I'm a fan of those things
but those two are often the reasons why people stick with a semaphore.

I haven't looked a general picture yet, will try to do so later today or
tomorrow.

> Arnd

Sebastian


Re: EFI-pstore, sleeping from back context after fault

2018-11-30 Thread Sebastian Andrzej Siewior
On 2018-11-29 13:43:58 [-0800], Kees Cook wrote:
> On Mon, Nov 26, 2018 at 9:04 AM Sebastian Andrzej Siewior
>  wrote:
> This bug got handled by Jann Horn, yes? (I remember seeing a related
> thread go by...)

Correct, fix sits in the tip tree. Nevertheless it was useful to spot
the other thing.

> > This looks like it comes from commit 21b3ddd39fee ("efi: Don't use
> > spinlocks for efi vars") because it replaced a spin_lock() with
> > down_interruptible() in this case. In this case, since pstore_dump()
> > holds psinfo->buf_lock with disabled interrupts we can't block…
> >
> > I have this as a workaround:
> 
> I'm not sure this is correct...
> 
> > diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
> > index 9336ffdf6e2c..d4dcfe46eb2e 100644
> > --- a/drivers/firmware/efi/vars.c
> > +++ b/drivers/firmware/efi/vars.c
> > @@ -755,7 +755,9 @@ int efivar_entry_set_safe(efi_char16_t *name, 
> > efi_guid_t vendor, u32 attributes,
> > return -EINTR;
> > }
> >
> > -   status = check_var_size(attributes, size + ucs2_strsize(name, 
> > 1024));
> > +   status = ops->query_variable_store(attributes,
> > +  size + ucs2_strsize(name, 1024),
> > +  block);
> 
> Why does this change help?

check_var_size() uses false as the argument for `block'. 
check_var_size_nonblocking uses true as the argument for `block'.
Doing this ops->… allows to pass the `block' argument as received from
caller. And the functions above already use the `block' argument. Plus
the next hunk makes sure that `block' is set properly.

> > if (status != EFI_SUCCESS) {
> > up(&efivars_lock);
> > return -ENOSPC;
> > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> > index b821054ca3ed..9aa27a2c8d36 100644
> > --- a/fs/pstore/platform.c
> > +++ b/fs/pstore/platform.c
> > @@ -127,10 +127,10 @@ static const char *get_reason_str(enum 
> > kmsg_dump_reason reason)
> >  bool pstore_cannot_block_path(enum kmsg_dump_reason reason)
> >  {
> > /*
> > -* In case of NMI path, pstore shouldn't be blocked
> > -* regardless of reason.
> > +* In case of disabled preemption / interrupts we can't block on any
> > +* lock regardless of reason.
> >  */
> > -   if (in_nmi())
> > +   if (in_atomic() || irqs_disabled())
> > return true;
> >
> > switch (reason) {
> 
> I'd like to avoid any case where pstore might "miss" a crash report,
> though... this seems to make it easier for it to fail to record a
> crash, yes?

Well, if the lock is occupied, yes. But waiting for the completion with
disabled interrupts isn't much better :)

pstore_cannot_block_path() is used in two places. One caller is taking a
spin_lock(). So that one could keep using the "old" function.

The other caller uses it before acquiring the semaphore. The non-try
version can't be used with preemption or interrupts disabled.

You could split those two cases and let the sempaphire user use the
"(in_atomic() || irqs_disabled())" from above.

Be aware that once you hold the spinlock you can't block and must do the
try_lock for the semaphore. Or switch it back to the spinlock and spin…

> I have some pending clean-ups in this area (buf_lock may not be needed
> at all, actually), so I'll try to work through those too.

Okay. But could we please fix this for stable somehow, too?

> -Kees

Sebastian


EFI-pstore, sleeping from back context after fault

2018-11-26 Thread Sebastian Andrzej Siewior
So I triggered a bug in x86 code. First the "okay" backtrace:
|BUG: GPF in non-whitelisted uaccess (non-canonical address?)
|general protection fault:  [#1] PREEMPT SMP NOPTI
|CPU: 26 PID: 2236 Comm: sig-xstate-bum Not tainted 4.20.0-rc3 #45
|RIP: 0010:__fpu__restore_sig+0x1c1/0x540
|Call Trace:
| fpu__restore_sig+0x28/0x40
| restore_sigcontext+0x13a/0x180
| __ia32_sys_rt_sigreturn+0xae/0x100
| do_syscall_64+0x4f/0x100
| entry_SYSCALL_64_after_hwframe+0x44/0xa9
|RIP: 0033:0x7f9b06aea227
|---[ end trace a45ac23b593e9ab0 ]---

and now the not okay part:

|BUG: sleeping function called from invalid context at 
kernel/sched/completion.c:99
|in_atomic(): 1, irqs_disabled(): 1, pid: 2236, name: sig-xstate-bum
|Preemption disabled at:
|[] pstore_dump+0x72/0x330
|CPU: 26 PID: 2236 Comm: sig-xstate-bum Tainted: G  D   4.20.0-rc3 
#45
|Call Trace:
| dump_stack+0x4f/0x6a
| ___might_sleep.cold.91+0xd3/0xe4
| __might_sleep+0x50/0x90
| wait_for_completion+0x32/0x130
| virt_efi_query_variable_info+0x14e/0x160
| efi_query_variable_store+0x51/0x1a0
| efivar_entry_set_safe+0xa3/0x1b0
| efi_pstore_write+0x109/0x140
| pstore_dump+0x11c/0x330
| kmsg_dump+0xa4/0xd0
| oops_exit+0x22/0x30
| oops_end+0x67/0xd0
| die+0x41/0x4a
| do_general_protection+0xc1/0x150
| general_protection+0x1e/0x30
|RIP: 0010:__fpu__restore_sig+0x1c1/0x540
| fpu__restore_sig+0x28/0x40
| restore_sigcontext+0x13a/0x180
| __ia32_sys_rt_sigreturn+0xae/0x100
| do_syscall_64+0x4f/0x100
| entry_SYSCALL_64_after_hwframe+0x44/0xa9
|RIP: 0033:0x7f9b06aea227
… [ moar backtrace ]

This looks like it comes from commit 21b3ddd39fee ("efi: Don't use
spinlocks for efi vars") because it replaced a spin_lock() with
down_interruptible() in this case. In this case, since pstore_dump()
holds psinfo->buf_lock with disabled interrupts we can't block…

I have this as a workaround:

diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
index 9336ffdf6e2c..d4dcfe46eb2e 100644
--- a/drivers/firmware/efi/vars.c
+++ b/drivers/firmware/efi/vars.c
@@ -755,7 +755,9 @@ int efivar_entry_set_safe(efi_char16_t *name, efi_guid_t 
vendor, u32 attributes,
return -EINTR;
}
 
-   status = check_var_size(attributes, size + ucs2_strsize(name, 1024));
+   status = ops->query_variable_store(attributes,
+  size + ucs2_strsize(name, 1024),
+  block);
if (status != EFI_SUCCESS) {
up(&efivars_lock);
return -ENOSPC;
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index b821054ca3ed..9aa27a2c8d36 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -127,10 +127,10 @@ static const char *get_reason_str(enum kmsg_dump_reason 
reason)
 bool pstore_cannot_block_path(enum kmsg_dump_reason reason)
 {
/*
-* In case of NMI path, pstore shouldn't be blocked
-* regardless of reason.
+* In case of disabled preemption / interrupts we can't block on any
+* lock regardless of reason.
 */
-   if (in_nmi())
+   if (in_atomic() || irqs_disabled())
return true;
 
switch (reason) {

Sebastian


Re: [PATCH RT] rtc: Disable RTC_DRV_EFI on RT

2018-07-26 Thread Sebastian Andrzej Siewior
On 2018-07-26 18:01:47 [+0200], Ard Biesheuvel wrote:
> Yes, that's what I was thinking. This way, you can still reboot into
> the same kernel occasionally with EFI runtime services enabled to,
> e.g., use efibootmgr.
> 
> Acked-by: Ard Biesheuvel 
> 
> for both patches if you queue them in the -rt tree. If you want them
> in mainline, please resend them as proper patches once
> CONFIG_PREEMPT_RT_BASE has been declared in mainline as well.
Okay, thank you.

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RT] rtc: Disable RTC_DRV_EFI on RT

2018-07-26 Thread Sebastian Andrzej Siewior
On 2018-07-26 15:13:23 [+0200], To Ard Biesheuvel wrote:
> On 2018-07-26 14:52:21 [+0200], Ard Biesheuvel wrote:
> > We could also make it the default on -rt, but not disable it entirely, so 
> > that efi=runtime can be used to re-enable it.
> 
> Oh. I like that. We have something similar for RCU. So I would need
> that:
and then I could make it default off:

- >8

Subject: [PATCH] efi: Disable runtime services on RT

Based on meassurements the EFI functions get_variable /
get_next_variable take up to 2us which looks okay.
The functions get_time, set_time take around 10ms. Those 10ms are too
much. Even one ms would be too much.
Ard mentioned that SetVariable might even trigger larger latencies if
the firware will erase flash blocks on NOR.

The time-functions are used by efi-rtc and can be triggered during
runtimed (either via explicit read/write or ntp sync).

The variable write could be used by pstore.
These functions can be disabled without much of a loss. The poweroff /
reboot hooks may be provided by PSCI.

Disable EFI's runtime wrappers.

This was observed on "EFI v2.60 by SoftIron Overdrive 1000".

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/firmware/efi/efi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 232f4915223b..62c6e4b6ce3e 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -84,7 +84,7 @@ struct mm_struct efi_mm = {
.mmlist = LIST_HEAD_INIT(efi_mm.mmlist),
 };
 
-static bool disable_runtime;
+static bool disable_runtime = IS_ENABLED(CONFIG_PREEMPT_RT_BASE);
 static int __init setup_noefi(char *arg)
 {
disable_runtime = true;
-- 
2.18.0

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RT] rtc: Disable RTC_DRV_EFI on RT

2018-07-26 Thread Sebastian Andrzej Siewior
On 2018-07-26 14:52:21 [+0200], Ard Biesheuvel wrote:
> We could also make it the default on -rt, but not disable it entirely, so 
> that efi=runtime can be used to re-enable it.

Oh. I like that. We have something similar for RCU. So I would need
that:

--- >8

Subject: [PATCH] efi: Allow efi=runtime

In case the option "efi=noruntime" is default at built-time, the user
could overwrite its sate by `efi=runtime' and allow it again.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/firmware/efi/efi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 62c6e4b6ce3e..d6176ce50b45 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -110,6 +110,9 @@ static int __init parse_efi_cmdline(char *str)
if (parse_option_str(str, "noruntime"))
disable_runtime = true;
 
+   if (parse_option_str(str, "runtime"))
+   disable_runtime = false;
+
return 0;
 }
 early_param("efi", parse_efi_cmdline);
-- 
2.18.0

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RT] rtc: Disable RTC_DRV_EFI on RT

2018-07-26 Thread Sebastian Andrzej Siewior
On 2018-07-26 14:26:25 [+0200], Ard Biesheuvel wrote:
> What i mean is whatever you end up with if you pass efi=noruntime to the 
> kernel.
ah. So I wouldn't have to patch this, just document it. That might work.

> But as i mentioned before, you may also lose the ability to reboot/shut down 
> your system this way. Can you please double check?

I checked before posting the patch. psci is used instead (on softiron
here, have nothing else). psci_sys_poweroff() works.

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RT] rtc: Disable RTC_DRV_EFI on RT

2018-07-26 Thread Sebastian Andrzej Siewior
On 2018-07-26 11:15:52 [+0200], Ard Biesheuvel wrote:
> On 26 July 2018 at 11:04, Sebastian Andrzej Siewior
>  wrote:
> > Based on measurements the EFI functions get_variable /
> > get_next_variable take up to 2us. The functions get_time, set_time take
> > around 10ms. Those 10ms are too much. Even one ms would be too much.
> 
> You cannot extrapolate from these numbers. If the lack of worst case
> guarantees on an EFI system is a problem for -rt, the only meaningful
> course of action is to disable EFI runtime services entirely.

Oh boy. I disable EFI entirely because then the bootloader won't boot
due to missing efi-stub. But I could disable the runtime-wrappers if
that is what you meant. Patch below.

> I think SetVariable() may be even worse than GetTime(), given that NOR
> flash updates may involve erasing blocks.

Here we go. "Read" can be triggered from userland via sysfs / efivars.
"Write" is (currently) limited to pstore?

> > The funny part is that EFI is invoked with enabled interrupts.
> > According to my tracing I see the interrupt almost 10ms later which
> > indicates that EFI is disabling interrupts while doing its thing.
> 
> Yes, and this is also highly implementation specific. Basing this kind
> of policy on a single implementation is not very wise imo.

even if interrupts were not disabled then there would be no context
switch on exit from interrupt path due to the preempt-disable part. So
no improvement.

> > This was observed on "EFI v2.60 by SoftIron Overdrive 1000". I don't say
> > that every EFI implementation does this but given that it has to access a
> > slow bus like I2C/SPI it is expected.
> >
> > Disable EFI's RTC driver on RT.
> >
> 
> Other calls to GetTime() would still be permitted in this case, so
> this seems like a partial solution (although no other calls seem to
> exist atm)

I *assumed* pstore would use the SetVariable during warning/bug. The
pstore thingy might be useful. Not sure.

-- >8

Subject: [PATCH RT] arm64: Disable EFI runtime services on RT

Based on meassurements the EFI functions get_variable /
get_next_variable take up to 2us which looks okay.
The functions get_time, set_time take around 10ms. Those 10ms are too
much. Even one ms would be too much.
Ard mentioned that SetVariable might even trigger larger latencies if
the firware will erase flash blocks on NOR.

The time-functions are used by efi-rtc and can be triggered during
runtimed (either via explicit read/write or ntp sync).

The variable write could be used by pstore.
These functions can be disabled without much of a loss. The poweroff /
reboot hooks will be provided by PSCI.

Disable EFI's runtime wrappers.

This was observed on "EFI v2.60 by SoftIron Overdrive 1000".

Signed-off-by: Sebastian Andrzej Siewior 
---
 arch/arm64/Kconfig | 2 +-
 drivers/firmware/efi/arm-runtime.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 111db4e44fcf..68adcb0f4de6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1250,7 +1250,7 @@ config EFI
select LIBFDT
select UCS2_STRING
select EFI_PARAMS_FROM_FDT
-   select EFI_RUNTIME_WRAPPERS
+   select EFI_RUNTIME_WRAPPERS if !PREEMPT_RT_BASE
select EFI_STUB
select EFI_ARMSTUB
default y
diff --git a/drivers/firmware/efi/arm-runtime.c 
b/drivers/firmware/efi/arm-runtime.c
index 5889cbea60b8..3e96db10c034 100644
--- a/drivers/firmware/efi/arm-runtime.c
+++ b/drivers/firmware/efi/arm-runtime.c
@@ -140,8 +140,10 @@ static int __init arm_enable_runtime_services(void)
}
 
/* Set up runtime services function pointers */
+#ifdef CONFIG_EFI_RUNTIME_WRAPPERS
efi_native_runtime_setup();
set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
+#endif
 
return 0;
 }
-- 
2.18.0

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RT] rtc: Disable RTC_DRV_EFI on RT

2018-07-26 Thread Sebastian Andrzej Siewior
Based on measurements the EFI functions get_variable /
get_next_variable take up to 2us. The functions get_time, set_time take
around 10ms. Those 10ms are too much. Even one ms would be too much.
The funny part is that EFI is invoked with enabled interrupts.
According to my tracing I see the interrupt almost 10ms later which
indicates that EFI is disabling interrupts while doing its thing.

This was observed on "EFI v2.60 by SoftIron Overdrive 1000". I don't say
that every EFI implementation does this but given that it has to access a
slow bus like I2C/SPI it is expected.

Disable EFI's RTC driver on RT.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/rtc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig
index a2ba5db36145..248d72711650 100644
--- a/drivers/rtc/Kconfig
+++ b/drivers/rtc/Kconfig
@@ -1087,7 +1087,7 @@ config RTC_DRV_DA9063
 
 config RTC_DRV_EFI
tristate "EFI RTC"
-   depends on EFI && !X86
+   depends on EFI && !X86 && !PREEMPT_RT_BASE
help
  If you say yes here you will get support for the EFI
  Real Time Clock.
-- 
2.18.0

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
efi_switch_mm() is a wrapper around switch_mm() which saves current's
->active_mm, sets the requests mm as ->active_mm and invokes
switch_mm().
I don't think that task_lock() is required during that procedure. It
protects ->mm which isn't changed here.

It needs to be mentioned that during the whole procedure (switch to
EFI's mm and back) the preemption needs to be disabled. A context switch
at this point would reset the cr3 value based on current->mm. Also, this
function may not be invoked at the same time on a different CPU because
it would overwrite the efi_scratch.prev_mm information.

Remove task_lock() and also update the comment to reflect it.

Signed-off-by: Sebastian Andrzej Siewior 
---
v1…v2:
- drop RFC
- don't remove assignment of ->active_mm (PeterZ)
- don't worry about concurrent invocations of the function, Ard
  said that the "mixed-mode" version has a lock now.

 arch/x86/platform/efi/efi_64.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 448267f1c073..6ab1fdeefa1a 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -619,18 +619,16 @@ void __init efi_dump_pagetable(void)
 
 /*
  * Makes the calling thread switch to/from efi_mm context. Can be used
- * for SetVirtualAddressMap() i.e. current->active_mm == init_mm as well
- * as during efi runtime calls i.e current->active_mm == current_mm.
- * We are not mm_dropping()/mm_grabbing() any mm, because we are not
- * losing/creating any references.
+ * in a kernel thread and user context. Preemption needs to remain disabled
+ * while the EFI-mm is borrowed. mmgrab()/mmdrop() is not used because the mm
+ * can not change under us.
+ * It should be ensured that there are no concurent calls to this function.
  */
 void efi_switch_mm(struct mm_struct *mm)
 {
-   task_lock(current);
efi_scratch.prev_mm = current->active_mm;
current->active_mm = mm;
switch_mm(efi_scratch.prev_mm, mm, NULL);
-   task_unlock(current);
 }
 
 #ifdef CONFIG_EFI_MIXED
-- 
2.18.0

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
On 2018-07-24 18:19:15 [+0200], Ard Biesheuvel wrote:
> > Regarding the workqueue in commit 3eb420e70d87 ("efi: Use a work queue
> > to invoke EFI Runtime Services"). The efi_call_virt_pointer() function
> > uses local_save_flags() while invoking the EFI function. Why does commit
> > message say "Since UEFI runtime services are typically invoked with
> > interrupts enabled,"?
> >
> 
> Because local_save_flags() does not disable interrupts??

now that you say so, it does make sense…

> > Anyway, I would still like to get rid of task_lock() in efi_switch_mm().
> > Any objections to that?
> 
> No, not at all.
okay.

> > If efi_switch_mm() is only invoked from kernel-thread (the mixed-mode
> > caller does not) the then you could use use_mm() / unuse_mm() instead.
> > Then I would be fine with task_lock() (but it would have to be moved to
> > the preemptible section (after efi_sync_low_kernel_mappings()).
> 
> There are some exceptions, unfortunately, although I don't think they
> matter to -rt: at early boot time and panic() time, alternative
> wrappers are used that bypass the work queue. The reset_system()
> service also bypasses the work queue but that does not really matter
> given the purpose of the call.
If you lose the task's real mm in panic() then it probably does not
matter since you are never going back to userland.
But the others might be invoked from user context as there are two
non-blocking calls. So no use_mm() then.

reset_system() service should not return but it may if a signal is
pending (still there is a fallback but wouldn't it make sense to drop
that interruptible part)? Also this might be called from
emergency_restart() / panic() right? So down() will open interrupts and
schedule() if the lock is contended.

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
On 2018-07-24 17:32:14 [+0200], Ard Biesheuvel wrote:
> Please refer to what has been queued up in tip:efi/core. Sai has
> implemented a work queue for EFI calls so they occur from a kernel
> thread, and the mixed mode locking has been fixed as well.

I see commit 83a0a2ea0b99 ("efi/x86: Prevent reentrant firmware calls in
mixed mode") which fixes the locking issue I mentioned. Will this make
its way to the current kernel?

Regarding the workqueue in commit 3eb420e70d87 ("efi: Use a work queue
to invoke EFI Runtime Services"). The efi_call_virt_pointer() function
uses local_save_flags() while invoking the EFI function. Why does commit
message say "Since UEFI runtime services are typically invoked with
interrupts enabled,"?

Anyway, I would still like to get rid of task_lock() in efi_switch_mm().
Any objections to that?
If efi_switch_mm() is only invoked from kernel-thread (the mixed-mode
caller does not) the then you could use use_mm() / unuse_mm() instead.
Then I would be fine with task_lock() (but it would have to be moved to
the preemptible section (after efi_sync_low_kernel_mappings()).

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
On 2018-07-24 17:00:09 [+0200], Peter Zijlstra wrote:
> On Tue, Jul 24, 2018 at 04:35:09PM +0200, Sebastian Andrzej Siewior wrote:
> > I doubt that there any need to set ->active_mm. It is used by the
> > scheduler to keep track of the "currently used mm" so it can reuse one
> > for the kernel thread which does not own one and take a reference on it
> > so it does not go away while the thread (that borrows it) is active.
> 
> >  void efi_switch_mm(struct mm_struct *mm)
> >  {
> > -   task_lock(current);
> > efi_scratch.prev_mm = current->active_mm;
> > -   current->active_mm = mm;
> > switch_mm(efi_scratch.prev_mm, mm, NULL);
> > -   task_unlock(current);
> >  }
> 
> I think that's broken. Take for instance stuff like
> perf_callchain_user32() -> get_segment_base(). That looks at active_mm
> to get at the current LDT.

right. I saw that briefly not sure why I dropped it. I have no idea
where the LDT points to but it probably sense to return EFI's version of
it.

> Now, I'm not saying the whole perf vs EFI thing isn't already terminally
> wrecked, but the rule is that active_mm really should point at the
> current active mm, and the above breaks that.
Right. Even if we not perform a context switch. Okay. Will update that
part.

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH] x86/efi: drop task_lock() from efi_switch_mm()

2018-07-24 Thread Sebastian Andrzej Siewior
During invocations of EFI functions efi_switch_mm() is used to set the
active mm to borrow a different MM for a while. This used for instance
by efi_call_virt_pointer():
efi_call_virt_pointer()
- arch_efi_call_virt_setup()
  - preempt_disable()
  - efi_switch_mm(&efi_mm)

- "EFI CALL"

- arch_efi_call_virt_teardown()
  - efi_switch_mm(efi_scratch.prev_mm)
  - preempt_enable()

efi_switch_mm() is a wrapper around switch_mm() which saves current's
->active_mm, sets the requests mm as ->active_mm and invokes
switch_mm().
I don't think that task_lock() is required during that procedure. It
protects ->mm which isn't changed here.
I doubt that there any need to set ->active_mm. It is used by the
scheduler to keep track of the "currently used mm" so it can reuse one
for the kernel thread which does not own one and take a reference on it
so it does not go away while the thread (that borrows it) is active.
The `->active_mm' that is used here needs to saved to we can use it in
the "ending" efi_switch_to() again.
Based on this, the ->active_mm assignment can go.
It needs to be mentioned that during the whole procedure (switch to
EFI's mm and back) the preemption needs to be disabled. A context switch
at this point would reset the cr3 value based on current->mm.

With this new information, also update the comment to reflect it.

The "efi_scratch.prev_mm" assignment looks racy. With two concurrent
efi.get_next_variable invocations, one would overrwrite the value from
the other. This however does not happen because "runtime-wrappers"
(virt_efi_get_next_variable()) use efi_runtime_lock during the
invocation. This is not the case for "efi_64"
(efi_thunk_get_next_variable()) which does not take any locks (or I
missed them). Maybe it would be better to let the caller save
->active_mm while the MM is borrowed.

Signed-off-by: Sebastian Andrzej Siewior 
---
 arch/x86/platform/efi/efi_64.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 77873ce700ae..64bf29462348 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -619,18 +619,16 @@ void __init efi_dump_pagetable(void)
 
 /*
  * Makes the calling thread switch to/from efi_mm context. Can be used
- * for SetVirtualAddressMap() i.e. current->active_mm == init_mm as well
- * as during efi runtime calls i.e current->active_mm == current_mm.
+ * in a kernel thread and user context. Preemption needs to remain disabled
+ * while the EFI-mm is borrowed.
  * We are not mm_dropping()/mm_grabbing() any mm, because we are not
- * losing/creating any references.
+ * losing/creating any references. Preemption must be disabled while the mm is
+ * switched.
  */
 void efi_switch_mm(struct mm_struct *mm)
 {
-   task_lock(current);
efi_scratch.prev_mm = current->active_mm;
-   current->active_mm = mm;
switch_mm(efi_scratch.prev_mm, mm, NULL);
-   task_unlock(current);
 }
 
 #ifdef CONFIG_EFI_MIXED
-- 
2.18.0

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] x86/mm, efi: Check for valid image type

2015-07-30 Thread Sebastian Andrzej Siewior
On 07/29/2015 06:41 PM, Josh Triplett wrote:

>> This is correct. However I miss the point of saving the image in the
>> first place. From what I see is that I have now 272 KiB in memory which
>> are never used again. Is there a usecase why we have it? From the code
>> it looks like we save it during boot and make it available later via
>> sysfs.
> 
> That's the point, yes.  A splash screen or an about box can then read it
> from there later and display it to the user.
> 
>  arch/x86/platform/efi/efi-bgrt.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/platform/efi/efi-bgrt.c 
> b/arch/x86/platform/efi/efi-bgrt.c
> index d7f997f7c26d..59710f0875bb 100644
> --- a/arch/x86/platform/efi/efi-bgrt.c
> +++ b/arch/x86/platform/efi/efi-bgrt.c
> @@ -79,6 +79,10 @@ void __init efi_bgrt_init(void)
>   memcpy_fromio(&bmp_header, image, sizeof(bmp_header));
>   if (ioremapped)
>   early_iounmap(image, sizeof(bmp_header));
> + if (bmp_header.id != 0x4d42) {
> + pr_err("BGRT: Not a valid BMP file.\n");
> + return;
> + }
>>>
>>> That's a good idea as an additional cross-check, but not a complete fix
>>> for the problem.
>>
>> But it is one additional check to make sure we look at proper data. The
>> (ACPI-table) header has an image type which is to BMP (the only
>> currently support image type) so this is the double check.
> 
> I agree completely with adding the check; I'm just saying it isn't a
> complete fix.

okay, so how do we continue here? You ack that one  and send the other
patch on top or do you want first that kexec patch and then the BMP
check?

>> And the
>> kernel should be able to read from any address as long as it is within
>> its DRAM.
> 
> Then what caused the oops on early_ioremap?

from the WARN_ON() in ioremap:

 /*
  * Mappings have to fit in the FIX_BTMAP area.
  */
 nrpages = size >> PAGE_SHIFT;
 if (WARN_ON(nrpages > NR_FIX_BTMAPS))
 return NULL;

This means the size of the ioremap is limited to 256KiB. So lets say we
get 300KiB as the image size: we managed to allocate say 300KiB via
kmalloc() and later we get the warn_on() here because we can't remap
more than 256KiB.

So we can ignore this until a BIOS includes a real image with a size >
256KiB. Another way around it would be get memblock to ignore this
region and give it back later.

> That sounds like a sensible fix; don't try to parse the BGRT if
> efi_setup.

okay.

> - Josh Triplett

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] x86/mm, efi: Check for valid image type

2015-07-29 Thread Sebastian Andrzej Siewior
On 07/29/2015 02:10 AM, j...@joshtriplett.org wrote:
>> On Wed, 22 Jul, at 05:32:44PM, Sebastian Andrzej Siewior wrote:
>>> now and then. The data behind that pointer changes on each boot because
>>> nobody preserves the content across kexec.
> 
> Right.  The kernel copies this image precisely because it may go away at
> ExitBootServices or similar, or when ACPI reclaim space is freed.  This
> ties into the various work about trying to make kexec handle EFI and
> ACPI.  Is there some way we can indicate to the kexec kernel that this
> won't work?  (Or fix it so it will?)

>From what I know kexec entry is transparent. Could you remove the table
from ACPI? There is no point on having this bitmap information now,
right? The last way out would be to patch kexec and let it read the
table from /sys and put it at the spot mentioned in the ACPI table.

>>> Signed-off-by: Sebastian Andrzej Siewior 
>>> ---
>>>
>>> I don't know much about the requirement of having the .bmp in memory all the
>>> time. Would it be a bad thing to compress the bmp and uncompress on cat from
>>> userland? In my case the bmp has 272 KiB and LZO gets it down to 12KiB,
>>> XZ 7.4KiB.
>>
>> The usual use for BGRT is to display it during kernel boot, so
>> interacting with userland doesn't help you there.
> 
> If we're going to be storing a large image, applying simple in-kernel
> compression doesn't seem unreasonable, if we then decompress it when
> read from the BGRT sysfs file.  That's entirely separate from this issue
> though.

This is correct. However I miss the point of saving the image in the
first place. From what I see is that I have now 272 KiB in memory which
are never used again. Is there a usecase why we have it? From the code
it looks like we save it during boot and make it available later via
sysfs.

>>>  arch/x86/platform/efi/efi-bgrt.c | 4 
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/arch/x86/platform/efi/efi-bgrt.c 
>>> b/arch/x86/platform/efi/efi-bgrt.c
>>> index d7f997f7c26d..59710f0875bb 100644
>>> --- a/arch/x86/platform/efi/efi-bgrt.c
>>> +++ b/arch/x86/platform/efi/efi-bgrt.c
>>> @@ -79,6 +79,10 @@ void __init efi_bgrt_init(void)
>>> memcpy_fromio(&bmp_header, image, sizeof(bmp_header));
>>> if (ioremapped)
>>> early_iounmap(image, sizeof(bmp_header));
>>> +   if (bmp_header.id != 0x4d42) {
>>> +   pr_err("BGRT: Not a valid BMP file.\n");
>>> +   return;
>>> +   }
> 
> That's a good idea as an additional cross-check, but not a complete fix
> for the problem.

But it is one additional check to make sure we look at proper data. The
(ACPI-table) header has an image type which is to BMP (the only
currently support image type) so this is the double check. And the
kernel should be able to read from any address as long as it is within
its DRAM.

>  As you pointed out above, a wild pointer could cause a
> WARN from early_ioremap.  We need to never follow the pointer in the
> first place after a kexec, unless we have some way to know that it's
> actually valid.

So you assume that the information from ACPI is always correct then?
The pointer is correct, the information it points to is no longer.

If we run always under EFI then it looks like the variable efi_setup
which is checked in efi_enter_virtual_mode() is 0 during normal boot
and != 0 on kexec entry. This hint is set via setup_data / SETUP_EFI
since commit 1fec053369 ("x86/efi: Pass necessary EFI data for kexec
via setup_data"). So maybe we could use this to check if we run under
kexec or not.

> - Josh Triplett

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] x86/mm, efi: Check for valid image type

2015-07-22 Thread Sebastian Andrzej Siewior
I usually see
|Ignoring BGRT: failed to allocate memory for image (wanted 264301314 bytes)
|Ignoring BGRT: failed to allocate memory for image (wanted 3925872891 bytes)

sometimes I get

|[ cut here ]
|WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:136 
__early_ioremap.constprop.0+0x113/0x1d3()
…
| [] __early_ioremap.constprop.0+0x113/0x1d3
| [] early_ioremap+0x13/0x15
| [] efi_bgrt_init+0x1e2/0x27d
…

now and then. The data behind that pointer changes on each boot because
nobody preserves the content across kexec.

Signed-off-by: Sebastian Andrzej Siewior 
---

I don't know much about the requirement of having the .bmp in memory all the
time. Would it be a bad thing to compress the bmp and uncompress on cat from
userland? In my case the bmp has 272 KiB and LZO gets it down to 12KiB,
XZ 7.4KiB.

 arch/x86/platform/efi/efi-bgrt.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/platform/efi/efi-bgrt.c b/arch/x86/platform/efi/efi-bgrt.c
index d7f997f7c26d..59710f0875bb 100644
--- a/arch/x86/platform/efi/efi-bgrt.c
+++ b/arch/x86/platform/efi/efi-bgrt.c
@@ -79,6 +79,10 @@ void __init efi_bgrt_init(void)
memcpy_fromio(&bmp_header, image, sizeof(bmp_header));
if (ioremapped)
early_iounmap(image, sizeof(bmp_header));
+   if (bmp_header.id != 0x4d42) {
+   pr_err("BGRT: Not a valid BMP file.\n");
+   return;
+   }
bgrt_image_size = bmp_header.size;
 
bgrt_image = kmalloc(bgrt_image_size, GFP_KERNEL | __GFP_NOWARN);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html