Re: [PATCH] kexec: Fix reboot race during device_shutdown()

2023-10-11 Thread Joel Fernandes
On Tue, Oct 10, 2023 at 5:08 PM Eric W. Biederman  wrote:
>
> Joel Fernandes  writes:
[...]
> >> That way you can get the orderly shutdown
> >> of userspace daemons/services along with an orderly shutdown of
> >> everything the kernel is responsible for.
> >
> > Fixing in userspace is an option but people are not happy that the
> > kernel can crash like that.
>
> In a kexec on panic scenario the kernel needs to perform that absolute
> bare essential shutdown before calling kexec (basically nothing).
> During kexec-on-panic nothing can be relied upon because we don't know
> what is broken.  If that is what you care about (as suggested by the
> unit test) you need to fix the device initialization.
>
> In a normal kexec scenario the whole normal reboot process is expected.
> I have no problems with fixing the kernel to handle that scenario,
> but in the real world the entire orderly shutdown both, kernel
> and userspace should be performed.

Sounds good. Since you mentioned you have no problem with fixing
regular reboot in the kernel, we will work on reproducing the issue
with regular reboot as well and fix that.

I think a syscall causing the kernel to crash instead of operate
normally is a cause of concern, so let us fix the kernel as well
(other than improving the test case as you mentioned).

> >> At the kernel level a kexec reboot and a normal reboot have been
> >> deliberately kept as close as possible.  Which is why I say we should
> >> fix it in reboot.
> >
> > You mean fix it in userspace?
>
> No.  I mean in the kernel the orderly shutdown for a kexec reboot and an
> ordinary reboot are kept as close to the same as possible.
>
> It should be the case that the only differences between the two is that
> in once case system firmware takes over after the orderly shutdown,
> and in the other case a new kernel takes over after the orderly shutdown.

Agreed.

thanks,

 - Joel

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec: Fix reboot race during device_shutdown()

2023-10-10 Thread Joel Fernandes
On Mon, Oct 9, 2023 at 10:00 AM Steven Rostedt  wrote:
>
> On Sat, 7 Oct 2023 21:30:42 -0400
> Joel Fernandes  wrote:
>
> > Just checking how we want to proceed, is the consensus that we should
> > prevent kernel crashes without relying on userspace stopping all
> > processes? Should we fix regular reboot syscall as well and not just
> > kexec reboot?
>
> If you can show that we can trigger the crash on normal reboot, then I
> don't see why not. That is, if you have a program that does the reboot
> (without the SIGSTOP/SIGKILL calls) and triggers this crash, I think that's
> a legitimate reason to fix it on normal reboot too.

Ok, Sounds good, thanks for sharing your thoughts.

 - Joel

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec: Fix reboot race during device_shutdown()

2023-10-10 Thread Joel Fernandes
On Mon, Oct 9, 2023 at 11:30 AM Eric W. Biederman  wrote:
>
> Joel Fernandes  writes:
>
> > On Mon, Oct 2, 2023 at 2:18 PM Joel Fernandes  
> > wrote:
> > [..]
> >> > > Such freezing is already being done if kernel supports KEXEC_JUMP and
> >> > > kexec_image->preserve_context is true. However, doing it if either of 
> >> > > these are
> >> > > not true prevents crashes/races.
> >> >
> >> > The KEXEC_JUMP case is something else entirely.  It is supposed to work
> >> > like suspend to RAM.  Maybe reboot should as well, but I am
> >> > uncomfortable making a generic device fix kexec specific.
> >>
> >> I see your point of view. I think regular reboot should also be fixed
> >> to avoid similar crash possibilities. I am happy to make a change for
> >> that similar to this patch if we want to proceed that way.
> >>
> >> Thoughts?
> >
> > Just checking how we want to proceed, is the consensus that we should
> > prevent kernel crashes without relying on userspace stopping all
> > processes? Should we fix regular reboot syscall as well and not just
> > kexec reboot?
>
> It just occurred to me there is something very fishy about all of this.
>
> What userspace do you have using kexec (not kexec on panic) that doesn't
> preform the same userspace shutdown as a normal reboot?
>
> Quite frankly such a userspace is buggy, and arguably that is where you
> should start fixing things.

It is a simple unit test that tests kexec support by kexec-rebooting
the kernel. I don't think SIGSTOP/SIGKILL'ing during kexec-reboot is
ideal because in a real panic-on-kexec type crash, that may not happen
and so does not emulate the real world that well. I think we want the
kexec-reboot to do a *reboot* without crashing the kernel while doing
so. Ricardo/Steve can chime on what they feel as well.

> That way you can get the orderly shutdown
> of userspace daemons/services along with an orderly shutdown of
> everything the kernel is responsible for.

Fixing in userspace is an option but people are not happy that the
kernel can crash like that.

> At the kernel level a kexec reboot and a normal reboot have been
> deliberately kept as close as possible.  Which is why I say we should
> fix it in reboot.

You mean fix it in userspace?

thanks,

 - Joel

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec: Fix reboot race during device_shutdown()

2023-10-07 Thread Joel Fernandes
On Mon, Oct 2, 2023 at 2:18 PM Joel Fernandes  wrote:
[..]
> > > Such freezing is already being done if kernel supports KEXEC_JUMP and
> > > kexec_image->preserve_context is true. However, doing it if either of 
> > > these are
> > > not true prevents crashes/races.
> >
> > The KEXEC_JUMP case is something else entirely.  It is supposed to work
> > like suspend to RAM.  Maybe reboot should as well, but I am
> > uncomfortable making a generic device fix kexec specific.
>
> I see your point of view. I think regular reboot should also be fixed
> to avoid similar crash possibilities. I am happy to make a change for
> that similar to this patch if we want to proceed that way.
>
> Thoughts?

Just checking how we want to proceed, is the consensus that we should
prevent kernel crashes without relying on userspace stopping all
processes? Should we fix regular reboot syscall as well and not just
kexec reboot?

thanks,

 - Joel

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec: Fix reboot race during device_shutdown()

2023-10-02 Thread Joel Fernandes
Hi Eric,

On Fri, Sep 29, 2023 at 12:01 PM Eric W. Biederman
 wrote:
>
> "Joel Fernandes (Google)"  writes:
>
> > During kexec reboot, it is possible for a race to occur between
> > device_shutdown() and userspace.  This causes accesses to GPU after 
> > pm_runtime
> > suspend has already happened. Fix this by calling freeze_processes() before
> > device_shutdown().
>
> Is there any reason why this same race with between sys_kexec and the
> adreno_ioctl can not happen during a normal reboot?

Thanks for the response. It can happen during a normal reboot. I think
the reason it does not show up in the wild is because the "reboot"
command implementation typically sends one of SIGSTOP or SIGKILL to
all processes which effectively prevents the race.

In any case, there is also a school of thought that says the kernel
should be resilient to crashes and a userspace workaround involving
sending signals could be looked at as papering over the real issue. I
do sympathize/agree with that school of thought as well.

> Is there any reason why there is not a .shutdown method to prevent the
> race?
> I would think the thing to do is to prevent this race in
> kernel_restart_prepare or in the GPUs .shutdown method.  As I don't see
> anything that would prevent this during a normal reboot.

What you're saying is essentially what I remember trying, the issue is
not in the GPU driver but rather there the interconnect in the SoC
shutdown and causes an "SError" exception if the CPU tries to access
the memory locations, as also seen in the stack. I was not able to
trace exactly when the interconnect becomes unavailable and perhaps
there is a possibility of a more intricate fix where we can signal the
GPU driver to not access the bus anymore, but my suspicion is that
will add a lot of complexity and perhaps leave the door open to
similar issues.

> > Such freezing is already being done if kernel supports KEXEC_JUMP and
> > kexec_image->preserve_context is true. However, doing it if either of these 
> > are
> > not true prevents crashes/races.
>
> The KEXEC_JUMP case is something else entirely.  It is supposed to work
> like suspend to RAM.  Maybe reboot should as well, but I am
> uncomfortable making a generic device fix kexec specific.

I see your point of view. I think regular reboot should also be fixed
to avoid similar crash possibilities. I am happy to make a change for
that similar to this patch if we want to proceed that way.

Thoughts?

thanks,

 - Joel

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH] kexec: Fix reboot race during device_shutdown()

2023-09-28 Thread Joel Fernandes (Google)
During kexec reboot, it is possible for a race to occur between
device_shutdown() and userspace.  This causes accesses to GPU after pm_runtime
suspend has already happened. Fix this by calling freeze_processes() before
device_shutdown().

Such freezing is already being done if kernel supports KEXEC_JUMP and
kexec_image->preserve_context is true. However, doing it if either of these are
not true prevents crashes/races.

This fixes the following crash during kexec reboot on an ARM64 device
with adreno GPU:

[  292.534314] Kernel panic - not syncing: Asynchronous SError Interrupt
[  292.534323] Hardware name: Google Lazor (rev3 - 8) with LTE (DT)
[  292.534326] Call trace:
[  292.534328]  dump_backtrace+0x0/0x1d4
[  292.534337]  show_stack+0x20/0x2c
[  292.534342]  dump_stack_lvl+0x60/0x78
[  292.534347]  dump_stack+0x18/0x38
[  292.534352]  panic+0x148/0x3b0
[  292.534357]  nmi_panic+0x80/0x94
[  292.534364]  arm64_serror_panic+0x70/0x7c
[  292.534369]  do_serror+0x0/0x7c
[  292.534372]  do_serror+0x54/0x7c
[  292.534377]  el1h_64_error_handler+0x34/0x4c
[  292.534381]  el1h_64_error+0x7c/0x80
[  292.534386]  el1_interrupt+0x20/0x58
[  292.534389]  el1h_64_irq_handler+0x18/0x24
[  292.534395]  el1h_64_irq+0x7c/0x80
[  292.534399]  local_daif_inherit+0x10/0x18
[  292.534405]  el1h_64_sync_handler+0x48/0xb4
[  292.534410]  el1h_64_sync+0x7c/0x80
[  292.534414]  a6xx_gmu_set_oob+0xbc/0x1fc
[  292.534422]  a6xx_get_timestamp+0x40/0xb4
[  292.534426]  adreno_get_param+0x12c/0x1e0
[  292.534433]  msm_ioctl_get_param+0x64/0x70
[  292.534440]  drm_ioctl_kernel+0xe8/0x158
[  292.534448]  drm_ioctl+0x208/0x320
[  292.534453]  __arm64_sys_ioctl+0x98/0xd0
[  292.534461]  invoke_syscall+0x4c/0x118

Cc: Steven Rostedt 
Cc: Ricardo Ribalda 
Cc: Ross Zwisler 
Cc: Rob Clark 
Cc: Linus Torvalds 
Tested-by: Ricardo Ribalda 
Signed-off-by: Joel Fernandes (Google) 
---
 kernel/kexec_core.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index e2f2574d8b74..6599f485e42d 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1299,6 +1299,12 @@ int kernel_kexec(void)
} else
 #endif
{
+   error = freeze_processes();
+   if (error) {
+   error = -EBUSY;
+   goto Unlock;
+   }
+
kexec_in_progress = true;
kernel_restart_prepare("kexec reboot");
migrate_to_reboot_cpu();
-- 
2.42.0.582.g8ccd20d70d-goog


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 3/3] kexec: Introduce paramters load_limit_reboot and load_limit_panic

2022-12-15 Thread Joel Fernandes
Hi Ricardo,

On Thu, Dec 08, 2022 at 05:38:02PM +0100, Ricardo Ribalda wrote:
> Add two parameter to specify how many times a kexec kernel can be loaded.
> 
> The sysadmin can set different limits for kexec panic and kexec reboot
> kernels.
> 
> The value can be modified at runtime via sysfs, but only with a value
> smaller than the current one (except -1).
> 
> Signed-off-by: Ricardo Ribalda 
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 14 
>  include/linux/kexec.h   |  2 +-
>  kernel/kexec.c  |  2 +-
>  kernel/kexec_core.c | 91 
> -
>  kernel/kexec_file.c |  2 +-
>  5 files changed, 106 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index 42af9ca0127e..2b37d6a20747 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2374,6 +2374,20 @@
>   for Movable pages.  "nn[KMGTPE]", "nn%", and "mirror"
>   are exclusive, so you cannot specify multiple forms.
>  
> + kexec_core.load_limit_reboot=
> + kexec_core.load_limit_panic=
> + [KNL]
> + This parameter specifies a limit to the number of times
> + a kexec kernel can be loaded.
> + Format: 
> + -1  = Unlimited.
> + int = Number of times kexec can be called.
> +
> + During runtime, this parameter can be modified with a
> + value smaller than the current one (but not -1).
> +
> + Default: -1
> +
>   kgdbdbgp=   [KGDB,HW] kgdb over EHCI usb debug port.
>   Format: [,poll interval]
>   The controller # is the number of the ehci usb debug
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e9e1ab5e8006..3d7d10f7187a 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -407,7 +407,7 @@ extern int kimage_crash_copy_vmcoreinfo(struct kimage 
> *image);
>  extern struct kimage *kexec_image;
>  extern struct kimage *kexec_crash_image;
>  
> -bool kexec_load_permited(void);
> +bool kexec_load_permited(bool crash_image);
>  
>  #ifndef kexec_flush_icache_page
>  #define kexec_flush_icache_page(page)
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index d83fc9093aff..2b0856e83fe1 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -193,7 +193,7 @@ static inline int kexec_load_check(unsigned long 
> nr_segments,
>   int result;
>  
>   /* We only trust the superuser with rebooting the system. */
> - if (!kexec_load_permited())
> + if (!kexec_load_permited(flags & KEXEC_ON_CRASH))

nit: permitted.

>   return -EPERM;
>  
>   /* Permit LSMs and IMA to fail the kexec */
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 18bd90ca9c99..7f9d5288b24b 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -952,13 +952,100 @@ static int __init kexec_core_sysctl_init(void)
>  late_initcall(kexec_core_sysctl_init);
>  #endif
>  
> -bool kexec_load_permited(void)
> +struct kexec_load_limit {
> + /* Mutex protects the limit count. */
> + struct mutex mutex;
> + int limit;

Can you not just use atomic ops for limit, and get rid of the mutex?

That will simplify the code as well.

> +};
> +
> +struct kexec_load_limit load_limit_reboot = {
> + .mutex = __MUTEX_INITIALIZER(load_limit_reboot.mutex),
> + .limit = -1,
> +};
> +
> +struct kexec_load_limit load_limit_panic = {
> + .mutex = __MUTEX_INITIALIZER(load_limit_panic.mutex),
> + .limit = -1,
> +};
> +
> +static int param_get_limit(char *buffer, const struct kernel_param *kp)
>  {
> + int ret;
> + struct kexec_load_limit *limit = kp->arg;
> +
> + mutex_lock(>mutex);
> + ret = scnprintf(buffer, PAGE_SIZE, "%i\n", limit->limit);
> + mutex_unlock(>mutex);
> +
> + return ret;
> +}
> +
> +static int param_set_limit(const char *buffer, const struct kernel_param *kp)
> +{
> + int ret;
> + struct kexec_load_limit *limit = kp->arg;
> + int new_val;
> +
> + ret = kstrtoint(buffer, 0, _val);
> + if (ret)
> + return ret;
> +
> + new_val = max(-1, new_val);
> +
> + mutex_lock(>mutex);
> +
> + if (new_val == -1 && limit->limit != -1) {
> + ret = -EINVAL;
> + goto done;
> + }
> +
> + if (limit->limit != -1 && new_val > limit->limit) {
> + ret = -EINVAL;
> + goto done;
> + }
> +
> + limit->limit = new_val;
> +
> +done:
> + mutex_unlock(>mutex);
> +
> + return ret;
> +}
> +
> +static const struct kernel_param_ops load_limit_ops = {
> + .get = param_get_limit,
> + .set = param_set_limit,
> 

Re: [PATCH RFC] kexec: Freeze processes before kexec

2022-11-17 Thread Joel Fernandes
On Thu, Nov 17, 2022 at 3:46 PM Philipp Rudo  wrote:

> On Wed, 16 Nov 2022 15:16:10 -0500
> Steven Rostedt  wrote:
>
> > On Wed, 16 Nov 2022 19:56:24 +
> > "Joel Fernandes (Google)"  wrote:
> >
> > > --- a/kernel/kexec_core.c
> > > +++ b/kernel/kexec_core.c
> > > @@ -1175,6 +1175,12 @@ int kernel_kexec(void)
> > > } else
> > >  #endif
> > > {
> > > +   error = freeze_processes();
> > > +   if (error) {
> > > +   error = -EBUSY;
> > > +   goto Unlock;
> > > +   }
> >
> > If this is the path of a kernel panic, do we really want to check the
> > return error of freeze_processes()? We are panicing, there's not much more
> > we can do.
>
> kernel_kexec isn't called during panic. We don't need to worry about it
> here.

Indeed, sorry for my hasty comment and for misleading Steve. This is
seen to happen when doing manual kexec from the reboot(2) system call.
During a regular panic, machine_shutdown() is not called so this would
not happen. However, for testing, the crash is a hurdle.

> Having that said I think this is a problem in the device driver _not_
> in kexec. In my opinion it's the job of the driver to prevent such
> races during shutdown.

Thanks a lot for your input. Rob, what do you think?

thanks,

 - Joel

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH RFC] kexec: Freeze processes before kexec

2022-11-16 Thread Joel Fernandes
Hey Steve,

On Wed, Nov 16, 2022 at 8:15 PM Steven Rostedt  wrote:
>
> On Wed, 16 Nov 2022 19:56:24 +0000
> "Joel Fernandes (Google)"  wrote:
>
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -1175,6 +1175,12 @@ int kernel_kexec(void)
> >   } else
> >  #endif
> >   {
> > + error = freeze_processes();
> > + if (error) {
> > + error = -EBUSY;
> > + goto Unlock;
> > + }
>
> If this is the path of a kernel panic, do we really want to check the
> return error of freeze_processes()? We are panicing, there's not much more
> we can do.

I am OK with not checking the return of freeze_processes() and trying
to shut down anyway. Will re-spin after any other feedback.

Thanks,

 - Joel

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH RFC] kexec: Freeze processes before kexec

2022-11-16 Thread Joel Fernandes (Google)
During kexec, it is possible for userspace to race with
device_shutdown() causing accesses to GPU after pm_runtime suspend has
already happened. Fix by freezing userspace before device_shutdown().

Such freezing is already being done if kernel supports KEXEC_JUMP and
kexec_image->preserve_context is true. However, doing it if either of
these are not true prevents crashes/races.

This fixes the following crash during kexec reboot on an ARM64 device
with adreno GPU:

[  292.534314] Kernel panic - not syncing: Asynchronous SError Interrupt
[  292.534323] Hardware name: Google Lazor (rev3 - 8) with LTE (DT)
[  292.534326] Call trace:
[  292.534328]  dump_backtrace+0x0/0x1d4
[  292.534337]  show_stack+0x20/0x2c
[  292.534342]  dump_stack_lvl+0x60/0x78
[  292.534347]  dump_stack+0x18/0x38
[  292.534352]  panic+0x148/0x3b0
[  292.534357]  nmi_panic+0x80/0x94
[  292.534364]  arm64_serror_panic+0x70/0x7c
[  292.534369]  do_serror+0x0/0x7c
[  292.534372]  do_serror+0x54/0x7c
[  292.534377]  el1h_64_error_handler+0x34/0x4c
[  292.534381]  el1h_64_error+0x7c/0x80
[  292.534386]  el1_interrupt+0x20/0x58
[  292.534389]  el1h_64_irq_handler+0x18/0x24
[  292.534395]  el1h_64_irq+0x7c/0x80
[  292.534399]  local_daif_inherit+0x10/0x18
[  292.534405]  el1h_64_sync_handler+0x48/0xb4
[  292.534410]  el1h_64_sync+0x7c/0x80
[  292.534414]  a6xx_gmu_set_oob+0xbc/0x1fc
[  292.534422]  a6xx_get_timestamp+0x40/0xb4
[  292.534426]  adreno_get_param+0x12c/0x1e0
[  292.534433]  msm_ioctl_get_param+0x64/0x70
[  292.534440]  drm_ioctl_kernel+0xe8/0x158
[  292.534448]  drm_ioctl+0x208/0x320
[  292.534453]  __arm64_sys_ioctl+0x98/0xd0
[  292.534461]  invoke_syscall+0x4c/0x118
[  292.534467]  el0_svc_common+0x98/0x104
[  292.534473]  do_el0_svc+0x30/0x80
[  292.534478]  el0_svc+0x20/0x50
[  292.534481]  el0t_64_sync_handler+0x78/0x108
[  292.534485]  el0t_64_sync+0x1a4/0x1a8
[  292.534632] Kernel Offset: 0x1a5f80 from 0xffc00800
[  292.534635] PHYS_OFFSET: 0x8000
[  292.534638] CPU features: 0x40018541,a3300e42
[  292.534644] Memory Limit: none

Cc: rost...@goodmis.org
Cc: riba...@google.com
Cc: zwis...@google.com
Cc: robdcl...@gmail.com
Signed-off-by: Joel Fernandes (Google) 
---
 kernel/kexec_core.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index ca2743f9c634..2614a7f1f8a6 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1175,6 +1175,12 @@ int kernel_kexec(void)
} else
 #endif
{
+   error = freeze_processes();
+   if (error) {
+   error = -EBUSY;
+   goto Unlock;
+   }
+
kexec_in_progress = true;
kernel_restart_prepare("kexec reboot");
migrate_to_reboot_cpu();
-- 
2.38.1.584.g0f3c55d4c2-goog


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec