Re: Reboot crash at msm_atomic_commit_tail

2021-01-19 Thread Daniel Vetter
On Tue, Jan 19, 2021 at 6:36 PM Fabio Estevam  wrote:
>
> Hi Rob,
>
> On Tue, Jan 19, 2021 at 1:40 PM Rob Clark  wrote:
>
> > > I suppose we should do the drm_atomic_helper_shutdown() conditionally?
>
> This suggestion works, thanks. I will submit a patch shortly.

I think the cleanest way to do this is 2 patches:
- add checks for DRIVER_MODESET to these helpers (there's a function
to check these flags)
- filter out DRIVER_MODESET flag in drm_device.driver_features (this
is used to substract features at load time so that the drm_device
struct can stay const)

There's probably other drivers that'd need the same treatment (but
they don't use these helpers yet).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Reboot crash at msm_atomic_commit_tail

2021-01-19 Thread Fabio Estevam
Hi Rob,

On Tue, Jan 19, 2021 at 1:40 PM Rob Clark  wrote:

> > I suppose we should do the drm_atomic_helper_shutdown() conditionally?

This suggestion works, thanks. I will submit a patch shortly.

Thanks
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Reboot crash at msm_atomic_commit_tail

2021-01-19 Thread Rob Clark
forgot to CC Krishna

On Tue, Jan 19, 2021 at 8:34 AM Rob Clark  wrote:
>
> On Mon, Jan 18, 2021 at 11:00 PM Daniel Vetter  wrote:
> >
> > On Mon, Jan 18, 2021 at 11:00 PM Fabio Estevam  wrote:
> > >
> > > On Mon, Jan 18, 2021 at 6:44 PM Fabio Estevam  wrote:
> > > >
> > > > Adding some more folks in case anyone has any suggestions to fix this
> > > > reboot hang.
> > >
> > > Not sure if this is a valid fix, but the change below makes reboot
> > > works correctly.
> > >
> > > kmscube still works.
> > >
> > > --- a/drivers/gpu/drm/msm/msm_atomic.c
> > > +++ b/drivers/gpu/drm/msm/msm_atomic.c
> > > @@ -207,8 +207,12 @@ void msm_atomic_commit_tail(struct drm_atomic_state 
> > > *state)
> > > struct msm_kms *kms = priv->kms;
> > > struct drm_crtc *async_crtc = NULL;
> > > unsigned crtc_mask = get_crtc_mask(state);
> > > -   bool async = kms->funcs->vsync_time &&
> > > -   can_do_async(state, &async_crtc);
> > > +   bool async;
> > > +
> > > +   if (!kms)
> > > +   return;
> >
> > That looks a bit like a hack papering over the real issue.
> >
> > From your report it sounds like earlier kernels worked, did you
> > attempt bisecting? Also for regressions put regressions into the
> > subject, it's the magic work that gets much more attention.
>
> the root issue is how are we doing KMS stuff on imx (where drm/msm is
> only used for gpu).. which I think is this commit:
>
> --
> commit 9d5cbf5fe46e350715389d89d0c350d83289a102
> Author: Krishna Manikandan 
> AuthorDate: Mon Jun 1 16:33:22 2020 +0530
> Commit: Rob Clark 
> CommitDate: Tue Aug 18 08:09:01 2020 -0700
>
> drm/msm: add shutdown support for display platform_driver
>
> Define shutdown callback for display drm driver,
> so as to disable all the CRTCS when shutdown
> notification is received by the driver.
>
> This change will turn off the timing engine so
> that no display transactions are requested
> while mmu translations are getting disabled
> during reboot sequence.
>
> Signed-off-by: Krishna Manikandan 
>
> Changes in v2:
> - Remove NULL check from msm_pdev_shutdown (Stephen Boyd)
> - Change commit text to reflect when this issue
>   was uncovered (Sai Prakash Ranjan)
>
> Signed-off-by: Rob Clark 
> --
>
> I suppose we should do the drm_atomic_helper_shutdown() conditionally?
>  Or the helper should bail if there is no kms?
>
> BR,
> -R
>
> > -Daniel
> >
> > > +
> > > +   async = kms->funcs->vsync_time && can_do_async(state, 
> > > &async_crtc);
> > >
> > > trace_msm_atomic_commit_tail_start(async, crtc_mask);
> > >
> > > Any comments?
> > >
> > > Thanks
> > > ___
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Reboot crash at msm_atomic_commit_tail

2021-01-19 Thread Rob Clark
On Mon, Jan 18, 2021 at 11:00 PM Daniel Vetter  wrote:
>
> On Mon, Jan 18, 2021 at 11:00 PM Fabio Estevam  wrote:
> >
> > On Mon, Jan 18, 2021 at 6:44 PM Fabio Estevam  wrote:
> > >
> > > Adding some more folks in case anyone has any suggestions to fix this
> > > reboot hang.
> >
> > Not sure if this is a valid fix, but the change below makes reboot
> > works correctly.
> >
> > kmscube still works.
> >
> > --- a/drivers/gpu/drm/msm/msm_atomic.c
> > +++ b/drivers/gpu/drm/msm/msm_atomic.c
> > @@ -207,8 +207,12 @@ void msm_atomic_commit_tail(struct drm_atomic_state 
> > *state)
> > struct msm_kms *kms = priv->kms;
> > struct drm_crtc *async_crtc = NULL;
> > unsigned crtc_mask = get_crtc_mask(state);
> > -   bool async = kms->funcs->vsync_time &&
> > -   can_do_async(state, &async_crtc);
> > +   bool async;
> > +
> > +   if (!kms)
> > +   return;
>
> That looks a bit like a hack papering over the real issue.
>
> From your report it sounds like earlier kernels worked, did you
> attempt bisecting? Also for regressions put regressions into the
> subject, it's the magic work that gets much more attention.

the root issue is how are we doing KMS stuff on imx (where drm/msm is
only used for gpu).. which I think is this commit:

--
commit 9d5cbf5fe46e350715389d89d0c350d83289a102
Author: Krishna Manikandan 
AuthorDate: Mon Jun 1 16:33:22 2020 +0530
Commit: Rob Clark 
CommitDate: Tue Aug 18 08:09:01 2020 -0700

drm/msm: add shutdown support for display platform_driver

Define shutdown callback for display drm driver,
so as to disable all the CRTCS when shutdown
notification is received by the driver.

This change will turn off the timing engine so
that no display transactions are requested
while mmu translations are getting disabled
during reboot sequence.

Signed-off-by: Krishna Manikandan 

Changes in v2:
- Remove NULL check from msm_pdev_shutdown (Stephen Boyd)
- Change commit text to reflect when this issue
  was uncovered (Sai Prakash Ranjan)

Signed-off-by: Rob Clark 
--

I suppose we should do the drm_atomic_helper_shutdown() conditionally?
 Or the helper should bail if there is no kms?

BR,
-R

> -Daniel
>
> > +
> > +   async = kms->funcs->vsync_time && can_do_async(state, &async_crtc);
> >
> > trace_msm_atomic_commit_tail_start(async, crtc_mask);
> >
> > Any comments?
> >
> > Thanks
> > ___
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Reboot crash at msm_atomic_commit_tail

2021-01-18 Thread Daniel Vetter
On Mon, Jan 18, 2021 at 11:00 PM Fabio Estevam  wrote:
>
> On Mon, Jan 18, 2021 at 6:44 PM Fabio Estevam  wrote:
> >
> > Adding some more folks in case anyone has any suggestions to fix this
> > reboot hang.
>
> Not sure if this is a valid fix, but the change below makes reboot
> works correctly.
>
> kmscube still works.
>
> --- a/drivers/gpu/drm/msm/msm_atomic.c
> +++ b/drivers/gpu/drm/msm/msm_atomic.c
> @@ -207,8 +207,12 @@ void msm_atomic_commit_tail(struct drm_atomic_state 
> *state)
> struct msm_kms *kms = priv->kms;
> struct drm_crtc *async_crtc = NULL;
> unsigned crtc_mask = get_crtc_mask(state);
> -   bool async = kms->funcs->vsync_time &&
> -   can_do_async(state, &async_crtc);
> +   bool async;
> +
> +   if (!kms)
> +   return;

That looks a bit like a hack papering over the real issue.

>From your report it sounds like earlier kernels worked, did you
attempt bisecting? Also for regressions put regressions into the
subject, it's the magic work that gets much more attention.
-Daniel

> +
> +   async = kms->funcs->vsync_time && can_do_async(state, &async_crtc);
>
> trace_msm_atomic_commit_tail_start(async, crtc_mask);
>
> Any comments?
>
> Thanks
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Reboot crash at msm_atomic_commit_tail

2021-01-18 Thread Fabio Estevam
On Mon, Jan 18, 2021 at 6:44 PM Fabio Estevam  wrote:
>
> Adding some more folks in case anyone has any suggestions to fix this
> reboot hang.

Not sure if this is a valid fix, but the change below makes reboot
works correctly.

kmscube still works.

--- a/drivers/gpu/drm/msm/msm_atomic.c
+++ b/drivers/gpu/drm/msm/msm_atomic.c
@@ -207,8 +207,12 @@ void msm_atomic_commit_tail(struct drm_atomic_state *state)
struct msm_kms *kms = priv->kms;
struct drm_crtc *async_crtc = NULL;
unsigned crtc_mask = get_crtc_mask(state);
-   bool async = kms->funcs->vsync_time &&
-   can_do_async(state, &async_crtc);
+   bool async;
+
+   if (!kms)
+   return;
+
+   async = kms->funcs->vsync_time && can_do_async(state, &async_crtc);

trace_msm_atomic_commit_tail_start(async, crtc_mask);

Any comments?

Thanks
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Reboot crash at msm_atomic_commit_tail

2021-01-18 Thread Fabio Estevam
Adding some more folks in case anyone has any suggestions to fix this
reboot hang.

Thanks

On Tue, Jan 12, 2021 at 5:07 PM Fabio Estevam  wrote:
>
> Hi,
>
> I have noticed that on an imx53-qsb, it is no longer possible to
> reboot the system as it fails like this:
>
> Requesting system reboot
> [   23.819116] cfg80211: failed to load regulatory.db
> [   23.827569] imx-sdma 63fb.sdma: external firmware not found,
> using ROM firmware
> [   23.956838] ci_hdrc ci_hdrc.0: remove, state 1
> [   23.968029] usb usb1: USB disconnect, device number 1
> [   23.976033] usb 1-1: USB disconnect, device number 2
> [   24.234253] ci_hdrc ci_hdrc.0: USB bus 1 deregistered
> [   24.268964] 8<--- cut here ---
> [   24.274602] Unable to handle kernel NULL pointer dereference at
> virtual address 
> [   24.283434] pgd = (ptrval)
> [   24.286387] [] *pgd=ca212831
> [   24.290788] Internal error: Oops: 17 [#1] SMP ARM
> [   24.295609] Modules linked in:
> [   24.298777] CPU: 0 PID: 197 Comm: init Not tainted
> 5.11.0-rc2-next-20210111 #333
> [   24.306276] Hardware name: Freescale i.MX53 (Device Tree Support)
> [   24.312442] PC is at msm_atomic_commit_tail+0x54/0xb9c
> [   24.317743] LR is at commit_tail+0xa4/0x1b0
> [   24.322032] pc : []lr : []psr: 6013
> [   24.328374] sp : c28d1d50  ip : c23a3000  fp : 
> [   24.333670] r10: c2816780  r9 : c12d71c0  r8 : c17fb018
> [   24.338967] r7 : c23a3000  r6 : c2816780  r5 :   r4 : 
> [   24.345572] r3 : c24c2c00  r2 : c23a3000  r1 : c0769b24  r0 : 
> [   24.352177] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment 
> none
> [   24.359407] Control: 10c5387d  Table: 72858019  DAC: 0051
> [   24.365220] Process init (pid: 197, stack limit = 0x(ptrval))
> [   24.371052] Stack: (0xc28d1d50 to 0xc28d2000)
> [   24.375508] 1d40: 
>  9682f000 0005
> [   24.383794] 1d60: 3031e53d  0dc0 c0f816d8 c23a3000
> c23a3000  c17fb018
> [   24.392079] 1d80: c12d71c0 c2816780  c06db0b4 a5f7faba
> 0005  c2816780
> [   24.400363] 1da0:  c23a3000  c17fb018 c12d71c0
> c24c20a0  c06dbed0
> [   24.408647] 1dc0:   c2816780 c23a349c c2816780
> c28d1dfc c23a34a4 c06db604
> [   24.416932] 1de0: c23a3000  c1609388 c12ba9dc c17fb018
> c06db704 c2965e80 c2965e80
> [   24.425214] 1e00: 0008 0001   c175f454
>  c175f458 c1c669cc
> [   24.433498] 1e20:  c12bebb8  0001 0008
>  c23a32ec c23a32ec
> [   24.441783] 1e40:  433f193b c24c2014 c24c2014 c24c2010
> c17674c8 c1e68bec c07c76e8
> [   24.450067] 1e60:  c16158d8 c1609388 fee1dead 
> c28d 0058 c0153730
> [   24.458350] 1e80: 01234567 c01539d4 fffe  
>   
> [   24.466633] 1ea0:     
>   c1609388
> [   24.474917] 1ec0: c29663c8   c0e17954 e000
>  0001 
> [   24.483200] 1ee0: c1609388 c1609388 c16093d4 433f193b 
> c1581584 e000 1ea51000
> [   24.491485] 1f00: 0001 0080 c1609388 c1609794 c29663c8
>   c0e17954
> [   24.499769] 1f20:    c1609388 c1609388
> c16093d4  c1609388
> [   24.508054] 1f40: c29663c8 c0183f24 c2965e80 c1609388 0001
> c1609794 c2995090 c018ce7c
> [   24.516337] 1f60: 0001 c2995080 c0136e80 c010012c 
> 0001 c158b21c c0e22334
> [   24.524622] 1f80: c158b21c c010019c c1609794 433f193b 
> beefefd4 0001 0058
> [   24.532907] 1fa0: c0100264 c0100080  beefefd4 fee1dead
> 28121969 01234567 
> [   24.541191] 1fc0:  beefefd4 0001 0058 
>  b6f1ef74 
> [   24.549476] 1fe0: 000d7298 beefed40 00091a48 b6e8894c 6010
> fee1dead  
> [   24.557742] [] (msm_atomic_commit_tail) from []
> (commit_tail+0xa4/0x1b0)
> [   24.566349] [] (commit_tail) from []
> (drm_atomic_helper_commit+0x154/0x188)
> [   24.575193] [] (drm_atomic_helper_commit) from
> [] (drm_atomic_helper_disable_all+0x154/0x1c0)
> [   24.585599] [] (drm_atomic_helper_disable_all) from
> [] (drm_atomic_helper_shutdown+0x94/0x12c)
> [   24.596094] [] (drm_atomic_helper_shutdown) from
> [] (device_shutdown+0x118/0x250)
> [   24.605475] [] (device_shutdown) from []
> (kernel_restart+0xc/0x68)
> [   24.613574] [] (kernel_restart) from []
> (__do_sys_reboot+0x144/0x200)
> [   24.621915] [] (__do_sys_reboot) from []
> (ret_fast_syscall+0x0/0x2c)
> [   24.630160] Exception stack(0xc28d1fa8 to 0xc28d1ff0)
> [   24.635315] 1fa0:    beefefd4 fee1dead
> 28121969 01234567 
> [   24.643600] 1fc0:  beefefd4 0001 0058 
>  b6f1ef74 
> [   24.651867] 1fe0: 000d7298 beefed40 00091a48 b6e8894c
> [   24.657025] Code: 1592208c 118552