RE: [PATCH v4 1/2] dt-bindings: display: bridge: Document RZ/G2L MIPI DSI TX bindings

2022-07-24 Thread Biju Das
Hi Sam,

Thanks for the feedback.

> Subject: Re: [PATCH v4 1/2] dt-bindings: display: bridge: Document
> RZ/G2L MIPI DSI TX bindings
> 
> Hi Biju,
> 
> On Fri, Jul 22, 2022 at 08:19:23PM +0100, Biju Das wrote:
> > The RZ/G2L MIPI DSI TX is embedded in the Renesas RZ/G2L family SoC's.
> > It can operate in DSI mode, with up to four data lanes.
> >
> > Signed-off-by: Biju Das 
> > Reviewed-by: Rob Herring 
> > Reviewed-by: Laurent Pinchart 
> > Reviewed-by: Geert Uytterhoeven 
> > ---
> > v3->v4:
> >  * No change.
> > v2->v3:
> >  * Added Rb tag from Geert and Laurent
> >  * Fixed the typo "Receive" -> "transmit"
> >  * Added accepible values for data-lanes
> >  * Sorted Header file in the example
> >  * Added SoC specific compaible along with generic one.
> > v1->v2:
> >  * Added full path for dsi-controller.yaml
> >  * Modeled DSI + D-PHY as single block and updated reg property
> >  * Fixed typo D_PHY->D-PHY
> >  * Updated description
> >  * Added interrupts and interrupt-names and updated the example
> > RFC->v1:
> >  * Added a ref to dsi-controller.yaml.
> > RFC:-
> >  *
> > ---
> >  .../bindings/display/bridge/renesas,dsi.yaml  | 182
> > ++
> >  1 file changed, 182 insertions(+)
> >  create mode 100644
> > Documentation/devicetree/bindings/display/bridge/renesas,dsi.yaml
> >
> > diff --git
> > a/Documentation/devicetree/bindings/display/bridge/renesas,dsi.yaml
> > b/Documentation/devicetree/bindings/display/bridge/renesas,dsi.yaml
> > new file mode 100644
> > index ..131d5b63ec4f
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/display/bridge/renesas,dsi.yam
> > +++ l
> > @@ -0,0 +1,182 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) %YAML 1.2
> > +---
> > +
> > +title: Renesas RZ/G2L MIPI DSI Encoder
> > +
> > +maintainers:
> > +  - Biju Das 
> > +
> > +description: |
> > +  This binding describes the MIPI DSI encoder embedded in the Renesas
> > +  RZ/G2L alike family of SoC's. The encoder can operate in DSI mode,
> > +with
> > +  up to four data lanes.
> > +
> > +allOf:
> > +  - $ref: /schemas/display/dsi-controller.yaml#
> > +
> > +properties:
> > +  compatible:
> > +items:
> > +  - enum:
> > +  - renesas,r9a07g044-mipi-dsi # RZ/G2{L,LC}
> > +  - const: renesas,rzg2l-mipi-dsi
> > +
> > +  reg:
> > +maxItems: 1
> > +
> > +  interrupts:
> > +items:
> > +  - description: Sequence operation channel 0 interrupt
> > +  - description: Sequence operation channel 1 interrupt
> > +  - description: Video-Input operation channel 1 interrupt
> > +  - description: DSI Packet Receive interrupt
> > +  - description: DSI Fatal Error interrupt
> > +  - description: DSI D-PHY PPI interrupt
> > +  - description: Debug interrupt
> This is an awful lot of interrupts.
> Is this really individual interrupts or status bits in a single
> interrupt? If it is the latter then there should be only one interrupt
> defined.

It is individual interrupts.

As per the hw manual,  these interrupts have dedicated IRQ lines
as below.

dsi_int_seq0 174 SPI 142 IRQ 142 Level
dsi_int_seq1 175 SPI 143 IRQ 143 Level
dsi_int_vin1 176 SPI 144 IRQ 144 Level
dsi_int_rcv 177 SPI 145 IRQ 145 Level
dsi_int_ferr 178 SPI 146 IRQ 146 Level
dsi_int_ppi 179 SPI 147 IRQ 147 Level
dsi_int_debug 180 SPI 148 IRQ 148 Level

Cheers,
Biju


[PATCH] drm/bridge: tc358767: Handle bridge past DPI output

2022-07-24 Thread Marek Vasut
Currently the driver only handles panel directly connected to the DPI output.
Handle the case where a bridge is connected past DPI output of this bridge.
This could be e.g. DPI to LVDS encoder chip.

Signed-off-by: Marek Vasut 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Lucas Stach 
Cc: Maxime Ripard 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: Sam Ravnborg 
---
 drivers/gpu/drm/bridge/tc358767.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358767.c 
b/drivers/gpu/drm/bridge/tc358767.c
index f8c1cd711753b..814ab206fe7ef 100644
--- a/drivers/gpu/drm/bridge/tc358767.c
+++ b/drivers/gpu/drm/bridge/tc358767.c
@@ -1926,22 +1926,23 @@ static int tc_mipi_dsi_host_attach(struct tc_data *tc)
 static int tc_probe_dpi_bridge_endpoint(struct tc_data *tc)
 {
struct device *dev = tc->dev;
+   struct drm_bridge *bridge;
struct drm_panel *panel;
int ret;
 
/* port@1 is the DPI input/output port */
-   ret = drm_of_find_panel_or_bridge(dev->of_node, 1, 0, , NULL);
+   ret = drm_of_find_panel_or_bridge(dev->of_node, 1, 0, , );
if (ret && ret != -ENODEV)
return ret;
 
if (panel) {
-   struct drm_bridge *panel_bridge;
-
-   panel_bridge = devm_drm_panel_bridge_add(dev, panel);
-   if (IS_ERR(panel_bridge))
-   return PTR_ERR(panel_bridge);
+   bridge = devm_drm_panel_bridge_add(dev, panel);
+   if (IS_ERR(bridge))
+   return PTR_ERR(bridge);
+   }
 
-   tc->panel_bridge = panel_bridge;
+   if (bridge) {
+   tc->panel_bridge = bridge;
tc->bridge.type = DRM_MODE_CONNECTOR_DPI;
tc->bridge.funcs = _dpi_bridge_funcs;
 
-- 
2.35.1



Re: [PATCH v2] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Javier Martinez Canillas
On 7/24/22 22:10, Dmitry Baryshkov wrote:
> On Sun, 24 Jul 2022 at 22:51, Javier Martinez Canillas
>  wrote:
>>
>> On 7/24/22 20:47, Javier Martinez Canillas wrote:
>>> Hello Dmitry,
>>
>> [...]
>>
 Now there is no point in having this as a separate function. Could you
>>>
>>> The only reason why I kept this was to avoid duplicating the same comment
>>> in two places. I thought that an inline function would be better than that.
>>>
 please inline it?

>>
>> Or do you mean inline it as dropping the wrapper helper and just call to
>> drm_atomic_helper_shutdown() in both callbacks ? I'm OK with that but as
>> mentioned then we should probably have to duplicate the comment.
>>
>> Since is marked as inline anyways, the resulting code should be the same.
> 
> Yes, I'd like for you to drop the wrapper. I'm fine with duplicating
> the comment, since it will be in place where it matters (before
> checking ddev->registered) rather than just stating the contract for
> the wrapper (which can be easily ignored).
> 
> (And yes, I do read patches and commit messages before commenting.)
> 

OK. I'll post a v3 tomorrow doing that then. Sorry for the misunderstanding.

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH v2] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Dmitry Baryshkov
On Sun, 24 Jul 2022 at 22:51, Javier Martinez Canillas
 wrote:
>
> On 7/24/22 20:47, Javier Martinez Canillas wrote:
> > Hello Dmitry,
>
> [...]
>
> >> Now there is no point in having this as a separate function. Could you
> >
> > The only reason why I kept this was to avoid duplicating the same comment
> > in two places. I thought that an inline function would be better than that.
> >
> >> please inline it?
> >>
>
> Or do you mean inline it as dropping the wrapper helper and just call to
> drm_atomic_helper_shutdown() in both callbacks ? I'm OK with that but as
> mentioned then we should probably have to duplicate the comment.
>
> Since is marked as inline anyways, the resulting code should be the same.

Yes, I'd like for you to drop the wrapper. I'm fine with duplicating
the comment, since it will be in place where it matters (before
checking ddev->registered) rather than just stating the contract for
the wrapper (which can be easily ignored).

(And yes, I do read patches and commit messages before commenting.)

-- 
With best wishes
Dmitry


Re: [PATCH v2] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Javier Martinez Canillas
On 7/24/22 20:47, Javier Martinez Canillas wrote:
> Hello Dmitry,

[...]

>> Now there is no point in having this as a separate function. Could you
> 
> The only reason why I kept this was to avoid duplicating the same comment
> in two places. I thought that an inline function would be better than that.
> 
>> please inline it?
>>

Or do you mean inline it as dropping the wrapper helper and just call to
drm_atomic_helper_shutdown() in both callbacks ? I'm OK with that but as
mentioned then we should probably have to duplicate the comment.

Since is marked as inline anyways, the resulting code should be the same.

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH v2] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Javier Martinez Canillas
Hello Dmitry,

Thanks for your feedback.

On 7/24/22 20:36, Dmitry Baryshkov wrote:
> On Sun, 24 Jul 2022 at 14:13, Javier Martinez Canillas
>  wrote:

[...]

>>
>> +/*
>> + * Shutdown the hw if we're far enough along where things might be on.
>> + * If we run this too early, we'll end up panicking in any variety of
>> + * places. Since we don't register the drm device until late in
>> + * msm_drm_init, drm_dev->registered is used as an indicator that the
>> + * shutdown will be successful.
>> + *
>> + * This function must only be called if drm_dev->registered is true.
>> + */
>> +static inline void msm_shutdown_hw(struct drm_device *dev)
>> +{
>> +   drm_atomic_helper_shutdown(dev);
>> +}
> 
> Now there is no point in having this as a separate function. Could you

The only reason why I kept this was to avoid duplicating the same comment
in two places. I thought that an inline function would be better than that.

> please inline it?
>

That's already the case. Sorry but I have to ask, do you read my patches
before commenting? I have the feeling that is the second time that you ask
for something that was already done in the patch.
 
[...]

>>
>> -   if (!priv || !priv->kms)
>> -   return;
>> -
>> -   drm_atomic_helper_shutdown(drm);
> 
> It might be worth repeating the comment here.
>

As mentioned I thought about it. But then decided that an inline function would
be better to have the comment just in one place. I don't have a strong opinion
though so I could change if others agree that duplicating the comment is better.

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config

2022-07-24 Thread Javier Martinez Canillas
Hello Thomas,

Thanks for your feedback.

On 7/24/22 20:24, Thomas Zimmermann wrote:
> Hi Javier
> 
> Am 24.07.22 um 14:37 schrieb Javier Martinez Canillas:
>> DRM drivers initialize the mode configuration with drmm_mode_config_init()
>> and that function (among other things) initializes mutexes that are later
>> used by modeset helpers.
>>
>> But the helpers should only attempt to grab those locks if the mode config
>> was properly initialized. Otherwise it can lead to kernel oops. An example
>> is when a DRM driver using the component framework does not initialize the
>> drm_mode_config, because its .bind callback was not being executed due one
>> of its expected sub-devices' driver failing to probe.
>>
>> Some drivers check the struct drm_driver.registered field as an indication
>> on whether their .shutdown callback should call helpers to tearn down the
>> mode configuration or not, but most drivers just assume that it is always
>> safe to call helpers such as drm_atomic_helper_shutdown() during shutdown.
>>
>> Let make the DRM core more robust and prevent this to happen, by marking a
>> struct drm_mode_config as initialized during drmm_mode_config_init(). that
>> way helpers can check for it and not attempt to grab uninitialized mutexes.
> 
> I disagree. This patch looks like cargo-cult programming and entirely 
> arbitrary.  The solution here is to fix drivers.  The actual test to 
> perform is to instrument the mutex implementation to detect 
> uninitialized mutexes.
>

While I do agree that drivers should be fixed, IMO we should try to make it
hard for the kernel to crash. We already have checks in other DRM helpers to
avoid accessing uninitialized data, so I don't see why we couldn't do the
same here.

I wrote this patch after fixing a bug in the drm/msm driver [0]. By looking
at how other drivers handled this case, I'm pretty sure that they have the
same problem. A warning is much better than a kernel crash during shutdown.

[0]: 
https://patchwork.kernel.org/project/dri-devel/patch/20220724111327.1195693-1-javi...@redhat.com/
 
> Best regards
> Thomas
> 

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH] drm/msm/dpu: Fix comment typo

2022-07-24 Thread Abhinav Kumar




On 7/24/2022 1:42 PM, Jason Wang wrote:

The double `be' is duplicated in the comment, remove one.

Signed-off-by: Jason Wang 

Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Reviewed-by: Abhinav Kumar 

---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 71fe4c505f5b..38aa38ab1568 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -76,7 +76,7 @@ enum {
  
  /**

   * MDP TOP BLOCK features
- * @DPU_MDP_PANIC_PER_PIPE Panic configuration needs to be be done per pipe
+ * @DPU_MDP_PANIC_PER_PIPE Panic configuration needs to be done per pipe
   * @DPU_MDP_10BIT_SUPPORT, Chipset supports 10 bit pixel formats
   * @DPU_MDP_BWC,   MDSS HW supports Bandwidth compression.
   * @DPU_MDP_UBWC_1_0,  This chipsets supports Universal Bandwidth


Re: [PATCH v2] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Dmitry Baryshkov
On Sun, 24 Jul 2022 at 14:13, Javier Martinez Canillas
 wrote:
>
> Drivers' .remove and .shutdown callbacks are executed on different code
> paths. The former is called when a device is removed from the bus, while
> the latter is called at system shutdown time to quiesce the device.
>
> This means that some overlap exists between the two, because both have to
> take care of properly shutting down the hardware. But currently the logic
> used in these two callbacks isn't consistent in msm drivers, which could
> lead to kernel oops.
>
> For example, on .remove the component is deleted and its .unbind callback
> leads to the hardware being shutdown but only if the DRM device has been
> marked as registered.
>
> That check doesn't exist in the .shutdown logic and this can lead to the
> driver calling drm_atomic_helper_shutdown() for a DRM device that hasn't
> been properly initialized.
>
> A situation like this can happen if drivers for expected sub-devices fail
> to probe, since the .bind callback will never be executed. If that is the
> case, drm_atomic_helper_shutdown() will attempt to take mutexes that are
> only initialized if drm_mode_config_init() is called during a device bind.
>
> This bug was attempted to be fixed in commit 623f279c7781 ("drm/msm: fix
> shutdown hook in case GPU components failed to bind"), but unfortunately
> it still happens in some cases as the one mentioned above, i.e:
>
> [  169.495897] systemd-shutdown[1]: Powering off.
> [  169.500466] kvm: exiting hardware virtualization
> [  169.554787] platform wifi-firmware.0: Removing from iommu group 12
> [  169.610238] platform video-firmware.0: Removing from iommu group 10
> [  169.682164] [ cut here ]
> [  169.686909] WARNING: CPU: 6 PID: 1 at 
> drivers/gpu/drm/drm_modeset_lock.c:317 drm_modeset_lock_all_ctx+0x3c4/0x3d0
> ...
> [  169.775691] Hardware name: Google CoachZ (rev3+) (DT)
> [  169.780874] pstate: a049 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  169.788021] pc : drm_modeset_lock_all_ctx+0x3c4/0x3d0
> [  169.793205] lr : drm_modeset_lock_all_ctx+0x48/0x3d0
> [  169.798299] sp : 8805bb80
> [  169.801701] x29: 8805bb80 x28: 327c00128000 x27: 
> 
> [  169.809025] x26:  x25: 0001 x24: 
> c95d820ec030
> [  169.816349] x23: 327c00bbd090 x22: c95d8215eca0 x21: 
> 327c039c5800
> [  169.823674] x20: 327c039c5988 x19: 8805bbe8 x18: 
> 0034
> [  169.830998] x17: 00040044 x16: c95d80cac920 x15: 
> 
> [  169.838322] x14: 0315 x13: 0315 x12: 
> 
> [  169.845646] x11:  x10:  x9 : 
> 
> [  169.852971] x8 : 8805bc28 x7 :  x6 : 
> 
> [  169.860295] x5 :  x4 :  x3 : 
> 
> [  169.867619] x2 : 327c00128000 x1 :  x0 : 
> 327c039c59b0
> [  169.874944] Call trace:
> [  169.877467]  drm_modeset_lock_all_ctx+0x3c4/0x3d0
> [  169.882297]  drm_atomic_helper_shutdown+0x70/0x134
> [  169.887217]  msm_drv_shutdown+0x30/0x40
> [  169.891159]  platform_shutdown+0x28/0x40
> [  169.895191]  device_shutdown+0x148/0x350
> [  169.899221]  kernel_power_off+0x38/0x80
> [  169.903163]  __do_sys_reboot+0x288/0x2c0
> [  169.907192]  __arm64_sys_reboot+0x28/0x34
> [  169.911309]  invoke_syscall+0x48/0x114
> [  169.915162]  el0_svc_common.constprop.0+0x44/0xec
> [  169.919992]  do_el0_svc+0x2c/0xc0
> [  169.923394]  el0_svc+0x2c/0x84
> [  169.926535]  el0t_64_sync_handler+0x11c/0x150
> [  169.931013]  el0t_64_sync+0x18c/0x190
> [  169.934777] ---[ end trace  ]---
> [  169.939557] Unable to handle kernel NULL pointer dereference at virtual 
> address 0018
> [  169.948574] Mem abort info:
> [  169.951452]   ESR = 0x9604
> [  169.955307]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  169.960765]   SET = 0, FnV = 0
> [  169.963901]   EA = 0, S1PTW = 0
> [  169.967127]   FSC = 0x04: level 0 translation fault
> [  169.972136] Data abort info:
> [  169.975093]   ISV = 0, ISS = 0x0004
> [  169.979037]   CM = 0, WnR = 0
> [  169.982083] user pgtable: 4k pages, 48-bit VAs, pgdp=00010eab1000
> [  169.988697] [0018] pgd=, p4d=
> [  169.995669] Internal error: Oops: 9604 [#1] PREEMPT SMP
> ...
> [  170.079614] Hardware name: Google CoachZ (rev3+) (DT)
> [  170.084801] pstate: a049 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  170.091941] pc : ww_mutex_lock+0x28/0x32c
> [  170.096064] lr : drm_modeset_lock_all_ctx+0x1b0/0x3d0
> [  170.101254] sp : 8805bb50
> [  170.104658] x29: 8805bb50 x28: 327c00128000 x27: 
> 
> [  170.111977] x26:  x25: 0001 x24: 
> 0018
> [  170.119296] x23: 8805bc10 x22: 327c039c5ad8 x21: 
> 

Re: [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config

2022-07-24 Thread Thomas Zimmermann

Hi Javier

Am 24.07.22 um 14:37 schrieb Javier Martinez Canillas:

DRM drivers initialize the mode configuration with drmm_mode_config_init()
and that function (among other things) initializes mutexes that are later
used by modeset helpers.

But the helpers should only attempt to grab those locks if the mode config
was properly initialized. Otherwise it can lead to kernel oops. An example
is when a DRM driver using the component framework does not initialize the
drm_mode_config, because its .bind callback was not being executed due one
of its expected sub-devices' driver failing to probe.

Some drivers check the struct drm_driver.registered field as an indication
on whether their .shutdown callback should call helpers to tearn down the
mode configuration or not, but most drivers just assume that it is always
safe to call helpers such as drm_atomic_helper_shutdown() during shutdown.

Let make the DRM core more robust and prevent this to happen, by marking a
struct drm_mode_config as initialized during drmm_mode_config_init(). that
way helpers can check for it and not attempt to grab uninitialized mutexes.


I disagree. This patch looks like cargo-cult programming and entirely 
arbitrary.  The solution here is to fix drivers.  The actual test to 
perform is to instrument the mutex implementation to detect 
uninitialized mutexes.


Best regards
Thomas



Signed-off-by: Javier Martinez Canillas 
---

  drivers/gpu/drm/drm_mode_config.c  | 4 
  drivers/gpu/drm/drm_modeset_lock.c | 6 ++
  include/drm/drm_mode_config.h  | 8 
  3 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/drm_mode_config.c 
b/drivers/gpu/drm/drm_mode_config.c
index 59b34f07cfce..db649f97120b 100644
--- a/drivers/gpu/drm/drm_mode_config.c
+++ b/drivers/gpu/drm/drm_mode_config.c
@@ -456,6 +456,8 @@ int drmm_mode_config_init(struct drm_device *dev)
dma_resv_fini();
}
  
+	dev->mode_config.initialized = true;

+
return drmm_add_action_or_reset(dev, drm_mode_config_init_release,
NULL);
  }
@@ -549,6 +551,8 @@ void drm_mode_config_cleanup(struct drm_device *dev)
idr_destroy(>mode_config.tile_idr);
idr_destroy(>mode_config.object_idr);
drm_modeset_lock_fini(>mode_config.connection_mutex);
+
+   dev->mode_config.initialized = false;
  }
  EXPORT_SYMBOL(drm_mode_config_cleanup);
  
diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c

index 918065982db4..d6a81cb88123 100644
--- a/drivers/gpu/drm/drm_modeset_lock.c
+++ b/drivers/gpu/drm/drm_modeset_lock.c
@@ -444,6 +444,9 @@ EXPORT_SYMBOL(drm_modeset_unlock);
   *
   * See also: DRM_MODESET_LOCK_ALL_BEGIN() and DRM_MODESET_LOCK_ALL_END()
   *
+ * This function must only be called after drmm_mode_config_init(), since it
+ * takes locks that are initialized as part of the initial mode configuration.
+ *
   * Returns: 0 on success or a negative error-code on failure.
   */
  int drm_modeset_lock_all_ctx(struct drm_device *dev,
@@ -454,6 +457,9 @@ int drm_modeset_lock_all_ctx(struct drm_device *dev,
struct drm_plane *plane;
int ret;
  
+	if (WARN_ON(!dev->mode_config.initialized))

+   return -EINVAL;
+
ret = drm_modeset_lock(>mode_config.connection_mutex, ctx);
if (ret)
return ret;
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 6b5e01295348..d2e1a6d7dcc2 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -954,6 +954,14 @@ struct drm_mode_config {
struct drm_atomic_state *suspend_state;
  
  	const struct drm_mode_config_helper_funcs *helper_private;

+
+   /**
+* @initialized:
+*
+* Internally used by modeset helpers such as drm_modeset_lock_all_ctx()
+* to determine if the mode configuration has been properly initialized.
+*/
+   bool initialized;
  };
  
  int __must_check drmm_mode_config_init(struct drm_device *dev);


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature


[Bug 216277] X11 doesn't wait for amdgpu driver to be up

2022-07-24 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216277

--- Comment #2 from dark_syl...@yahoo.com.ar ---
Created attachment 301480
  --> https://bugzilla.kernel.org/attachment.cgi?id=301480=edit
Xorg log when it succeeds

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 216277] X11 doesn't wait for amdgpu driver to be up

2022-07-24 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216277

--- Comment #1 from dark_syl...@yahoo.com.ar ---
Created attachment 301479
  --> https://bugzilla.kernel.org/attachment.cgi?id=301479=edit
Xorg log when it fails

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 216277] New: X11 doesn't wait for amdgpu driver to be up

2022-07-24 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216277

Bug ID: 216277
   Summary: X11 doesn't wait for amdgpu driver to be up
   Product: Drivers
   Version: 2.5
Kernel Version: 5.18.11+
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: dark_syl...@yahoo.com.ar
Regression: No

# Context

I'm using Xubuntu 20.04
I compiled Kernel 5.18.11+ myself (shows bug)
I compiled Kernel 5.13.7+ myself (does not show bug)
My GPU is AMD Radeon 6800 XT 16GB, I don't have an iGPU (CPU is Ryzen 5900X)

Mesa is:

OpenGL renderer string: AMD Radeon RX 6800 XT (sienna_cichlid, LLVM 14.0.1, DRM
3.46, 5.18.11+)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 22.0.5 - kisak-mesa
PPA
OpenGL core profile shading language version string: 4.60


# Steps to reproduce

1. Turn on the PC
2. On *some* occasions X11 will crash, taking down the keyboard; leaving the
computer in a seemingly frozen state while displaying tty with the last info
messages
3. As a workaround, I can login via ssh and type `sudo service lightdm restart`
and the X11 server will start and everything starts working perfectly fine

# Diagnostic

It seems X11 doesn't wait for amdgpu to be up. This can be seen by checking
/var/log/Xorg.0.log (attached):

[ 7.718] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[ 7.718] (II) FBDEV: driver for framebuffer: fbdev
[ 7.718] (II) VESA: driver for VESA chipsets: vesa
[ 7.718] (WW) xf86OpenConsole: setpgid failed: Operation not permitted
[ 7.718] (WW) xf86OpenConsole: setsid failed: Operation not permitted
[ 7.719] (EE) open /dev/dri/card0: No such file or directory
[ 7.719] (WW) Falling back to old probe method for modesetting
[ 7.719] (EE) open /dev/dri/card0: No such file or directory

Visually speaking, I *think* that X11 tries to init while tty is still in VESA
mode before/during switching to 1920x1080

AFAIK, systemd is responsible for waiting the GPU drivers are up. Does anybody
know where I should look? Does systemd need an update? Could this be a libDRM
issue? I currently have installed 2.4.110 in /usr/lib and libdrm 2.4.111
compiled from source in /usr/local/lib

I could try bisecting but unfortunately the reproducibility isn't "always"
which makes it hard to debug.

All of this has been working fine with Kernel 5.13.7+

Cheers

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config

2022-07-24 Thread Javier Martinez Canillas
DRM drivers initialize the mode configuration with drmm_mode_config_init()
and that function (among other things) initializes mutexes that are later
used by modeset helpers.

But the helpers should only attempt to grab those locks if the mode config
was properly initialized. Otherwise it can lead to kernel oops. An example
is when a DRM driver using the component framework does not initialize the
drm_mode_config, because its .bind callback was not being executed due one
of its expected sub-devices' driver failing to probe.

Some drivers check the struct drm_driver.registered field as an indication
on whether their .shutdown callback should call helpers to tearn down the
mode configuration or not, but most drivers just assume that it is always
safe to call helpers such as drm_atomic_helper_shutdown() during shutdown.

Let make the DRM core more robust and prevent this to happen, by marking a
struct drm_mode_config as initialized during drmm_mode_config_init(). that
way helpers can check for it and not attempt to grab uninitialized mutexes.

Signed-off-by: Javier Martinez Canillas 
---

 drivers/gpu/drm/drm_mode_config.c  | 4 
 drivers/gpu/drm/drm_modeset_lock.c | 6 ++
 include/drm/drm_mode_config.h  | 8 
 3 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/drm_mode_config.c 
b/drivers/gpu/drm/drm_mode_config.c
index 59b34f07cfce..db649f97120b 100644
--- a/drivers/gpu/drm/drm_mode_config.c
+++ b/drivers/gpu/drm/drm_mode_config.c
@@ -456,6 +456,8 @@ int drmm_mode_config_init(struct drm_device *dev)
dma_resv_fini();
}
 
+   dev->mode_config.initialized = true;
+
return drmm_add_action_or_reset(dev, drm_mode_config_init_release,
NULL);
 }
@@ -549,6 +551,8 @@ void drm_mode_config_cleanup(struct drm_device *dev)
idr_destroy(>mode_config.tile_idr);
idr_destroy(>mode_config.object_idr);
drm_modeset_lock_fini(>mode_config.connection_mutex);
+
+   dev->mode_config.initialized = false;
 }
 EXPORT_SYMBOL(drm_mode_config_cleanup);
 
diff --git a/drivers/gpu/drm/drm_modeset_lock.c 
b/drivers/gpu/drm/drm_modeset_lock.c
index 918065982db4..d6a81cb88123 100644
--- a/drivers/gpu/drm/drm_modeset_lock.c
+++ b/drivers/gpu/drm/drm_modeset_lock.c
@@ -444,6 +444,9 @@ EXPORT_SYMBOL(drm_modeset_unlock);
  *
  * See also: DRM_MODESET_LOCK_ALL_BEGIN() and DRM_MODESET_LOCK_ALL_END()
  *
+ * This function must only be called after drmm_mode_config_init(), since it
+ * takes locks that are initialized as part of the initial mode configuration.
+ *
  * Returns: 0 on success or a negative error-code on failure.
  */
 int drm_modeset_lock_all_ctx(struct drm_device *dev,
@@ -454,6 +457,9 @@ int drm_modeset_lock_all_ctx(struct drm_device *dev,
struct drm_plane *plane;
int ret;
 
+   if (WARN_ON(!dev->mode_config.initialized))
+   return -EINVAL;
+
ret = drm_modeset_lock(>mode_config.connection_mutex, ctx);
if (ret)
return ret;
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 6b5e01295348..d2e1a6d7dcc2 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -954,6 +954,14 @@ struct drm_mode_config {
struct drm_atomic_state *suspend_state;
 
const struct drm_mode_config_helper_funcs *helper_private;
+
+   /**
+* @initialized:
+*
+* Internally used by modeset helpers such as drm_modeset_lock_all_ctx()
+* to determine if the mode configuration has been properly initialized.
+*/
+   bool initialized;
 };
 
 int __must_check drmm_mode_config_init(struct drm_device *dev);
-- 
2.37.1



[PATCH v2] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Javier Martinez Canillas
Drivers' .remove and .shutdown callbacks are executed on different code
paths. The former is called when a device is removed from the bus, while
the latter is called at system shutdown time to quiesce the device.

This means that some overlap exists between the two, because both have to
take care of properly shutting down the hardware. But currently the logic
used in these two callbacks isn't consistent in msm drivers, which could
lead to kernel oops.

For example, on .remove the component is deleted and its .unbind callback
leads to the hardware being shutdown but only if the DRM device has been
marked as registered.

That check doesn't exist in the .shutdown logic and this can lead to the
driver calling drm_atomic_helper_shutdown() for a DRM device that hasn't
been properly initialized.

A situation like this can happen if drivers for expected sub-devices fail
to probe, since the .bind callback will never be executed. If that is the
case, drm_atomic_helper_shutdown() will attempt to take mutexes that are
only initialized if drm_mode_config_init() is called during a device bind.

This bug was attempted to be fixed in commit 623f279c7781 ("drm/msm: fix
shutdown hook in case GPU components failed to bind"), but unfortunately
it still happens in some cases as the one mentioned above, i.e:

[  169.495897] systemd-shutdown[1]: Powering off.
[  169.500466] kvm: exiting hardware virtualization
[  169.554787] platform wifi-firmware.0: Removing from iommu group 12
[  169.610238] platform video-firmware.0: Removing from iommu group 10
[  169.682164] [ cut here ]
[  169.686909] WARNING: CPU: 6 PID: 1 at drivers/gpu/drm/drm_modeset_lock.c:317 
drm_modeset_lock_all_ctx+0x3c4/0x3d0
...
[  169.775691] Hardware name: Google CoachZ (rev3+) (DT)
[  169.780874] pstate: a049 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  169.788021] pc : drm_modeset_lock_all_ctx+0x3c4/0x3d0
[  169.793205] lr : drm_modeset_lock_all_ctx+0x48/0x3d0
[  169.798299] sp : 8805bb80
[  169.801701] x29: 8805bb80 x28: 327c00128000 x27: 
[  169.809025] x26:  x25: 0001 x24: c95d820ec030
[  169.816349] x23: 327c00bbd090 x22: c95d8215eca0 x21: 327c039c5800
[  169.823674] x20: 327c039c5988 x19: 8805bbe8 x18: 0034
[  169.830998] x17: 00040044 x16: c95d80cac920 x15: 
[  169.838322] x14: 0315 x13: 0315 x12: 
[  169.845646] x11:  x10:  x9 : 
[  169.852971] x8 : 8805bc28 x7 :  x6 : 
[  169.860295] x5 :  x4 :  x3 : 
[  169.867619] x2 : 327c00128000 x1 :  x0 : 327c039c59b0
[  169.874944] Call trace:
[  169.877467]  drm_modeset_lock_all_ctx+0x3c4/0x3d0
[  169.882297]  drm_atomic_helper_shutdown+0x70/0x134
[  169.887217]  msm_drv_shutdown+0x30/0x40
[  169.891159]  platform_shutdown+0x28/0x40
[  169.895191]  device_shutdown+0x148/0x350
[  169.899221]  kernel_power_off+0x38/0x80
[  169.903163]  __do_sys_reboot+0x288/0x2c0
[  169.907192]  __arm64_sys_reboot+0x28/0x34
[  169.911309]  invoke_syscall+0x48/0x114
[  169.915162]  el0_svc_common.constprop.0+0x44/0xec
[  169.919992]  do_el0_svc+0x2c/0xc0
[  169.923394]  el0_svc+0x2c/0x84
[  169.926535]  el0t_64_sync_handler+0x11c/0x150
[  169.931013]  el0t_64_sync+0x18c/0x190
[  169.934777] ---[ end trace  ]---
[  169.939557] Unable to handle kernel NULL pointer dereference at virtual 
address 0018
[  169.948574] Mem abort info:
[  169.951452]   ESR = 0x9604
[  169.955307]   EC = 0x25: DABT (current EL), IL = 32 bits
[  169.960765]   SET = 0, FnV = 0
[  169.963901]   EA = 0, S1PTW = 0
[  169.967127]   FSC = 0x04: level 0 translation fault
[  169.972136] Data abort info:
[  169.975093]   ISV = 0, ISS = 0x0004
[  169.979037]   CM = 0, WnR = 0
[  169.982083] user pgtable: 4k pages, 48-bit VAs, pgdp=00010eab1000
[  169.988697] [0018] pgd=, p4d=
[  169.995669] Internal error: Oops: 9604 [#1] PREEMPT SMP
...
[  170.079614] Hardware name: Google CoachZ (rev3+) (DT)
[  170.084801] pstate: a049 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  170.091941] pc : ww_mutex_lock+0x28/0x32c
[  170.096064] lr : drm_modeset_lock_all_ctx+0x1b0/0x3d0
[  170.101254] sp : 8805bb50
[  170.104658] x29: 8805bb50 x28: 327c00128000 x27: 
[  170.111977] x26:  x25: 0001 x24: 0018
[  170.119296] x23: 8805bc10 x22: 327c039c5ad8 x21: 327c039c5800
[  170.126615] x20: 8805bbe8 x19: 0018 x18: 0034
[  170.133933] x17: 00040044 x16: c95d80cac920 x15: 
[  170.141252] x14: 0315 x13: 0315 x12: 
[  170.148571] x11: 

Re: [PATCH 1/4] video: fb: imxfb: Drop platform data support

2022-07-24 Thread Sam Ravnborg
Hi Uwe,

On Sun, Jul 24, 2022 at 11:40:22AM +0200, Sam Ravnborg wrote:
> Hi Uwe,
> 
> On Sun, Jul 24, 2022 at 12:02:18AM +0200, Uwe Kleine-König wrote:
> > Hi Sam,
> > 
> > On Sat, Jul 23, 2022 at 09:23:43PM +0200, Sam Ravnborg wrote:
> > > On Sat, Jul 23, 2022 at 07:57:17PM +0200, Uwe Kleine-König wrote:
> > > > No source file but the driver itself includes the header containing the
> > > > platform data definition. The last user is gone since commit
> > > > 8485adf17a15 ("ARM: imx: Remove imx device directory").
> > > > 
> > > > So we can safely drop platform data support.
> > > > 
> > > > Signed-off-by: Uwe Kleine-König 
> > > 
> > > Do imxfb offer something that is not supported by the DRM drivers?
> > > If yes then the clean-up is good, if not then we could drop the driver?
> > 
> > It's for different i.MX variants. imxfb is for i.MX2x while the DRM
> > drivers in mainline are for i.MX6. (Not sure about the i.MX3 and i.MX5x
> > variants.)
> > 
> > Somewhere in the middle of my todo list is to mainline an i.MX2x DRM
> > driver that could replace the imxfb driver. If I only had a bit more
> > time ...
> 
> There is drm/mxsfb, where Kconfig says:
> "including i.MX23, i.MX28, i.MX6SX, i.MX7 and i.MX8M".
> 
> So there is already something but if this is a 1:1 replacement I dunno.

I suddenly remembered we had the following commit:

f225f1393f034e17281274180626086276da766c ("video: fbdev: mxsfb: Remove driver")

So the fbdev counterpart of drm/mxsfb is already dropped.

Sam


Re: [PATCH 1/4] video: fb: imxfb: Drop platform data support

2022-07-24 Thread Sam Ravnborg
Hi Uwe,

On Sun, Jul 24, 2022 at 12:02:18AM +0200, Uwe Kleine-König wrote:
> Hi Sam,
> 
> On Sat, Jul 23, 2022 at 09:23:43PM +0200, Sam Ravnborg wrote:
> > On Sat, Jul 23, 2022 at 07:57:17PM +0200, Uwe Kleine-König wrote:
> > > No source file but the driver itself includes the header containing the
> > > platform data definition. The last user is gone since commit
> > > 8485adf17a15 ("ARM: imx: Remove imx device directory").
> > > 
> > > So we can safely drop platform data support.
> > > 
> > > Signed-off-by: Uwe Kleine-König 
> > 
> > Do imxfb offer something that is not supported by the DRM drivers?
> > If yes then the clean-up is good, if not then we could drop the driver?
> 
> It's for different i.MX variants. imxfb is for i.MX2x while the DRM
> drivers in mainline are for i.MX6. (Not sure about the i.MX3 and i.MX5x
> variants.)
> 
> Somewhere in the middle of my todo list is to mainline an i.MX2x DRM
> driver that could replace the imxfb driver. If I only had a bit more
> time ...

There is drm/mxsfb, where Kconfig says:
"including i.MX23, i.MX28, i.MX6SX, i.MX7 and i.MX8M".

So there is already something but if this is a 1:1 replacement I dunno.

Sam


Re: [PATCH] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Javier Martinez Canillas
On 7/24/22 11:06, Javier Martinez Canillas wrote:

[...]

> 
> I guess one option is to do the if (dev->registered) check in the callers but
> then it won't really be worth it to have a helper and we could just add that
> check in msm_drv_shutdown() to conditionally call 
> drm_atomic_helper_shutdown().
> 

By the way, the motivation of this patch is to fix a kernel oops during 
poweroff:

[  169.495897] systemd-shutdown[1]: Powering off.   

  
[  169.500466] kvm: exiting hardware virtualization 

  
[  169.554787] platform wifi-firmware.0: Removing from iommu group 12   

  
[  169.610238] platform video-firmware.0: Removing from iommu group 10  

  
[  169.682164] [ cut here ]
[  169.686909] WARNING: CPU: 6 PID: 1 at drivers/gpu/drm/drm_modeset_lock.c:317 
drm_modeset_lock_all_ctx+0x3c4/0x3d0   
[  169.697450] Modules linked in: af_alg rtl8192cu rtl_usb cros_ec_sensors 
rtl8192c_common cros_ec_sensors_core rtlwifi industrialio_triggered_buffer 
venus_enc venus_dec sbs_battery kfifo_buf cros_ec_typec videobuf2_dma_contig 
hid_multito
uch cros_usbpd_logger typec cros_ec_chardev cros_usbpd_charger videobuf2_memops 
ath10k_snoc ath10k_core hci_uart btqca venus_core btbcm ath mac80211 bluetooth 
v4l2_mem2mem videobuf2_v4l2 libarc4 videobuf2_common qcom_spmi_adc_tm5 qrtr cfg
80211 videodev qcom_spmi_adc5 qcom_q6v5_mss ecdh_generic qcom_pil_info ecc 
qcom_vadc_common mc crct10dif_ce qcom_spmi_temp_alarm rfkill qcom_q6v5 
spi_geni_qcom qcom_sysmon qcom_common qmi_helpers snd_soc_max98357a socinfo 
ip6_tables ip_ta
bles x_tables ipmi_devintf ipmi_msghandler fuse zram zsmalloc ipv6  
   
[  169.767126] CPU: 6 PID: 1 Comm: systemd-shutdow Tainted: GW 
5.19.0-rc7+ #20 
[  169.775691] Hardware name: Google CoachZ (rev3+) (DT)
[  169.780874] pstate: a049 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)  
   
[  169.788021] pc : drm_modeset_lock_all_ctx+0x3c4/0x3d0
[  169.793205] lr : drm_modeset_lock_all_ctx+0x48/0x3d0
[  169.798299] sp : 8805bb80
[  169.801701] x29: 8805bb80 x28: 327c00128000 x27: 
   
[  169.809025] x26:  x25: 0001 x24: 
c95d820ec030   
[  169.816349] x23: 327c00bbd090 x22: c95d8215eca0 x21: 
327c039c5800   
[  169.823674] x20: 327c039c5988 x19: 8805bbe8 x18: 
0034   
[  169.830998] x17: 00040044 x16: c95d80cac920 x15: 
   
[  169.838322] x14: 0315 x13: 0315 x12: 
   
[  169.845646] x11:  x10:  x9 : 
   
[  169.852971] x8 : 8805bc28 x7 :  x6 : 
   
[  169.860295] x5 :  x4 :  x3 : 
   
[  169.867619] x2 : 327c00128000 x1 :  x0 : 
327c039c59b0   
[  169.874944] Call trace: 
[  169.877467]  drm_modeset_lock_all_ctx+0x3c4/0x3d0
[  169.882297]  drm_atomic_helper_shutdown+0x70/0x134
[  169.887217]  msm_drv_shutdown+0x30/0x40
[  169.891159]  platform_shutdown+0x28/0x40
[  169.895191]  device_shutdown+0x148/0x350
[  169.899221]  kernel_power_off+0x38/0x80
[  169.903163]  __do_sys_reboot+0x288/0x2c0
[  169.907192]  __arm64_sys_reboot+0x28/0x34
[  169.911309]  invoke_syscall+0x48/0x114
[  169.915162]  el0_svc_common.constprop.0+0x44/0xec
[  169.919992]  do_el0_svc+0x2c/0xc0
[  169.923394]  el0_svc+0x2c/0x84
[  169.926535]  el0t_64_sync_handler+0x11c/0x150
[  169.931013]  el0t_64_sync+0x18c/0x190
[  169.934777] ---[ end trace  ]---
[  169.939557] Unable to handle kernel NULL pointer dereference at virtual 
address 0018
[  169.948574] Mem abort info:  
   
[  

Re: [PATCH] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Javier Martinez Canillas
On 7/24/22 10:53, Dmitry Baryshkov wrote:
> On Sun, 24 Jul 2022 at 00:09, Javier Martinez Canillas

[...]

>> -
>> /*
>>  * Shutdown the hw if we're far enough along where things might be 
>> on.
>>  * If we run this too early, we'll end up panicking in any variety of
>> @@ -205,10 +199,21 @@ static int msm_drm_uninit(struct device *dev)
>>  * msm_drm_init, drm_dev->registered is used as an indicator that the
>>  * shutdown will be successful.
>>  */
>> -   if (ddev->registered) {
>> +   if (dev->registered)
>> +   drm_atomic_helper_shutdown(dev);
>> +}
>> +
>> +static int msm_drm_uninit(struct device *dev)
>> +{
>> +   struct platform_device *pdev = to_platform_device(dev);
>> +   struct msm_drm_private *priv = platform_get_drvdata(pdev);
>> +   struct drm_device *ddev = priv->dev;
>> +   struct msm_kms *kms = priv->kms;
>> +   int i;
>> +
>> +   if (ddev->registered)
>> drm_dev_unregister(ddev);
> 
> No. The drm_dev_unregister() should come before drm_atomic_helper_shutdown().
>

I'm not sure to understand what you meant here, since drm_dev_unregister() is
called before drm_atomic_helper_shutdown() that's called in msm_shutdown_hw().
 
> Also drm_dev_unregister() should not be a part of .shutdown callback.
> See the documentation in the drm_drv.c
>

It is not right now, msm_shutdown_hw() only calls drm_atomic_helper_shutdown()
but drm_dev_unregister() is still called from the msm_drm_uninit() function.
 
Now, your comment made me realize that there's a bug in this patch since after
the drm_dev_unregister(), dev->registered will be set to false and so in the
.remove -> .unbind path drm_atomic_helper_shutdown() will never be executed.

I guess one option is to do the if (dev->registered) check in the callers but
then it won't really be worth it to have a helper and we could just add that
check in msm_drv_shutdown() to conditionally call drm_atomic_helper_shutdown().

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH] drm/msm: Make .remove and .shutdown HW shutdown consistent

2022-07-24 Thread Dmitry Baryshkov
On Sun, 24 Jul 2022 at 00:09, Javier Martinez Canillas
 wrote:
>
> Drivers' .remove and .shutdown callbacks are executed on different code
> paths. The former is called when a device is removed from the bus, while
> the latter is called at system shutdown time to quiesce the device.
>
> This means that some overlap exists between the two, because both have to
> take care of properly shutting down the hardware. But currently the logic
> used in these two callbacks isn't consistent in msm drivers, which could
> lead to kernel oops.
>
> For example, on .remove the component is deleted and its .unbind callback
> leads to the hardware being shutdown but only if the DRM device has been
> marked as registered.
>
> That check doesn't exist in the .shutdown logic and this can lead to the
> driver calling drm_atomic_helper_shutdown() for a DRM device that hasn't
> been properly initialized.
>
> A situation when this can happen is when a driver for an expected device
> fails to probe, since the .bind callback will never be executed.
>
> This bug was attempted to be fixed in commit 623f279c7781 ("drm/msm: fix
> shutdown hook in case GPU components failed to bind"), but unfortunately
> it still happens in some cases.
>
> Rather than trying to keep fixing in both places, let's unify in a single
> helper function that could be used for the two callbacks.
>
> Signed-off-by: Javier Martinez Canillas 
> ---
>
>  drivers/gpu/drm/msm/msm_drv.c | 31 +--
>  1 file changed, 17 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 1ed4cd09dbf8..669891bd6f09 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -190,14 +190,8 @@ static int vblank_ctrl_queue_work(struct msm_drm_private 
> *priv,
> return 0;
>  }
>
> -static int msm_drm_uninit(struct device *dev)
> +static void msm_shutdown_hw(struct drm_device *dev)
>  {
> -   struct platform_device *pdev = to_platform_device(dev);
> -   struct msm_drm_private *priv = platform_get_drvdata(pdev);
> -   struct drm_device *ddev = priv->dev;
> -   struct msm_kms *kms = priv->kms;
> -   int i;
> -
> /*
>  * Shutdown the hw if we're far enough along where things might be on.
>  * If we run this too early, we'll end up panicking in any variety of
> @@ -205,10 +199,21 @@ static int msm_drm_uninit(struct device *dev)
>  * msm_drm_init, drm_dev->registered is used as an indicator that the
>  * shutdown will be successful.
>  */
> -   if (ddev->registered) {
> +   if (dev->registered)
> +   drm_atomic_helper_shutdown(dev);
> +}
> +
> +static int msm_drm_uninit(struct device *dev)
> +{
> +   struct platform_device *pdev = to_platform_device(dev);
> +   struct msm_drm_private *priv = platform_get_drvdata(pdev);
> +   struct drm_device *ddev = priv->dev;
> +   struct msm_kms *kms = priv->kms;
> +   int i;
> +
> +   if (ddev->registered)
> drm_dev_unregister(ddev);

No. The drm_dev_unregister() should come before drm_atomic_helper_shutdown().

Also drm_dev_unregister() should not be a part of .shutdown callback.
See the documentation in the drm_drv.c

> -   drm_atomic_helper_shutdown(ddev);
> -   }
> +   msm_shutdown_hw(ddev);
>
> /* We must cancel and cleanup any pending vblank enable/disable
>  * work before msm_irq_uninstall() to avoid work re-enabling an
> @@ -1242,10 +1247,8 @@ void msm_drv_shutdown(struct platform_device *pdev)
> struct msm_drm_private *priv = platform_get_drvdata(pdev);
> struct drm_device *drm = priv ? priv->dev : NULL;
>
> -   if (!priv || !priv->kms)
> -   return;
> -
> -   drm_atomic_helper_shutdown(drm);
> +   if (drm)
> +   msm_shutdown_hw(drm);
>  }
>
>  static struct platform_driver msm_platform_driver = {
> --
> 2.37.1
>


-- 
With best wishes
Dmitry