Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

2024-02-22 Thread Dmitry Baryshkov
On Sat, 17 Feb 2024 at 17:03, Johan Hovold  wrote:
>
> A recent DRM series purporting to simplify support for "transparent
> bridges" and handling of probe deferrals ironically exposed a
> use-after-free issue on pmic_glink_altmode probe deferral.
>
> This has manifested itself as the display subsystem occasionally failing
> to initialise and NULL-pointer dereferences during boot of machines like
> the Lenovo ThinkPad X13s.
>
> Specifically, the dp-hpd bridge is currently registered before all
> resources have been acquired which means that it can also be
> deregistered on probe deferrals.
>
> In the meantime there is a race window where the new aux bridge driver
> (or PHY driver previously) may have looked up the dp-hpd bridge and
> stored a (non-reference-counted) pointer to the bridge which is about to
> be deallocated.
>
> When the display controller is later initialised, this triggers a
> use-after-free when attaching the bridges:
>
> dp -> aux -> dp-hpd (freed)
>
> which may, for example, result in the freed bridge failing to attach:
>
> [drm:drm_bridge_attach [drm]] *ERROR* failed to attach bridge /soc@0/phy@88eb000 to encoder TMDS-31: -16
>
> or a NULL-pointer dereference:
>
> Unable to handle kernel NULL pointer dereference at virtual address 
> 
> ...
> Call trace:
>   drm_bridge_attach+0x70/0x1a8 [drm]
>   drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
>   drm_bridge_attach+0x80/0x1a8 [drm]
>   dp_bridge_init+0xa8/0x15c [msm]
>   msm_dp_modeset_init+0x28/0xc4 [msm]
>
> The DRM bridge implementation is clearly fragile and implicitly built on
> the assumption that bridges may never go away. In this case, the fix is
> to move the bridge registration in the pmic_glink_altmode driver to
> after all resources have been looked up.
>
> Incidentally, with the new dp-hpd bridge implementation, which registers
> child devices, this is also a requirement due to a long-standing issue
> in driver core that can otherwise lead to a probe deferral loop (see
> fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")).
>
> Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support")
> Fixes: 2bcca96abfbf ("soc: qcom: pmic-glink: switch to DRM_AUX_HPD_BRIDGE")
> Cc: sta...@vger.kernel.org  # 6.3
> Cc: Bjorn Andersson 
> Cc: Dmitry Baryshkov 
> Signed-off-by: Johan Hovold 
> ---
>  drivers/soc/qcom/pmic_glink_altmode.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>

Reviewed-by: Dmitry Baryshkov 


-- 
With best wishes
Dmitry


Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

2024-02-21 Thread Bjorn Andersson
On Sat, Feb 17, 2024 at 04:02:25PM +0100, Johan Hovold wrote:
> A recent DRM series purporting to simplify support for "transparent
> bridges" and handling of probe deferrals ironically exposed a
> use-after-free issue on pmic_glink_altmode probe deferral.
> 
> This has manifested itself as the display subsystem occasionally failing
> to initialise and NULL-pointer dereferences during boot of machines like
> the Lenovo ThinkPad X13s.
> 
> Specifically, the dp-hpd bridge is currently registered before all
> resources have been acquired which means that it can also be
> deregistered on probe deferrals.
> 
> In the meantime there is a race window where the new aux bridge driver
> (or PHY driver previously) may have looked up the dp-hpd bridge and
> stored a (non-reference-counted) pointer to the bridge which is about to
> be deallocated.
> 
> When the display controller is later initialised, this triggers a
> use-after-free when attaching the bridges:
> 
>   dp -> aux -> dp-hpd (freed)
> 
> which may, for example, result in the freed bridge failing to attach:
> 
>   [drm:drm_bridge_attach [drm]] *ERROR* failed to attach bridge /soc@0/phy@88eb000 to encoder TMDS-31: -16
> 
> or a NULL-pointer dereference:
> 
>   Unable to handle kernel NULL pointer dereference at virtual address 
> 
>   ...
>   Call trace:
> drm_bridge_attach+0x70/0x1a8 [drm]
> drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
> drm_bridge_attach+0x80/0x1a8 [drm]
> dp_bridge_init+0xa8/0x15c [msm]
> msm_dp_modeset_init+0x28/0xc4 [msm]
> 
> The DRM bridge implementation is clearly fragile and implicitly built on
> the assumption that bridges may never go away. In this case, the fix is
> to move the bridge registration in the pmic_glink_altmode driver to
> after all resources have been looked up.
> 
> Incidentally, with the new dp-hpd bridge implementation, which registers
> child devices, this is also a requirement due to a long-standing issue
> in driver core that can otherwise lead to a probe deferral loop (see
> fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")).
> 
> Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support")
> Fixes: 2bcca96abfbf ("soc: qcom: pmic-glink: switch to DRM_AUX_HPD_BRIDGE")
> Cc: sta...@vger.kernel.org  # 6.3
> Cc: Bjorn Andersson 
> Cc: Dmitry Baryshkov 
> Signed-off-by: Johan Hovold 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/soc/qcom/pmic_glink_altmode.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/soc/qcom/pmic_glink_altmode.c b/drivers/soc/qcom/pmic_glink_altmode.c
> index 5fcd0fdd2faa..b3808fc24c69 100644
> --- a/drivers/soc/qcom/pmic_glink_altmode.c
> +++ b/drivers/soc/qcom/pmic_glink_altmode.c
> @@ -76,7 +76,7 @@ struct pmic_glink_altmode_port {
>  
>  	struct work_struct work;
>  
> -	struct device *bridge;
> +	struct auxiliary_device *bridge;
>  
>  	enum typec_orientation orientation;
>  	u16 svid;
> @@ -230,7 +230,7 @@ static void pmic_glink_altmode_worker(struct work_struct *work)
>  	else
>  		pmic_glink_altmode_enable_usb(altmode, alt_port);
>  
> -	drm_aux_hpd_bridge_notify(alt_port->bridge,
> +	drm_aux_hpd_bridge_notify(&alt_port->bridge->dev,
>  				  alt_port->hpd_state ?
>  				  connector_status_connected :
>  				  connector_status_disconnected);
> @@ -454,7 +454,7 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
>  		alt_port->index = port;
>  		INIT_WORK(&alt_port->work, pmic_glink_altmode_worker);
>  
> -		alt_port->bridge = drm_dp_hpd_bridge_register(dev, to_of_node(fwnode));
> +		alt_port->bridge = devm_drm_dp_hpd_bridge_alloc(dev, to_of_node(fwnode));
>  		if (IS_ERR(alt_port->bridge)) {
>  			fwnode_handle_put(fwnode);
>  			return PTR_ERR(alt_port->bridge);
> @@ -510,6 +510,16 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
>  		}
>  	}
>  
> +	for (port = 0; port < ARRAY_SIZE(altmode->ports); port++) {
> +		alt_port = &altmode->ports[port];
> +		if (!alt_port->bridge)
> +			continue;
> +
> +		ret = devm_drm_dp_hpd_bridge_add(dev, alt_port->bridge);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	altmode->client = devm_pmic_glink_register_client(dev,
>  							  altmode->owner_id,
>  							  pmic_glink_altmode_callback,
> -- 
> 2.43.0
> 


Re: [3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

2024-02-20 Thread Markus Elfring
>> The function call “fwnode_handle_put(fwnode)” is used in multiple if branches.
>> https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/soc/qcom/pmic_glink_altmode.c#L435
>>
>> I suggest to add a jump target so that a bit of exception handling
>> can be better reused at the end of this function implementation.
>
> Markus, as people have told you repeatedly, just stop with these comments.

How does such a response fit with the advice from other well-known information sources?

Section “7) Centralized exiting of functions”
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.8-rc5#n526


> You're not helping, in fact, you are actively harmful to the
> kernel community as you are wasting people's time.

The proposed source code transformation can eventually be achieved
(automatically) also with the help of improved development tools.

Regards,
Markus


Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

2024-02-20 Thread Johan Hovold
On Tue, Feb 20, 2024 at 11:55:57AM +0100, Markus Elfring wrote:
> …
> > Specifically, the dp-hpd bridge is currently registered before all
> > resources have been acquired which means that it can also be
> > deregistered on probe deferrals.
> >
> > In the meantime there is a race window where the new aux bridge driver
> > (or PHY driver previously) may have looked up the dp-hpd bridge and
> > stored a (non-reference-counted) pointer to the bridge which is about to
> > be deallocated.
> …
> > +++ b/drivers/soc/qcom/pmic_glink_altmode.c
> …
> > @@ -454,7 +454,7 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
> >  		alt_port->index = port;
> >  		INIT_WORK(&alt_port->work, pmic_glink_altmode_worker);
> >
> > -		alt_port->bridge = drm_dp_hpd_bridge_register(dev, to_of_node(fwnode));
> > +		alt_port->bridge = devm_drm_dp_hpd_bridge_alloc(dev, to_of_node(fwnode));
> >  		if (IS_ERR(alt_port->bridge)) {
> >  			fwnode_handle_put(fwnode);
> >  			return PTR_ERR(alt_port->bridge);
> …
> 
> The function call “fwnode_handle_put(fwnode)” is used in multiple if branches.
> https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/soc/qcom/pmic_glink_altmode.c#L435
> 
> I suggest to add a jump target so that a bit of exception handling
> can be better reused at the end of this function implementation.

Markus, as people have told you repeatedly, just stop with these
comments. You're not helping, in fact, you are actively harmful to the
kernel community as you are wasting people's time.

Johan


Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

2024-02-20 Thread Markus Elfring
…
> Specifically, the dp-hpd bridge is currently registered before all
> resources have been acquired which means that it can also be
> deregistered on probe deferrals.
>
> In the meantime there is a race window where the new aux bridge driver
> (or PHY driver previously) may have looked up the dp-hpd bridge and
> stored a (non-reference-counted) pointer to the bridge which is about to
> be deallocated.
…
> +++ b/drivers/soc/qcom/pmic_glink_altmode.c
…
> @@ -454,7 +454,7 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
>  		alt_port->index = port;
>  		INIT_WORK(&alt_port->work, pmic_glink_altmode_worker);
>
> -		alt_port->bridge = drm_dp_hpd_bridge_register(dev, to_of_node(fwnode));
> +		alt_port->bridge = devm_drm_dp_hpd_bridge_alloc(dev, to_of_node(fwnode));
>  		if (IS_ERR(alt_port->bridge)) {
>  			fwnode_handle_put(fwnode);
>  			return PTR_ERR(alt_port->bridge);
…

The function call “fwnode_handle_put(fwnode)” is used in multiple if branches.
https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/soc/qcom/pmic_glink_altmode.c#L435

I suggest to add a jump target so that a bit of exception handling
can be better reused at the end of this function implementation.

Regards,
Markus
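
For readers unfamiliar with the "jump target" idea mentioned above, the
following is a minimal userspace sketch of centralized error exiting. It is
not code from pmic_glink_altmode.c: struct fake_node, fake_node_put() and
fake_setup_port() are hypothetical stand-ins, and the reference counting is
only an assumption chosen to make the cleanup visible.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a reference-counted firmware node. */
struct fake_node {
	int refcount;
};

static void fake_node_put(struct fake_node *node)
{
	node->refcount--;
}

/*
 * Takes a reference on @node and performs two setup steps. @fail_step
 * selects which step fails (0 means none). Every error path jumps to
 * one label that drops the reference, instead of repeating the put
 * call in each branch.
 */
static int fake_setup_port(struct fake_node *node, int fail_step)
{
	int ret;

	node->refcount++;

	if (fail_step == 1) {
		ret = -EINVAL;
		goto err_put_node;
	}

	if (fail_step == 2) {
		ret = -ENOMEM;
		goto err_put_node;
	}

	/* Success: the caller now owns the reference. */
	return 0;

err_put_node:
	fake_node_put(node);
	return ret;
}
```

Calling fake_setup_port() with fail_step set to 1 or 2 returns the error and
leaves the reference count unchanged; with fail_step 0 the reference is kept.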


Re: [PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

2024-02-20 Thread Markus Elfring
…
> Specifically, the dp-hpd bridge is currently registered before all
> resources have been acquired which means that it can also be
> deregistered on probe deferrals.
>
> In the meantime there is a race window where the new aux bridge driver
> (or PHY driver previously) may have looked up the dp-hpd bridge and
> stored a (non-reference-counted) pointer to the bridge which is about to
> be deallocated.
…

I got the impression that the change description could be improved a bit further.

1. Would additional imperative wording be helpful?
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.8-rc5#n94


…
> +++ b/drivers/soc/qcom/pmic_glink_altmode.c
> @@ -76,7 +76,7 @@ struct pmic_glink_altmode_port {
>
>   struct work_struct work;
>
> - struct device *bridge;
> + struct auxiliary_device *bridge;
>
>   enum typec_orientation orientation;
>   u16 svid;
…

2. What do you think about emphasising this data type adjustment?

Regards,
Markus


[PATCH 3/6] soc: qcom: pmic_glink_altmode: fix drm bridge use-after-free

2024-02-17 Thread Johan Hovold
A recent DRM series purporting to simplify support for "transparent
bridges" and handling of probe deferrals ironically exposed a
use-after-free issue on pmic_glink_altmode probe deferral.

This has manifested itself as the display subsystem occasionally failing
to initialise and NULL-pointer dereferences during boot of machines like
the Lenovo ThinkPad X13s.

Specifically, the dp-hpd bridge is currently registered before all
resources have been acquired which means that it can also be
deregistered on probe deferrals.

In the meantime there is a race window where the new aux bridge driver
(or PHY driver previously) may have looked up the dp-hpd bridge and
stored a (non-reference-counted) pointer to the bridge which is about to
be deallocated.

When the display controller is later initialised, this triggers a
use-after-free when attaching the bridges:

dp -> aux -> dp-hpd (freed)

which may, for example, result in the freed bridge failing to attach:

[drm:drm_bridge_attach [drm]] *ERROR* failed to attach bridge /soc@0/phy@88eb000 to encoder TMDS-31: -16

or a NULL-pointer dereference:

Unable to handle kernel NULL pointer dereference at virtual address 

...
Call trace:
  drm_bridge_attach+0x70/0x1a8 [drm]
  drm_aux_bridge_attach+0x24/0x38 [aux_bridge]
  drm_bridge_attach+0x80/0x1a8 [drm]
  dp_bridge_init+0xa8/0x15c [msm]
  msm_dp_modeset_init+0x28/0xc4 [msm]

The DRM bridge implementation is clearly fragile and implicitly built on
the assumption that bridges may never go away. In this case, the fix is
to move the bridge registration in the pmic_glink_altmode driver to
after all resources have been looked up.

Incidentally, with the new dp-hpd bridge implementation, which registers
child devices, this is also a requirement due to a long-standing issue
in driver core that can otherwise lead to a probe deferral loop (see
fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")).

Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support")
Fixes: 2bcca96abfbf ("soc: qcom: pmic-glink: switch to DRM_AUX_HPD_BRIDGE")
Cc: sta...@vger.kernel.org  # 6.3
Cc: Bjorn Andersson 
Cc: Dmitry Baryshkov 
Signed-off-by: Johan Hovold 
---
 drivers/soc/qcom/pmic_glink_altmode.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/soc/qcom/pmic_glink_altmode.c b/drivers/soc/qcom/pmic_glink_altmode.c
index 5fcd0fdd2faa..b3808fc24c69 100644
--- a/drivers/soc/qcom/pmic_glink_altmode.c
+++ b/drivers/soc/qcom/pmic_glink_altmode.c
@@ -76,7 +76,7 @@ struct pmic_glink_altmode_port {
 
 	struct work_struct work;
 
-	struct device *bridge;
+	struct auxiliary_device *bridge;
 
 	enum typec_orientation orientation;
 	u16 svid;
@@ -230,7 +230,7 @@ static void pmic_glink_altmode_worker(struct work_struct *work)
 	else
 		pmic_glink_altmode_enable_usb(altmode, alt_port);
 
-	drm_aux_hpd_bridge_notify(alt_port->bridge,
+	drm_aux_hpd_bridge_notify(&alt_port->bridge->dev,
 				  alt_port->hpd_state ?
 				  connector_status_connected :
 				  connector_status_disconnected);
@@ -454,7 +454,7 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
 		alt_port->index = port;
 		INIT_WORK(&alt_port->work, pmic_glink_altmode_worker);
 
-		alt_port->bridge = drm_dp_hpd_bridge_register(dev, to_of_node(fwnode));
+		alt_port->bridge = devm_drm_dp_hpd_bridge_alloc(dev, to_of_node(fwnode));
 		if (IS_ERR(alt_port->bridge)) {
 			fwnode_handle_put(fwnode);
 			return PTR_ERR(alt_port->bridge);
@@ -510,6 +510,16 @@ static int pmic_glink_altmode_probe(struct auxiliary_device *adev,
 		}
 	}
 
+	for (port = 0; port < ARRAY_SIZE(altmode->ports); port++) {
+		alt_port = &altmode->ports[port];
+		if (!alt_port->bridge)
+			continue;
+
+		ret = devm_drm_dp_hpd_bridge_add(dev, alt_port->bridge);
+		if (ret)
+			return ret;
+	}
+
 	altmode->client = devm_pmic_glink_register_client(dev,
 							  altmode->owner_id,
 							  pmic_glink_altmode_callback,
-- 
2.43.0
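
The ordering change in this patch (allocate each bridge early, but make it
visible only after every resource lookup has succeeded) can be illustrated
with a small userspace model. This is a sketch under stated assumptions, not
the driver code: all fake_* names are hypothetical, the "visible" flag stands
in for a bridge being discoverable by other drivers, and EPROBE_DEFER is
defined locally because userspace errno.h does not provide it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal error code, defined here only for the model. */
#define EPROBE_DEFER 517

#define NUM_PORTS 2

/* Hypothetical model of a bridge: allocated first, published later. */
struct fake_bridge {
	bool allocated;
	bool visible;	/* true once other drivers could look it up */
};

struct fake_port {
	struct fake_bridge bridge;
	bool resource_ready;	/* models a dependency that may not be ready */
};

static int fake_bridge_alloc(struct fake_bridge *bridge)
{
	bridge->allocated = true;
	bridge->visible = false;
	return 0;
}

static int fake_bridge_add(struct fake_bridge *bridge)
{
	if (!bridge->allocated)
		return -EINVAL;
	bridge->visible = true;
	return 0;
}

static int fake_acquire_resources(const struct fake_port *port)
{
	return port->resource_ready ? 0 : -EPROBE_DEFER;
}

static int fake_probe(struct fake_port *ports, size_t n)
{
	size_t i;
	int ret;

	/* Phase 1: allocate and look up resources; nothing is published. */
	for (i = 0; i < n; i++) {
		ret = fake_bridge_alloc(&ports[i].bridge);
		if (ret)
			return ret;

		ret = fake_acquire_resources(&ports[i]);
		if (ret)
			return ret;	/* deferral leaves no visible bridge */
	}

	/* Phase 2: all lookups succeeded, so publishing is now safe. */
	for (i = 0; i < n; i++) {
		ret = fake_bridge_add(&ports[i].bridge);
		if (ret)
			return ret;
	}

	return 0;
}

static size_t fake_count_visible(const struct fake_port *ports, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (ports[i].bridge.visible)
			count++;
	return count;
}
```

In this model, a probe attempt that defers on the second port returns
-EPROBE_DEFER with no bridge visible, so no other driver could have stored a
pointer to a bridge that is about to go away. A later attempt with all
resources ready publishes both bridges.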