From: 'Simon Horman' <[email protected]>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
net: mana: Don't overwrite port probe error with add_adev result

This commit ensures that if mana_probe_port() or mana_attach() fails, the
error is not overwritten by the result of add_adev(). By preserving the
error, it triggers mana_remove() cleanup instead of leaving ports partially
initialized.

> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9d095a6fb56c7..4d42cbfb3f15b 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
>       if (!resuming) {
>               for (i = 0; i < ac->num_ports; i++) {
>                       err = mana_probe_port(ac, i, &ac->ports[i]);
> -                     /* we log the port for which the probe failed and stop
> -                      * probes for subsequent ports.
> -                      * Note that we keep running ports, for which the probes
> -                      * were successful, unless add_adev fails too
> +                     /* Log the port for which the probe failed, stop probing
> +                      * subsequent ports, and skip add_adev.
> +                      * Already-probed ports remain functional.
>                        */

Is the new comment accurate when it says already-probed ports remain
functional?

Since this patch preserves the error code, execution falls through to the end
of mana_probe() where it checks if err is set and calls mana_remove().
mana_remove() will iterate over the successfully allocated ports, completely
detach them, unregister their net_devices, and destroy event queues.

It appears that no ports remain functional on this path, because the device
is completely torn down.
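
To illustrate the control flow described above, here is a minimal user-space
sketch. It is not driver code: model_ctx, model_probe(), model_remove() and
MODEL_NUM_PORTS are hypothetical stand-ins, and the only behaviour modelled
is "a preserved error leads to a full teardown".

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_PORTS 4	/* hypothetical port count for the model */

/* Hypothetical stand-in for the driver's per-device context. */
struct model_ctx {
	bool port_up[MODEL_NUM_PORTS];
};

/* Models the teardown: every port that was brought up goes down. */
static void model_remove(struct model_ctx *ac)
{
	for (size_t i = 0; i < MODEL_NUM_PORTS; i++)
		ac->port_up[i] = false;
}

/*
 * Models the probe loop. fail_at is the index of the port whose probe
 * fails, or -1 if every probe succeeds. The error value -1 is a
 * placeholder, not a real errno.
 */
static int model_probe(struct model_ctx *ac, int fail_at)
{
	int err = 0;

	for (int i = 0; i < MODEL_NUM_PORTS; i++) {
		if (i == fail_at) {
			err = -1;
			break;	/* stop probing subsequent ports */
		}
		ac->port_up[i] = true;
	}

	/* The error is no longer overwritten, so the cleanup path runs. */
	if (err)
		model_remove(ac);

	return err;
}

/* Counts the ports that are still up. */
static int model_ports_up(const struct model_ctx *ac)
{
	int n = 0;

	for (size_t i = 0; i < MODEL_NUM_PORTS; i++)
		n += ac->port_up[i] ? 1 : 0;
	return n;
}
```

In this model, a failure at port 2 leaves model_ports_up() at 0 even though
ports 0 and 1 probed successfully, which is the mismatch with the comment.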

[ ... ]

> @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
>                       enable_work(&apc->queue_reset_work);
>                       err = mana_attach(ac->ports[i]);
>                       rtnl_unlock();
> -                     /* we log the port for which the attach failed and stop
> -                      * attach for subsequent ports
> -                      * Note that we keep running ports, for which the attach
> -                      * were successful, unless add_adev fails too
> +                     /* Log the port for which the attach failed, stop
> +                      * attaching subsequent ports, and skip add_adev.
> +                      * Already-attached ports remain functional.
>                        */
>                       if (err) {
>                               dev_err(dev, "Attach Failed for port %d\n", i);
> @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)

Does propagating this error during PM resume introduce a NULL pointer
dereference?

If PM resume fails here (resuming == true), the preserved error code causes
mana_probe() to call mana_remove(), which tears down the device and sets
gd->gdma_context = NULL.

However, a failed resume callback does not automatically unbind the driver.
When the device is eventually unbound later, mana_remove() is invoked again:

drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
    struct gdma_context *gc = gd->gdma_context;
    struct device *dev = gc->dev;
    ...
}

In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
which could lead to a kernel panic when dereferencing gc->dev.
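
For illustration only, the double invocation can be modelled in user space
as below. The types and model_remove_guarded() are hypothetical stand-ins,
and the NULL check is just one possible mitigation, shown as an assumption
rather than as what the driver does or should do.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures. */
struct model_gdma_context {
	int dev_id;
};

struct model_gdma_dev {
	struct model_gdma_context *gdma_context;
};

/*
 * Models a remove path that tolerates being called twice.
 * Returns 0 if teardown ran, 1 if it was skipped because the context
 * had already been cleared by an earlier call.
 */
static int model_remove_guarded(struct model_gdma_dev *gd)
{
	struct model_gdma_context *gc = gd->gdma_context;

	/* Assumed guard: without it, the access below would fault. */
	if (!gc)
		return 1;

	(void)gc->dev_id;		/* stands in for the gc->dev access */
	gd->gdma_context = NULL;	/* teardown clears the context */
	return 0;
}
```

Calling model_remove_guarded() twice on the same device returns 0 and then
1; dropping the guard would make the second call dereference NULL.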
