From: 'Simon Horman' <[email protected]>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
net: mana: Don't overwrite port probe error with add_adev result
This commit ensures that if mana_probe_port() or mana_attach() fails, the
error is not overwritten by the result of add_adev(). By preserving the
error, it triggers mana_remove() cleanup instead of leaving ports partially
initialized.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9d095a6fb56c7..4d42cbfb3f15b 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> if (!resuming) {
> for (i = 0; i < ac->num_ports; i++) {
> err = mana_probe_port(ac, i, &ac->ports[i]);
> - /* we log the port for which the probe failed and stop
> - * probes for subsequent ports.
> - * Note that we keep running ports, for which the probes
> - * were successful, unless add_adev fails too
> + /* Log the port for which the probe failed, stop probing
> + * subsequent ports, and skip add_adev.
> + * Already-probed ports remain functional.
> */
Are these comments accurate regarding ports remaining functional?
Since this patch preserves the error code, execution falls through to the end
of mana_probe() where it checks if err is set and calls mana_remove().
mana_remove() will iterate over the successfully allocated ports, completely
detach them, unregister their net_devices, and destroy event queues.
It appears no ports remain functional because the device is completely torn
down.
[ ... ]
> @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> enable_work(&apc->queue_reset_work);
> err = mana_attach(ac->ports[i]);
> rtnl_unlock();
> - /* we log the port for which the attach failed and stop
> - * attach for subsequent ports
> - * Note that we keep running ports, for which the attach
> - * were successful, unless add_adev fails too
> + /* Log the port for which the attach failed, stop
> + * attaching subsequent ports, and skip add_adev.
> + * Already-attached ports remain functional.
> */
> if (err) {
> dev_err(dev, "Attach Failed for port %d\n", i);
> @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
Does propagating this error during PM resume introduce a NULL pointer
dereference?
If PM resume fails here (resuming == true), the preserved error code causes
mana_probe() to call mana_remove(), which tears down the device and sets
gd->gdma_context = NULL.
However, a failed resume callback does not automatically unbind the driver.
When the device is eventually unbound later, mana_remove() is invoked again:
drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
struct gdma_context *gc = gd->gdma_context;
struct device *dev = gc->dev;
...
}
In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
which could lead to a kernel panic when dereferencing gc->dev.