在 2020/12/11 23:46, Rob Herring 写道:
On Fri, Nov 20, 2020 at 03:48:48PM +0800, Qinglang Miao wrote:
When put_device(&bridge->dev) being called, kfree(bridge) is inside
of release function, so the following device_del would cause a
use-after-free bug.

Fixes: 37d6a0a6f470 ("PCI: Add pci_register_host_bridge() interface")

That commit did have some problems, but this patch doesn't apply to that
commit. See commits 1b54ae8327a4 and 9885440b16b8.

Reported-by: Hulk Robot <hul...@huawei.com>
Signed-off-by: Qinglang Miao <miaoqingl...@huawei.com>
---
  drivers/pci/probe.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4289030b0..82292e87e 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -991,8 +991,8 @@ static int pci_register_host_bridge(struct pci_host_bridge 
*bridge)
        return 0;
unregister:
-       put_device(&bridge->dev);
        device_del(&bridge->dev);
+       put_device(&bridge->dev);

I don't think this is right.

Let's look at pci_register_host_bridge() with only the relevant
sections:

static int pci_register_host_bridge(struct pci_host_bridge *bridge)
{
        ...

        err = device_add(&bridge->dev);
        if (err) {
                put_device(&bridge->dev);
                goto free;
        }
        bus->bridge = get_device(&bridge->dev);

         ...
        if (err)
                goto unregister;
        ...

        return 0;

unregister:
        put_device(&bridge->dev);
        device_del(&bridge->dev);

free:
        kfree(bus);
        return err;
}

The documentation for device_add says this:
  * Rule of thumb is: if device_add() succeeds, you should call
  * device_del() when you want to get rid of it. If device_add() has
  * *not* succeeded, use *only* put_device() to drop the reference
  * count.

The put_device at the end is to balance the get_device after device_add.
It will *only* decrement the use count. Then we call device_del as the
documentation says.

Rob
.
Hi, Rob

Your words make sence to me: the code is *logicly* correct here and won't raise a use-after-free bug. I do hold a misunderstanding of this one, sorry for that ~

But I still think this patch should be reconsidered:

The kdoc of device_unregister explicitly mentions the possibility that other refs might continue to exist after device_unregister was called, and *del_device* is first part of it.

By the way, 'del_device() called before put_device()' is everywhere in kernel code, like device_unregister(), pci_destroy_dev() or switchtec_pci_remove()

In fact, I can't find another place in kernel code looks like:
        put_device(x);
        device_del(x);

So I guess put_device() ought to be the last time we touch the object (I don't find evidence strong enough in kdoc to prove this) and putting put_device after device_del is a more natural logic.

Qinglang
.


Reply via email to