On Tue, Nov 03, 2020 at 12:25:22PM +0100, Maxime Ripard wrote: > Hi! > > On Tue, Nov 03, 2020 at 11:55:08AM +0100, Daniel Vetter wrote: > > On Tue, Nov 03, 2020 at 11:38:32AM +0100, Maxime Ripard wrote: > > > On Tue, Nov 03, 2020 at 11:10:27AM +0100, Thomas Zimmermann wrote: > > > > Hi > > > > > > > > Am 03.11.20 um 10:52 schrieb Maxime Ripard: > > > > > On Tue, Nov 03, 2020 at 10:10:41AM +0800, Tian Tao wrote: > > > > >> Add new api devm_drm_irq_install() to register interrupts, > > > > >> no need to call drm_irq_uninstall() when the drm module is removed. > > > > >> > > > > >> v2: > > > > >> fixed the wrong parameter. > > > > >> > > > > >> Signed-off-by: Tian Tao <tiant...@hisilicon.com> > > > > >> --- > > > > >> drivers/gpu/drm/drm_drv.c | 23 +++++++++++++++++++++++ > > > > >> include/drm/drm_drv.h | 3 ++- > > > > >> 2 files changed, 25 insertions(+), 1 deletion(-) > > > > >> > > > > >> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > > > > >> index cd162d4..0fe5243 100644 > > > > >> --- a/drivers/gpu/drm/drm_drv.c > > > > >> +++ b/drivers/gpu/drm/drm_drv.c > > > > >> @@ -39,6 +39,7 @@ > > > > >> #include <drm/drm_color_mgmt.h> > > > > >> #include <drm/drm_drv.h> > > > > >> #include <drm/drm_file.h> > > > > >> +#include <drm/drm_irq.h> > > > > >> #include <drm/drm_managed.h> > > > > >> #include <drm/drm_mode_object.h> > > > > >> #include <drm/drm_print.h> > > > > >> @@ -678,6 +679,28 @@ static int devm_drm_dev_init(struct device > > > > >> *parent, > > > > >> return ret; > > > > >> } > > > > >> > > > > >> +static void devm_drm_dev_irq_uninstall(void *data) > > > > >> +{ > > > > >> + drm_irq_uninstall(data); > > > > >> +} > > > > >> + > > > > >> +int devm_drm_irq_install(struct device *parent, > > > > >> + struct drm_device *dev, int irq) > > > > >> +{ > > > > >> + int ret; > > > > >> + > > > > >> + ret = drm_irq_install(dev, irq); > > > > >> + if (ret) > > > > >> + return ret; > > > > >> + > > > > >> + ret = devm_add_action(parent, devm_drm_dev_irq_uninstall, dev); > > > > >> + if (ret) > > > > >> + devm_drm_dev_irq_uninstall(dev); > > > > >> + > > > > >> + return ret; > > > > >> +} > > > > >> +EXPORT_SYMBOL(devm_drm_irq_install); > > > > >> + > > > > > > > > > > Shouldn't we tie the IRQ to the drm device (so with drmm_add_action) > > > > > instead of tying it to the underlying device? > > > > > > > > If the HW device goes away, there won't be any more interrupts. So it's > > > > similar to devm_ functions for I/O memory. Why would you use the drmm_ > > > > interface? > > > > > > drm_irq_install/uninstall do more that just calling request_irq and > > > free_irq though, they will also run (among other things) the irq-related > > > hooks in the drm driver (irq_preinstall, irq_postinstall irq_uninstall) > > > and wake anything waiting for a vblank to occur, so we need the DRM > > > device and driver to still be around when we run drm_irq_uninstall. > > > That's why (I assume) you have to pass the drm_device as an argument and > > > not simply the device. > > > > drm_device is guaranteed to outlive devm_, plus the hooks are meant to > > shut down hw. hw isn't around anymore when we do drmm_ cleanup, at least > > not in full generality. > > drm_dev_put is either called through devm or in remove / unbind, and the > drm_device takes a reference on its parent device, so how can the > drm_device outlive its parent device?
Oh there's more than just that going on. struct device has 2 lifetime things: - devres resources: These are release on a) on driver unbind b) driver bind failure. Which means if you hotunplug, then devres is gone - the kmalloced piece of memory containing the struct device, refcounted with kref. Totally independent. So hw resources like irq should be managed with devres. Memory allocations (to prevent use-after-free) should be refcounted by a kref somewhere. In the case of struct device that's done by the driver core. In the case of struct drm_device and all the stuff hanging off that, it's done by drmm_ (ideally at least, since in practice all drivers except i915 get it wrong without drmm_). Managing memory allocations with devres is almost always a bug. So when you unbind/hotunplug a device, the following happens: - driver unbind gets called and finishes - devres cleans up hw resources - as one of the last devres action the drm_dev_put gets called - (if no userspace is around or anything else that holds a drm_device reference) drmm_ cleans up drm_device resources - as the last cleanup drmm_ calls put_device() - the actual struct device gets released > > > This probably works in most case since you would allocate the drm_device > > > with devm_drm_dev_alloc, and then run drm_irq_install, so in the undoing > > > phase you would have first drm_irq_uninstall to run, and everything is > > > fine. > > > > > > However, if one doesn't use devm_drm_dev_alloc but would use > > > devm_drm_irq_install, you would have first remove being called that > > > would free the drm device, and then drm_irq_uninstall which will use a > > > free'd pointer. > > > > Don't do that, it's broken :-) > > Well, yeah it's usually a pretty bad situation, but if we can fix it for > free it doesn't hurt? See my comment somewhere, if the devm_drm_irq_install also holds a drm_dev_get reference, then no matter which wrong way you set stuff up, the right thing should happen. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch