Marcin Slusarz <marcin.slus...@gmail.com> writes: > On Sun, Jul 11, 2010 at 11:02:12AM +1000, Ben Skeggs wrote: >> On Sun, 2010-07-11 at 01:24 +0200, Marcin Slusarz wrote: >> > Hi >> > >> > Patch "drm/nouveau: use drm_mm in preference to custom code doing the same >> > thing" >> > in nouveau tree introduced new deadlock possibility, for which lockdep >> > complains loudly: >> > >> > (...) >> > >> Hey, >> >> Thanks for the report, I'll look at this more during the week. >> >> > Deadlock scenario looks like this: >> > CPU1 CPU2 >> > nouveau code calls some drm_mm.c >> > function which takes mm->unused_lock >> > >> > nouveau_channel_free disables >> > irqs and takes dev_priv->context_switch_lock >> > calls >> > nv50_graph_destroy_context which >> > (... backtrace >> > above) >> > calls >> > drm_mm_put_block which tries to take mm->unused_lock (spins) >> > nouveau interrupt raises >> > >> > nouveau_irq_handler tries to take >> > dev_priv->context_switch_lock (spins) >> > >> > deadlock >> It's important to note that the drm_mm referenced eventually by >> nv50_graph_destroy_context is per-channel on the card, so for the >> deadlock to happen it'd have to be multiple threads from a single >> process, one thread creating/destroying and object on the channel while >> another was trying to destroy the channel. >>
Yeah, and that situation is impossible ATM because those functions are called with the BKL held. >> > >> > Possible solutions: >> > - reverting "drm/nouveau: use drm_mm in preference to custom code doing >> > the same thing" >> > - disabling interrupts before calling drm_mm functions - unmaintainable >> > and still >> > deadlockable in multicard setups (nouveau and eg radeon) >> Agreed it's unmaintainable, but, as mentioned above, the relevant locks >> can't be touched by radeon. >> >> > - making mm->unused_lock HARDIRQ-safe (patch below) - simple but with >> > slight overhead >> I'll look more during the week, there's other solutions to be explored. > > So, did you find other solution? Some random ideas: - Make context_switch_lock HARDIRQ-unsafe. To avoid racing with the IRQ handler we'd have to disable interrupt dispatch on the card before taking context_switch_lock (i.e. at channel creation and destruction time), and the interrupt control registers would have to be protected with a IRQ safe spinlock. - Split the current destroy_context() hooks in two halves, the first one would be in charge of the PFIFO/PGRAPH-poking tasks (e.g. disable_context()), and the second one would take care of releasing the allocated resources (and it wouldn't need locking). > > Marcin > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
pgpMPxuqUCMXj.pgp
Description: PGP signature
_______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel