On Thu, Aug 6, 2009 at 7:56 PM, Roland Dreier<[email protected]> wrote:
>
>  > After having applied this patch it took somewhat longer before a
>  > locking inversion report was generated, but unfortunately there still
>  > was a locking inversion report generated (see also
>  > http://bugzilla.kernel.org/show_bug.cgi?id=13757 for the details):
>
> ummm, yikes...
>
> can you apply the hack patch I sent originally to take priv->lock from
> an interrupt ASAP and try that along with the fix patch to drop
> priv->lock before calling ipoib_send()?  That might make the lockdep
> trace understandable.

The lockdep report I obtained this morning with a 2.6.30.4 kernel and
the two patches applied has been attached to the kernel bugzilla entry.
This lockdep report was generated while testing the SRPT target
software. I have double-checked that the SRPT target implementation does
not hold any spinlocks or mutexes while calling functions in the IB
core. This means that the SRPT target code cannot have caused any of the
reported lock cycles.

By the way, I noticed that while many subsystems in the Linux kernel use
event queues to report information to higher software layers, the IB
core makes extensive use of callback functions. The combination of
nested locking and callback functions can easily lead to lock inversion.
This effect is well known in the operating system world -- see e.g. the
talk by John Ousterhout about multithreaded versus event-driven software
(http://home.pacbell.net/ouster/threads.pdf, 1996).

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.30.4-scst-debug #2
---------------------------------------------------------
[ ... ]
stack backtrace:
Pid: 26040, comm: cc1 Not tainted 2.6.30.4-scst-debug #2
Call Trace:
 <IRQ>  [<ffffffff80272bec>] print_irq_inversion_bug+0x14c/0x1c0
 [<ffffffff80272cdd>] check_usage_forwards+0x7d/0xc0
 [<ffffffff80271faf>] mark_lock+0x20f/0x6a0
 [<ffffffff80272c60>] ? check_usage_forwards+0x0/0xc0
 [<ffffffff802743e4>] __lock_acquire+0xce4/0x1c80
 [<ffffffff802713bd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff80249305>] ? release_console_sem+0x1e5/0x230
 [<ffffffff80249919>] ? vprintk+0x2e9/0x480
 [<ffffffff80275488>] lock_acquire+0x108/0x150
 [<ffffffffa043f5a2>] ? ib_cm_notify+0x102/0x2c0 [ib_cm]
 [<ffffffff80515371>] _spin_lock_irqsave+0x41/0x60
 [<ffffffffa043f5a2>] ? ib_cm_notify+0x102/0x2c0 [ib_cm]
 [<ffffffffa043f5a2>] ib_cm_notify+0x102/0x2c0 [ib_cm]
 [<ffffffffa06a6e1e>] srpt_qp_event+0x4e/0x140 [ib_srpt]
 [<ffffffffa02656aa>] mlx4_ib_qp_event+0x7a/0xf0 [mlx4_ib]
 [<ffffffffa04c5e0f>] mlx4_qp_event+0x6f/0xe0 [mlx4_core]
 [<ffffffffa04bd659>] mlx4_eq_int+0x289/0x2e0 [mlx4_core]
 [<ffffffffa04bd79a>] mlx4_msi_x_interrupt+0x6a/0x90 [mlx4_core]
 [<ffffffff8028bf35>] handle_IRQ_event+0x95/0x200
 [<ffffffff8028e3d8>] handle_edge_irq+0xc8/0x170
 [<ffffffff8020eeef>] handle_irq+0x1f/0x30
 [<ffffffff8020e5fe>] do_IRQ+0x6e/0xf0
 [<ffffffff8020c913>] ret_from_intr+0x0/0xf
 <EOI> <6>

Bart.
