Re: [ewg] Re: Possible process deadlock in RMPP flow
Sean Hefty wrote:

I can't find anything off in the code for this.

In the end it turned out to be a FW issue, which is fixed in our new 2.7.0 release.

Tziporet

___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] Re: Possible process deadlock in RMPP flow
On Mon, Oct 19, 2009 at 01:30:47PM -0700, Sean Hefty wrote:

I can't find anything off in the code for this. It's odd, since unregister_mad_agent() does:

    flush_workqueue(port_priv->wq);
    ib_cancel_rmpp_recvs(mad_agent_priv);

and ib_cancel_rmpp_recvs() does:

    spin_lock_irqsave(&agent->lock, flags);
    list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
            cancel_delayed_work(&rmpp_recv->timeout_work);
            cancel_delayed_work(&rmpp_recv->cleanup_work);
    }
    spin_unlock_irqrestore(&agent->lock, flags);
    flush_workqueue(agent->qp_info->port_priv->wq);

which basically just flushes the same work queue.

I haven't been able to reproduce the problem, but I'm running the latest kernel - not sure that matters in this case. Does ibnetdiscover just hang forever at the end of the test when this occurs? Is there any more information available?

We are checking whether the problem is a firmware bug; it looks like one. Once we verify this I will send an update.
RE: [ewg] Re: Possible process deadlock in RMPP flow
Thanks Or. This one is already in OFED 1.4.2 but apparently this is a different problem. Once I have information whether the patch Roland posted fixed it I will update the list.

Eli, did you find a commit that fixes the problem you reported on?

Or.

Not yet :-(

I can't find anything off in the code for this. It's odd, since unregister_mad_agent() does:

    flush_workqueue(port_priv->wq);
    ib_cancel_rmpp_recvs(mad_agent_priv);

and ib_cancel_rmpp_recvs() does:

    spin_lock_irqsave(&agent->lock, flags);
    list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
            cancel_delayed_work(&rmpp_recv->timeout_work);
            cancel_delayed_work(&rmpp_recv->cleanup_work);
    }
    spin_unlock_irqrestore(&agent->lock, flags);
    flush_workqueue(agent->qp_info->port_priv->wq);

which basically just flushes the same work queue.

I haven't been able to reproduce the problem, but I'm running the latest kernel - not sure that matters in this case. Does ibnetdiscover just hang forever at the end of the test when this occurs? Is there any more information available?

- Sean
Re: [ewg] Re: Possible process deadlock in RMPP flow
Eli Cohen wrote:

Thanks Or. This one is already in OFED 1.4.2 but apparently this is a different problem. Once I have information whether the patch Roland posted fixed it I will update the list.

Eli, did you find a commit that fixes the problem you reported on?

Or.
Re: [ewg] Re: Possible process deadlock in RMPP flow
Or Gerlitz wrote:

Eli Cohen wrote:

Thanks Or. This one is already in OFED 1.4.2 but apparently this is a different problem. Once I have information whether the patch Roland posted fixed it I will update the list.

Eli, did you find a commit that fixes the problem you reported on?

Or.

Not yet :-(

Tziporet