Re: [ewg] Re: Possible process deadlock in RMPP flow

2009-10-20 Thread Tziporet Koren

Sean Hefty wrote:
I can't find anything off in the code for this.  

It turned out to be a FW issue, which is fixed in our new 2.7.0 release.

Tziporet
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] Re: Possible process deadlock in RMPP flow

2009-10-20 Thread Eli Cohen
On Mon, Oct 19, 2009 at 01:30:47PM -0700, Sean Hefty wrote:
 
 I can't find anything off in the code for this.  It's odd, since
 unregister_mad_agent() does:
 
 flush_workqueue(port_priv->wq);
 ib_cancel_rmpp_recvs(mad_agent_priv);
 
 and ib_cancel_rmpp_recvs() does:
 
 spin_lock_irqsave(&agent->lock, flags);
 list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
     cancel_delayed_work(&rmpp_recv->timeout_work);
     cancel_delayed_work(&rmpp_recv->cleanup_work);
 }
 spin_unlock_irqrestore(&agent->lock, flags);
 
 flush_workqueue(agent->qp_info->port_priv->wq);
 
 which basically just flushes the same work queue.
 
 I haven't been able to reproduce the problem, but I'm running the latest kernel
 - not sure that matters in this case.  Does ibnetdiscover just hang forever at
 the end of the test when this occurs?  Is there any more information available?
 

We are checking whether the problem is a firmware bug; it looks like it is.
Once we verify this I will send an update. 


RE: [ewg] Re: Possible process deadlock in RMPP flow

2009-10-19 Thread Sean Hefty
 Thanks Or. This one is already in OFED 1.4.2 but apparently this is a
 different problem. Once I have information whether the patch Roland
 posted fixed it I will update the list.
 Eli, did you find a commit that fixes the problem you reported on?

 Or.


Not yet :-(

I can't find anything off in the code for this.  It's odd, since
unregister_mad_agent() does:

flush_workqueue(port_priv->wq);
ib_cancel_rmpp_recvs(mad_agent_priv);

and ib_cancel_rmpp_recvs() does:

spin_lock_irqsave(&agent->lock, flags);
list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
    cancel_delayed_work(&rmpp_recv->timeout_work);
    cancel_delayed_work(&rmpp_recv->cleanup_work);
}
spin_unlock_irqrestore(&agent->lock, flags);

flush_workqueue(agent->qp_info->port_priv->wq);

which basically just flushes the same work queue.

I haven't been able to reproduce the problem, but I'm running the latest kernel
- not sure that matters in this case.  Does ibnetdiscover just hang forever at
the end of the test when this occurs?  Is there any more information available?

- Sean 
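
The cancel-then-flush ordering described in this message can be illustrated
with a toy, single-threaded user-space model. This is only a sketch: every
toy_* name is hypothetical and none of it is kernel code. It assumes that
cancel_delayed_work() only stops a timer that has not fired yet (believed to
be the behavior of kernels from this period), and that flush_workqueue() runs
whatever is already queued to completion. It ignores handlers that re-arm
themselves and any real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy single-threaded model; all names are hypothetical, not kernel APIs. */
enum toy_state { TOY_IDLE, TOY_TIMER_ARMED, TOY_QUEUED };

struct toy_delayed_work {
	enum toy_state state;
	int runs;	/* number of times the handler ran */
};

/* Assumed cancel_delayed_work() semantics: stop a timer that has not fired. */
static bool toy_cancel_delayed_work(struct toy_delayed_work *w)
{
	if (w->state != TOY_TIMER_ARMED)
		return false;	/* idle or already queued: nothing to cancel */
	w->state = TOY_IDLE;
	return true;
}

/* Assumed flush_workqueue() semantics: run everything already queued. */
static void toy_flush_workqueue(struct toy_delayed_work *items, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (items[i].state == TOY_QUEUED) {
			items[i].runs++;
			items[i].state = TOY_IDLE;
		}
	}
}

/* Cancel, then flush, in the order quoted above; returns items still active. */
static size_t toy_cancel_then_flush(struct toy_delayed_work *items, size_t n)
{
	size_t active = 0;

	for (size_t i = 0; i < n; i++)
		toy_cancel_delayed_work(&items[i]);
	toy_flush_workqueue(items, n);

	for (size_t i = 0; i < n; i++)
		if (items[i].state != TOY_IDLE)
			active++;
	return active;
}
```

In this model, an item whose timer is still armed is cancelled without ever
running, an item that is already queued runs once during the flush, and
nothing is left active afterwards. The model says nothing about why a real
flush might block.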



Re: [ewg] Re: Possible process deadlock in RMPP flow

2009-10-04 Thread Or Gerlitz

Eli Cohen wrote:
Thanks Or. This one is already in OFED 1.4.2 but apparently this is a 
different problem. Once I have information whether the patch Roland 
posted fixed it I will update the list.

Eli, did you find a commit that fixes the problem you reported on?

Or.




Re: [ewg] Re: Possible process deadlock in RMPP flow

2009-10-04 Thread Tziporet Koren

Or Gerlitz wrote:

Eli Cohen wrote:
Thanks Or. This one is already in OFED 1.4.2 but apparently this is a 
different problem. Once I have information whether the patch Roland 
posted fixed it I will update the list.

Eli, did you find a commit that fixes the problem you reported on?

Or.



Not yet :-(
Tziporet