Re: mlx4 catas_reset hangs when using the CM

2012-04-02 Thread sebastien dugue
On Fri, 30 Mar 2012 11:33:56 -0700 Roland Dreier wrote:
> On Thu, Mar 29, 2012 at 4:41 AM, sebastien dugue wrote:
> > So it looks like cma_process_remove() did all its job cleaning up
> > but is hung waiting for the client refcount to reach 0, which never happens.
>
> This is unfortuna

Re: mlx4 catas_reset hangs when using the CM

2012-03-30 Thread Roland Dreier
On Thu, Mar 29, 2012 at 4:41 AM, sebastien dugue wrote:
> So it looks like cma_process_remove() did all its job cleaning up
> but is hung waiting for the client refcount to reach 0, which never happens.

This is unfortunately expected with the current implementation. Because we don't have

mlx4 catas_reset hangs when using the CM

2012-03-29 Thread sebastien dugue
Hi, when the mlx4 FW generates an internal error, the driver's catas code tries to reset the HCA and restart the stack. However, if the CM is in use at that moment, the stack shutdown never completes and hangs in the CM waiting for a refcount that never reaches 0. I've not much knowledge of