>-----Original Message-----
>From: Leonid Keller [mailto:[email protected]]
>Sent: Thursday, February 02, 2012 8:42 AM
>To: Hefty, Sean; Tzachi Dar; Smith, Stan
>Cc: Uri Habusha; ofw_list; Irena Gannon
>Subject: RE: opensm stuck upon kill
>
>I no longer have the crashed machine.
>It was rebooted, and creation of the full dump failed.
>
>I can't say anything about MADs, but I found only one place where an AV is
>created and attached to the PD: the send_mad call.
>I also saw that the PD has ref_cnt = 227.
>I think these are references held by unreleased AVs, i.e. MADs.
>
>Could you tell me where I can see the unreleased MADs?
>The hang happened after WmProviderDeregister() and destroy_qp.
>WmProviderDeregister is supposed to release all the queued MADs.
>Could there be some MADs that have already left the queue or are not yet in it?

Check opensm\user\libvendor\osm_vendor_ibumad.c

>
>-----Original Message-----
>From: Hefty, Sean [mailto:[email protected]]
>Sent: Thursday, February 02, 2012 6:28 PM
>To: Leonid Keller; Tzachi Dar; Smith, Stan
>Cc: Uri Habusha; ofw_list; Irena Gannon
>Subject: RE: opensm stuck upon kill
>
>> winmad!WmRegRemoveHandler+0xae corresponds to this code:
>>
>>      WmProviderDeregister(pRegistration->pProvider, pRegistration);
>>      pRegistration->pDevice->IbInterface.destroy_qp(pRegistration->hQp, NULL);
>>      pRegistration->pDevice->IbInterface.dealloc_pd(pRegistration->hPd, NULL);
>> >    pRegistration->pDevice->IbInterface.close_ca(pRegistration->hCa, NULL);
>>
>> Could you suggest any ideas?
>
>winmad does not explicitly allocate any address handles.  Can you tell if
>there are any MADs which were not returned to the free pool?  You
>could try replacing the NULLs in the above code with ib_sync_destroy (I am
>unsure of the exact name).
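
For illustration only, here is a toy model of the situation discussed above. It assumes, as Leonid suspects, that every AV created in the send path holds a reference on its PD until the AV is destroyed, so that any AV left behind keeps the PD's ref_cnt above zero. All names (toy_pd, toy_av, toy_create_av, and so on) are hypothetical and are not the IBAL or winmad API; this is a sketch of the reference-counting idea, not of the real driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the IBAL or winmad API. */
struct toy_pd {
    int ref_cnt;            /* one reference per live child object */
};

struct toy_av {
    struct toy_pd *pd;      /* parent PD pinned by this AV */
};

/* Creating an AV takes a reference on its PD. */
static void toy_create_av(struct toy_pd *pd, struct toy_av *av)
{
    av->pd = pd;
    pd->ref_cnt++;
}

/* Destroying the AV drops that reference again. */
static void toy_destroy_av(struct toy_av *av)
{
    assert(av->pd != NULL && av->pd->ref_cnt > 0);
    av->pd->ref_cnt--;
    av->pd = NULL;
}

/* A blocking dealloc would wait until ref_cnt reaches zero; this
 * non-blocking variant only reports whether it would have to wait. */
static bool toy_dealloc_pd_would_block(const struct toy_pd *pd)
{
    return pd->ref_cnt > 0;
}
```

In this model, creating three AVs and destroying only two leaves ref_cnt at 1, and toy_dealloc_pd_would_block() keeps returning true until the last AV is destroyed. If the real PD behaves the same way, a ref_cnt of 227 would mean 227 child objects that were never released.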
_______________________________________________
ofw mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
