On Mar 9, 2012, at 12:59 , Nathan Hjelm wrote:

> Not exactly, the PML invokes the mpool which invokes the registration 
> function. If registration fails the mpool will deregister from its lru (if 
> possible) and try again. So, it is not an error if ibv_reg_mr fails unless it 
> fails because the process is starved of registered memory (or has truly run out).
> 
> The hang occurs because there is nothing on the lru to deregister and 
> ibv_reg_mr (or GNI_MemRegister in the uGNI case) fails. The PML then puts the 
> request on its rdma pending list and continues. If any message comes in the 
> rdma pending list is progressed. If not it hangs indefinitely!
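
[Editor's sketch] The evict-and-retry behavior Nathan describes can be illustrated roughly as follows. This is a toy model under stated assumptions, not Open MPI's mpool code: every name here (toy_mpool_t, toy_reg_mr, toy_evict_lru, toy_mpool_register) is hypothetical, and a simple byte counter stands in for ibv_reg_mr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from Open MPI. */
enum { REG_OK = 0, REG_OUT_OF_RESOURCE = -1 };

typedef struct {
    size_t pinned;     /* bytes currently registered (includes lru_bytes) */
    size_t limit;      /* simulated registration limit */
    size_t lru_bytes;  /* bytes held by idle, evictable registrations */
} toy_mpool_t;

/* Simulated registration call: fails once the limit would be exceeded. */
static bool toy_reg_mr(toy_mpool_t *p, size_t size)
{
    if (p->pinned + size > p->limit) return false;
    p->pinned += size;
    return true;
}

/* Release idle registrations from the LRU; false if there were none. */
static bool toy_evict_lru(toy_mpool_t *p)
{
    if (p->lru_bytes == 0) return false;
    p->pinned -= p->lru_bytes;
    p->lru_bytes = 0;
    return true;
}

/* Try to register; on failure evict from the LRU and try again.
 * Only when nothing is left to evict is the failure reported upward. */
static int toy_mpool_register(toy_mpool_t *p, size_t size)
{
    assert(p != NULL);
    while (!toy_reg_mr(p, size)) {
        if (!toy_evict_lru(p)) return REG_OUT_OF_RESOURCE;
    }
    return REG_OK;
}
```

In this model the REG_OUT_OF_RESOURCE return corresponds to the case described above: the LRU is empty and registration still fails, so the caller can only queue the request and wait.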

Unlike Jeff, I'm not in favor of adding bandages. If the cause is understood, 
then there _is_ a fix, and that fix should be the target of any efforts.

> In general I have found the underlying cause of the hang is due to an 
> imbalance of registrations between processes on a node, i.e., the hung process 
> has an empty lru but other processes could deregister. I am working on a new 
> mpool (grdma) to handle the imbalance. The new mpool will allow a process to 
> request that one of its peers deregisters from its lru if possible. I have a 
> working proof of concept implementation that uses a posix shmem segment and a 
> progress function to handle signaling and dereferencing. With it I no longer 
> see hangs with IMB Alltoall/Alltoallv on uGNI (without putting an artificial 
> limit on the number of registrations). I will test the mpool on infiniband 
> later today.
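
[Editor's sketch] The peer-deregistration idea might look something like the following. This is a guess at the general shape only, not the grdma implementation: the struct that would live in the POSIX shared-memory segment is modeled as an ordinary in-process struct, and all names (toy_shared_t, toy_request_peer_evict, toy_progress) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NPROCS 4

/* Assumption: in a real design this would sit in a shared-memory
 * segment mapped by every process on the node. */
typedef struct {
    atomic_bool evict_requested[TOY_NPROCS];
} toy_shared_t;

typedef struct {
    int rank;
    int lru_entries;  /* idle registrations this process could release */
} toy_proc_t;

static void toy_shared_init(toy_shared_t *sh)
{
    for (int r = 0; r < TOY_NPROCS; r++) {
        atomic_init(&sh->evict_requested[r], false);
    }
}

/* Called by a starved process: flag every other local peer. */
static void toy_request_peer_evict(toy_shared_t *sh, int my_rank)
{
    for (int r = 0; r < TOY_NPROCS; r++) {
        if (r != my_rank) atomic_store(&sh->evict_requested[r], true);
    }
}

/* Called from each process's progress loop. If a peer asked for help,
 * clear the flag and release everything on the local LRU.
 * Returns the number of entries released. */
static int toy_progress(toy_shared_t *sh, toy_proc_t *me)
{
    assert(me->rank >= 0 && me->rank < TOY_NPROCS);
    if (!atomic_exchange(&sh->evict_requested[me->rank], false)) return 0;
    int released = me->lru_entries;
    me->lru_entries = 0;
    return released;
}
```

The point of the flag-plus-progress-function split is that a process can only deregister its own memory, so the starved process can do no more than signal and then retry its own registration later.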

If a solution already exists I don't see why we have to have the message code. 
Based on its urgency, I'm confident your patch will make its way into the 1.5 
series quite easily.

  george.

> 
> -Nathan
> 
> On Fri, 9 Mar 2012, Jeffrey Squyres wrote:
> 
>> George --
>> 
>> I believe that this is the subject of a few long-standing tickets (i.e., 
>> what to do when running out of registered memory -- right now, we hang, for 
>> a few reasons).  I think that this is Mellanox's attempt to at least warn 
>> the user that we have run out of registered memory, and will therefore hang.
>> 
>> Once the hangs have been fixed, I'm assuming this message can be removed.
>> 
>> Note, too, that this is in the BTL registration code (openib_reg_mr), not in 
>> the directly-invoked-by-the-PML code.  So it's the mpool's fault -- not the 
>> PML's fault.
>> 
>> 
>> 
>> On Mar 6, 2012, at 10:05 AM, George Bosilca wrote:
>> 
>>> I didn't check the code thoroughly, but OMPI_ERR_OUT_OF_RESOURCE is not an 
>>> error. If the registration returns out of resources, the BTL will return 
>>> OUT_OF_RESOURCE (for example, via mca_btl_openib_prepare_src). At the 
>>> upper level, the PML (in the mca_pml_ob1_send_request_start function) 
>>> intercepts it and inserts the request into a pending list. Later on, this 
>>> pending list will be examined and the request for resources re-issued.
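
[Editor's sketch] The pending-list handling George describes can be modeled roughly like this. It is a simplified illustration, not the ob1 PML code: the names (toy_pml_t, toy_btl_prepare, toy_send_start, toy_process_pending) are hypothetical, and a slot counter stands in for registered memory.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_SUCCESS = 0, TOY_OUT_OF_RESOURCE = -1 };
#define TOY_MAX_PENDING 16

typedef struct {
    int free_slots;                /* simulated registration resource */
    int pending[TOY_MAX_PENDING];  /* request ids waiting for resources */
    size_t npending;
} toy_pml_t;

/* Simulated BTL prepare: takes one slot or reports out of resource. */
static int toy_btl_prepare(toy_pml_t *pml)
{
    if (pml->free_slots == 0) return TOY_OUT_OF_RESOURCE;
    pml->free_slots--;
    return TOY_SUCCESS;
}

/* Start a request; on OUT_OF_RESOURCE, park it instead of failing. */
static int toy_send_start(toy_pml_t *pml, int req_id)
{
    int rc = toy_btl_prepare(pml);
    if (rc == TOY_OUT_OF_RESOURCE) {
        assert(pml->npending < TOY_MAX_PENDING);
        pml->pending[pml->npending++] = req_id;
        return TOY_SUCCESS;  /* not an error from the caller's view */
    }
    return rc;
}

/* Re-issue parked requests in order while resources last.
 * Returns how many requests left the pending list. */
static size_t toy_process_pending(toy_pml_t *pml)
{
    size_t done = 0;
    while (done < pml->npending && toy_btl_prepare(pml) == TOY_SUCCESS) {
        done++;
    }
    for (size_t i = done; i < pml->npending; i++) {
        pml->pending[i - done] = pml->pending[i];
    }
    pml->npending -= done;
    return done;
}
```

Note that a parked request only makes progress when something calls toy_process_pending and resources have been freed in the meantime; if neither happens, the request waits forever.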
>>> 
>>> Why do we need to trigger a BTL_ERROR for OUT_OF_RESOURCE?
>>> 
>>>  george.
>>> 
>>> On Mar 6, 2012, at 09:48 , Jeffrey Squyres wrote:
>>> 
>>>> Mike --
>>>> 
>>>> I would make this a bit better of an error.  I.e., use orte_show_help(), 
>>>> so you can explain the issue more, and also remove all duplicates (i.e., 
>>>> if it fails to register multiple times).
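
[Editor's sketch] The duplicate suppression Jeff asks for can be illustrated with a small warn-once helper. This is not orte_show_help() or its implementation; toy_show_help_once and its fixed-size topic table are hypothetical, and the topic strings are assumed to outlive the table (string literals, for example).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_TOPICS 8

static const char *seen_topics[TOY_MAX_TOPICS];
static size_t num_seen;

/* Print msg the first time a topic is reported and return true.
 * Later reports of the same topic are dropped and return false.
 * If the table is full, new topics are printed but not remembered. */
static bool toy_show_help_once(const char *topic, const char *msg)
{
    assert(topic != NULL && msg != NULL);
    for (size_t i = 0; i < num_seen; i++) {
        if (strcmp(seen_topics[i], topic) == 0) return false;
    }
    if (num_seen < TOY_MAX_TOPICS) seen_topics[num_seen++] = topic;
    fprintf(stderr, "[%s] %s\n", topic, msg);
    return true;
}
```

With a helper of this kind, a registration failure that repeats thousands of times produces a single explanatory message rather than one line per failure.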
>>>> 
>>>> 
>>>> On Mar 6, 2012, at 8:25 AM, mi...@osl.iu.edu wrote:
>>>> 
>>>>> Author: miked
>>>>> Date: 2012-03-06 09:25:56 EST (Tue, 06 Mar 2012)
>>>>> New Revision: 26106
>>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/26106
>>>>> 
>>>>> Log:
>>>>> print error which is ignored on upper layer
>>>>> Text files modified:
>>>>> trunk/ompi/mca/btl/openib/btl_openib_component.c |     2 ++
>>>>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>>>> 
>>>>> Modified: trunk/ompi/mca/btl/openib/btl_openib_component.c
>>>>> ==============================================================================
>>>>> --- trunk/ompi/mca/btl/openib/btl_openib_component.c (original)
>>>>> +++ trunk/ompi/mca/btl/openib/btl_openib_component.c 2012-03-06 09:25:56 EST (Tue, 06 Mar 2012)
>>>>> @@ -569,6 +569,8 @@
>>>>>   openib_reg->mr = ibv_reg_mr(device->ib_pd, base, size, access_flag);
>>>>> 
>>>>>   if (NULL == openib_reg->mr) {
>>>>> +        BTL_ERROR(("%s: error pinning openib memory errno says %s",
>>>>> +                       __func__, strerror(errno)));
>>>>>       return OMPI_ERR_OUT_OF_RESOURCE;
>>>>>   }
>>>>> 
>>>>> _______________________________________________
>>>>> svn-full mailing list
>>>>> svn-f...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>>>> 
>>>> 
>>>> --
>>>> Jeff Squyres
>>>> jsquy...@cisco.com
>>>> For corporate legal information go to: 
>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>> 
>>>> 
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>> 

