I'm afraid I honestly don't remember the last time I tested with --enable-hetero 
- at least 2-3 weeks ago. I'd suggest starting ~6 months ago and seeing whether 
that still worked.


On Apr 28, 2014, at 7:04 AM, George Bosilca <bosi...@icl.utk.edu> wrote:

> When did you last test? I have no idea what is broken, so it is difficult to 
> assess the complexity of the fix. Let’s try to find the last working 
> “version” and then run a dichotomic test to find the culprit (with success, 
> hopefully).
> 
>   George.
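
For reference, the "dichotomic test" mentioned above just means a binary search
over the revision range between the last known-good and the first known-bad
build. A rough sketch of the idea in C, using a made-up revision range and a
hypothetical revision_is_broken() predicate (in practice: check out the
revision, build with --enable-heterogeneous, run the reproducer, and report
whether it hangs):

#include <stdio.h>

/* Hypothetical predicate: pretend the regression appeared at r31500
 * (a made-up number); in reality this would build and test revision r. */
static int revision_is_broken(long r)
{
    return r >= 31500;
}

/* Binary search for the first broken revision; assumes 'good' passes
 * and 'bad' fails on entry. */
static long find_first_broken(long good, long bad)
{
    while (bad - good > 1) {
        long mid = good + (bad - good) / 2;
        if (revision_is_broken(mid))
            bad = mid;
        else
            good = mid;
    }
    return bad;
}

int main(void)
{
    printf("first broken revision: %ld\n", find_first_broken(31000, 32000));
    return 0;
}
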
> 
> 
> On Apr 28, 2014, at 09:05 , Ralph Castain <r...@open-mpi.org> wrote:
> 
>> No, it looks like something has broken it since I last tested. Sorry about 
>> the confusion.
>> 
>> On Apr 27, 2014, at 10:55 PM, Gilles Gouaillardet 
>> <gilles.gouaillar...@iferc.org> wrote:
>> 
>>> I might have misunderstood Jeff's comment:
>>> 
>>>> The broken part(s) is(are) likely somewhere in the datatype and/or PML 
>>>> code (my guess).  Keep in mind that my only testing of this feature is in 
>>>> *homogeneous* mode -- i.e., I compile with --enable-heterogeneous and then 
>>>> run tests on homogeneous machines.  Meaning: it's not only broken for 
>>>> actual heterogeneity, it's also broken in the "unity"/homogeneous case.
>>> 
>>> Unfortunately, a trivial send/recv can hang in this case 
>>> (--enable-heterogeneous and a homogeneous cluster of little-endian procs).
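
For illustration, a minimal sketch of the kind of trivial send/recv exchange
that can trigger this (an assumed reproducer, not the actual test case from
ticket #4568):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Rank 0 sends a single int to rank 1. */
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Rank 1 receives it; with a broken hetero build this exchange
         * can hang instead of completing. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 got %d\n", value);
    }

    MPI_Finalize();
    return 0;
}

Run it as, e.g., "mpirun -np 2 ./sendrecv" against a build configured with
--enable-heterogeneous on a homogeneous little-endian cluster.
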
>>> 
>>> I opened #4568 https://svn.open-mpi.org/trac/ompi/ticket/4568 in order to 
>>> track this issue
>>> (uninitialized data can cause a hang with this config).
>>> 
>>> Trunk is affected; v1.8 is very likely affected too.
>>> 
>>> Gilles
>>> 
>>> On 2014/04/28 12:22, Ralph Castain wrote:
>>>> I think you misunderstood his comment. It works fine on a homogeneous 
>>>> cluster, even with --enable-hetero. I've run it that way on my cluster.
>>>> 
>>>> On Apr 27, 2014, at 7:50 PM, Gilles Gouaillardet 
>>>> <gilles.gouaillar...@iferc.org> wrote:
>>>> 
>>>>> According to Jeff's comment, Open MPI compiled with
>>>>> --enable-heterogeneous is broken even in a homogeneous cluster.
>>>>> 
>>>>> As a first step, MTT could be run with Open MPI compiled with
>>>>> --enable-heterogeneous and running on a homogeneous cluster
>>>>> (ideally on both little- and big-endian machines) in order to identify
>>>>> and fix the bug/regression.
>>>>> /* This build is currently disabled in the MTT config of the
>>>>> cisco-community cluster */
>>>>> 
>>>>> Gilles
>>>>> 
>>> 
