I'm afraid I honestly don't remember the last time I tested with enable-hetero - at least 2-3 weeks ago. I'd suggest starting ~6 months ago and seeing whether that still worked.
On Apr 28, 2014, at 7:04 AM, George Bosilca <bosi...@icl.utk.edu> wrote:

> When did you test last? I have no idea what is broken so it is difficult to
> assess the complexity of the fix. Let’s try to find the last working
> “version” and then run a dichotomic test to find the culprit (with s
> hopefully).
>
> George.
>
>
> On Apr 28, 2014, at 09:05 , Ralph Castain <r...@open-mpi.org> wrote:
>
>> No, it looks like something has broken it since I last tested. Sorry about
>> the confusion.
>>
>> On Apr 27, 2014, at 10:55 PM, Gilles Gouaillardet
>> <gilles.gouaillar...@iferc.org> wrote:
>>
>>> I might have misunderstood Jeff's comment:
>>>
>>>> The broken part(s) is(are) likely somewhere in the datatype and/or PML
>>>> code (my guess). Keep in mind that my only testing of this feature is in
>>>> *homogeneous* mode -- i.e., I compile with --enable-heterogeneous and then
>>>> run tests on homogeneous machines. Meaning: it's not only broken for
>>>> actual heterogeneity, it's also broken in the "unity"/homogeneous case.
>>>
>>> Unfortunately, a trivial send/recv can hang in this case
>>> (--enable-heterogeneous and homogeneous cluster of little endian procs).
>>>
>>> I opened #4568 https://svn.open-mpi.org/trac/ompi/ticket/4568 in order to
>>> track this issue
>>> (uninitialized data can cause a hang with this config)
>>>
>>> trunk is affected, v1.8 is very likely affected too
>>>
>>> Gilles
>>>
>>> On 2014/04/28 12:22, Ralph Castain wrote:
>>>> I think you misunderstood his comment. It works fine on a homogeneous
>>>> cluster, even with --enable-hetero. I've run it that way on my cluster.
>>>>
>>>> On Apr 27, 2014, at 7:50 PM, Gilles Gouaillardet
>>>> <gilles.gouaillar...@iferc.org> wrote:
>>>>
>>>>> According to Jeff's comment, OpenMPI compiled with
>>>>> --enable-heterogeneous is broken even in a homogeneous cluster.
>>>>>
>>>>> As a first step, MTT could be run with OpenMPI compiled with
>>>>> --enable-heterogeneous and running on a homogeneous cluster
>>>>> (ideally on both little and big endian) in order to identify and fix the
>>>>> bug/regression.
>>>>> /* this build is currently disabled in the MTT config of the
>>>>> cisco-community cluster */
>>>>>
>>>>> Gilles
>>>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/04/14624.php
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/04/14625.php
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/04/14626.php