Hi,

coll/hcoll is the Mellanox-driven collectives package. coll/ml is managed, supported, and developed by the ORNL folks.
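
For anyone hitting the hangs in the meantime, here is a minimal sketch of how coll/ml could be excluded without adding the flag to every command line. The environment-variable form is meant to mirror the `-mca coll ^ml` option from the original report. The per-user parameter file and the configure option shown in the comments are the usual Open MPI mechanisms, but they are assumptions here and have not been checked against this particular build.

```shell
# Sketch only: exclude the "ml" collective component for every run started
# from this shell. Assumed equivalent to passing "-mca coll ^ml" on the
# oshrun/mpirun command line.
export OMPI_MCA_coll='^ml'

# Assumed per-user alternative: put this line in $HOME/.openmpi/mca-params.conf
#   coll = ^ml

# Assumed build-time alternative: skip building the component altogether
#   ./configure --enable-mca-no-build=coll-ml
```

The quotes around `^ml` are there only to keep shells that treat `^` specially from interpreting it.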
On Tue, Mar 4, 2014 at 1:06 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Ummm...the "ml" stands for Mellanox. This is a component you folks
> contributed at some time. IIRC, the hcoll and/or bcol are meant to replace
> it, but you folks would know best what to do with it.
>
> On Tue, Mar 4, 2014 at 12:12 AM, Elena Elkina <elena.elk...@itseez.com> wrote:
>
>> Hi,
>>
>> Recently I have often run into hangs and seg faults with different
>> command lines, and there are "ml" functions in the stack trace.
>> When I turn "ml" off with -mca coll ^ml, the problems disappear.
>> For example,
>> oshrun -np 4 --map-by node --display-map ./ring_oshmem
>> fails with a seg fault, while
>> oshrun -np 4 --map-by node --display-map -mca coll ^ml ./ring_oshmem
>> passes.
>>
>> The "ml" priority is low (27), but it could have issues during comm_query
>> (it does all of its initialization stuff there).
>>
>> "ml" is an unreliable component, so it may be reasonable not to build
>> it by default to avoid such problems.
>>
>> What do you think?
>>
>> Best regards,
>> Elena
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Searchable archives:
>> http://www.open-mpi.org/community/lists/devel/2014/03/date.php