Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Jeff Squyres (jsquyres)
I actually wouldn't advise ml. It *was* being developed as a joint project between ORNL and Mellanox. I think that code eventually grew into what the "hcoll" Mellanox library currently is. As such, ml reflects kind of a middle point before hcoll became hardened into a real product. It has so

Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
OK, that's good. I'll try that. So, is *ml* something not being developed now? Any documentation on this component? Thank you, Saliya On Thu, Jun 30, 2016 at 11:01 AM, Gilles Gouaillardet < gilles.gouaillar...@gmail.com> wrote: > you might want to give coll/ml a try > mpirun --mca coll_ml_prior

Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Gilles Gouaillardet
you might want to give coll/ml a try: mpirun --mca coll_ml_priority 100 ... Cheers, Gilles On Thursday, June 30, 2016, Saliya Ekanayake wrote: > Thank you, Gilles. The reason for digging into intra-node optimizations is > that we've implemented several machine learning applications in OpenMPI >

Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
Thank you, Gilles. The reason for digging into intra-node optimizations is that we've implemented several machine learning applications in OpenMPI (Java binding), but found collective communication to be a bottleneck, especially when the number of procs per node is high. I've implemented a shared m

Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Gilles Gouaillardet
currently, coll/tuned is not topology aware. this is something interesting, and everyone is invited to contribute. coll/ml is topology aware, but it is kind of unmaintained now. send/recv involves two abstraction layers: pml, and then the interconnect transport. typically, pml/ob1 is used, and it us
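The layering Gilles describes (a collective built purely on the pml's point-to-point send/recv, which in turn sits on the transport) can be sketched with a toy binomial-tree broadcast. This is a hypothetical in-process model for illustration only, not coll/tuned's actual algorithms or Open MPI code; the class and function names are invented.

```python
# Illustrative layering: a bcast built purely on point-to-point send/recv,
# the way coll/tuned sits on top of the pml (e.g. ob1) rather than touching
# the transport directly. Simulated in-process "ranks", not real MPI.

class Pml:
    """Stand-in for the point-to-point messaging layer (pml/ob1 in Open MPI)."""
    def __init__(self):
        self.mailboxes = {}

    def send(self, dst, msg):
        self.mailboxes.setdefault(dst, []).append(msg)

    def recv(self, dst):
        return self.mailboxes[dst].pop(0)

def bcast_binomial(pml, nprocs, root_value):
    """Binomial-tree broadcast from rank 0, using only the pml's send/recv."""
    values = {0: root_value}          # ranks that currently hold the payload
    step = 1
    while step < nprocs:
        # every rank that has the value forwards it 'step' ranks ahead
        for src in list(values):
            dst = src + step
            if dst < nprocs:
                pml.send(dst, values[src])
        for src in list(values):
            dst = src + step
            if dst < nprocs:
                values[dst] = pml.recv(dst)
        step *= 2
    return values

result = bcast_binomial(Pml(), 8, "payload")
print(all(v == "payload" for v in result.values()))  # True: all 8 ranks got it
```

The point of the sketch is the separation of concerns: the broadcast logic never touches the transport, only the pml's send/recv, which is why a topology-unaware algorithm at this layer can generate avoidable inter-node traffic.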

Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
OK, I am beginning to see how it works now. One question I still have is, in the case of a multi-node communicator it seems coll/tuned (or something not coll/sm) will be the one used, so do they do any optimizations to reduce communication within a node? Also where can I find the p2p send recv modu

Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Gilles Gouaillardet
the Bcast in coll/sm. coll modules have a priority (see ompi_info --all). for a given function (e.g. bcast), the module which implements it and has the highest priority is used. note a module can disqualify itself on a given communicator (e.g. coll/sm on an inter-node communicator). by default, coll/tune
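The selection rule described here can be modeled with a short sketch: each coll module advertises a priority, may disqualify itself on a given communicator, and the highest-priority surviving module provides the function. This is a hypothetical Python model, not Open MPI's MCA code, and the priority numbers are invented for illustration.

```python
# Hypothetical model of MCA-style coll module selection; not Open MPI code.

class CollModule:
    def __init__(self, name, priority, intra_node_only=False):
        self.name = name
        self.priority = priority
        self.intra_node_only = intra_node_only

    def qualifies(self, comm_spans_multiple_nodes):
        # e.g. coll/sm disqualifies itself on a communicator spanning nodes
        return not (self.intra_node_only and comm_spans_multiple_nodes)

def select_module(modules, comm_spans_multiple_nodes):
    eligible = [m for m in modules if m.qualifies(comm_spans_multiple_nodes)]
    return max(eligible, key=lambda m: m.priority)

modules = [
    CollModule("basic", priority=10),                       # invented priorities
    CollModule("tuned", priority=30),
    CollModule("sm", priority=40, intra_node_only=True),
]

print(select_module(modules, comm_spans_multiple_nodes=False).name)  # sm
print(select_module(modules, comm_spans_multiple_nodes=True).name)   # tuned
```

Raising a module's MCA priority parameter (as with `--mca coll_ml_priority 100` earlier in the thread) corresponds to changing the numbers this selection compares.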

Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
Thank you, Gilles. What is the bcast I should look for? In general, how do I know which module was used for which communication - can I print this info? On Jun 30, 2016 3:19 AM, "Gilles Gouaillardet" wrote: > 1) is correct. coll/sm is disqualified if the communicator is an inter > communicato

Re: [OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Gilles Gouaillardet
1) is correct. coll/sm is disqualified if the communicator is an inter communicator or the communicator spans several nodes. you can have a look at the source code, and you will note that bcast does not use send/recv. instead, it uses shared memory, so hopefully, it is faster than other mo
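The difference Gilles points to can be illustrated with a toy model: a send/recv-based bcast moves N-1 point-to-point messages, while a coll/sm-style bcast has the root write the payload once into a shared segment and every other local rank copy it out after a ready flag is raised. This is purely illustrative Python; real coll/sm uses mmap'd segments and memory barriers, not dictionaries.

```python
# Toy contrast between a send/recv bcast and a coll/sm-style shared-memory
# bcast. Purely illustrative; not Open MPI code.

def bcast_send_recv(nprocs, payload):
    """Flat-tree bcast: the root sends one message per non-root rank."""
    messages = 0
    values = {0: payload}
    for dst in range(1, nprocs):
        values[dst] = payload      # models a point-to-point send + recv
        messages += 1
    return values, messages

def bcast_shared_memory(nprocs, payload):
    """Shared-memory bcast: one write by the root, N-1 local reads."""
    segment = {"flag": False, "data": None}   # stand-in for an mmap'd segment
    segment["data"] = payload                 # root writes the payload once
    segment["flag"] = True                    # then raises the ready flag
    values = {0: payload}
    for dst in range(1, nprocs):
        assert segment["flag"]                # readers wait on the flag
        values[dst] = segment["data"]         # then copy out; no messages sent
    return values, 0

_, msgs = bcast_send_recv(8, "x")
print(msgs)  # 7 point-to-point messages
_, msgs = bcast_shared_memory(8, "x")
print(msgs)  # 0 messages: one shared write, seven local reads
```

The message-count difference is why an intra-node-only module can beat a generic send/recv-based one when many ranks share a node.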

[OMPI users] The ompi/mca/coll/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
Hi, Looking at the *ompi/mca/coll/sm/coll_sm_module.c* it seems this module will be used only if the calling communicator solely groups processes within a node. I've got two questions here. 1. So is my understanding correct that for something like MPI_COMM_WORLD where world is multiple processes
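A common way to obtain communicators that group only the processes within a node (the kind coll/sm can serve) is to split MPI_COMM_WORLD by shared-memory domain with MPI_Comm_split_type and MPI_COMM_TYPE_SHARED. The snippet below is a toy Python model of that split, not real MPI; the node layout is hypothetical.

```python
# Toy model of MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, ...):
# ranks on the same node land in the same sub-communicator. Illustrative only.

def split_by_node(rank_to_node):
    """Group world ranks into per-node sub-communicators."""
    comms = {}
    for rank in sorted(rank_to_node):
        comms.setdefault(rank_to_node[rank], []).append(rank)
    return comms

# 8 world ranks spread across 2 nodes, 4 ranks per node (hypothetical layout)
rank_to_node = {r: r // 4 for r in range(8)}
print(split_by_node(rank_to_node))  # {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
```

Each resulting group spans a single node, so a module restricted to intra-node communicators is eligible on it even when MPI_COMM_WORLD itself spans several nodes.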