Saliya,

There are several things at play here:
1) which collective module is used?
2) if the tuned collective module is used, then which algorithm is used?
3) which btl is used?

First, the btl is independent of the collective module.
That means that when you perform a collective operation, intra-node communications will (most likely) use the sm or vader btl, which is optimized for shared memory, while inter-node communications will use openib, tcp, or whatever other btl is available.
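For example (just a sketch: ./my_app is a placeholder, and the btl components actually available depend on how your Open MPI was built), you can list the btl components and select them explicitly at run time:

  ompi_info | grep "MCA btl"                         # list the btl components in this build
  mpirun --mca btl self,vader,tcp -np 24 ./my_app    # vader for intra-node, tcp for inter-node traffic
  mpirun --mca btl_base_verbose 30 -np 24 ./my_app   # print which btl is selected for each peer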

There is also a collective module called coll_sm; if I understand correctly, it works only on single-node communicators and avoids using any btl when possible.
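If you want to look at or experiment with coll_sm (again a sketch; parameter names and defaults can differ between Open MPI versions), you can dump its parameters and raise its priority for a single-node run:

  ompi_info --param coll sm --level 9                # show coll_sm parameters, e.g. coll_sm_priority
  mpirun --mca coll_sm_priority 100 -np 12 ./my_app  # let coll_sm win the selection on node-local communicators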

Collective modules have different priorities, and they do not necessarily implement all collective operations. For example, the inter module does not implement barrier on an intra-communicator; conversely, the tuned module does not implement barrier on an inter-communicator.
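You can check which collective components are present in your build and what priority each one carries with ompi_info (the exact output format varies between versions):

  ompi_info | grep "MCA coll"                              # list the collective components (tuned, sm, inter, ...)
  ompi_info --param coll tuned --level 9 | grep priority   # e.g. coll_tuned_priority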

In most cases (e.g. the default configuration with an intra-communicator), the tuned collective module is used. Each operation has several implementations, and one is chosen based on communicator size and message size. This can be overridden via environment variables and a config file, as previously described by George.
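Concretely, the two parameters you quoted below can be set as environment variables (with the OMPI_MCA_ prefix), on the mpirun command line, or in an MCA parameter file such as $HOME/.openmpi/mca-params.conf (./my_app is a placeholder):

  export OMPI_MCA_coll_tuned_use_dynamic_rules=1
  export OMPI_MCA_coll_tuned_allgatherv_algorithm=3
  mpirun -np 24 ./my_app

  # or equivalently on the command line
  mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_allgatherv_algorithm 3 -np 24 ./my_app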

Last but not least, some collective modules (hierarch, ml, ?) implement hierarchical collectives, which means they should be optimized for multi-node runs with multiple tasks per node. That being said, ml is not production ready, and I am not sure whether hierarch is actively maintained.
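If you want to control which collective components are even considered (a sketch; the component names depend on your build, and the ^ prefix means "exclude"), you can pass an explicit component list:

  mpirun --mca coll ^ml -np 24 ./my_app                          # use the defaults, but exclude ml
  mpirun --mca coll tuned,sm,libnbc,basic,self -np 24 ./my_app   # restrict the selection to this list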

I hope this helps.

Gilles

On 7/9/2015 5:37 AM, Saliya Ekanayake wrote:
Hi,

I see the same collective operation (say, allgatherv) implemented in different ways under the tuned, sm, and inter packages. I read in the documentation [1] that these get picked depending on the transport.

Say I run 12 procs per node on 2 nodes, totaling 24 procs. If I call the allgatherv collective, will it pick the shared-memory version to communicate between procs on the same node and use another for inter-node communication? If so, how can I know/control this?

Also, if I force the algorithm as,

coll_tuned_use_dynamic_rules = 1
coll_tuned_allgatherv_algorithm = 3

will it lose the advantage of shared memory?

[1] https://www.open-mpi.org/faq/?category=sm

Thank you,
Saliya

--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org

