Saliya,
There are several things here:
1) Which collective module is used?
2) If the tuned collective module is used, which algorithm is used?
3) Which btl is used?
First, the btl is independent of the collective module.
That means that if you do a collective operation, intra-node
communications will (likely) use the sm or vader btl, which are optimized
for shared memory, and openib/tcp/whatever for inter-node communications.
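As a rough illustration of that independence, the btl list can be set on its own, without touching any collective setting. The sketch below uses the OMPI_MCA_ environment variable prefix. The component list (self,vader,tcp) and the application name ./my_app are assumptions for illustration, and btl names differ between Open MPI releases, so check what your build provides first.

```shell
# Assumed btl list: self (loopback), vader (shared memory), tcp (inter-node).
# Adjust to the components your Open MPI build actually ships.
export OMPI_MCA_btl=self,vader,tcp

# Hypothetical launch line; replace ./my_app with your own binary.
# mpirun -np 24 ./my_app
```

Nothing here names a coll component, so the choice of collective module should be left to its usual selection logic.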
There is a collective module called coll_sm, and if I understand
correctly, it works only on single-node communicators and avoids using
any btl if possible.
Collective modules have different priorities, and they do not necessarily
implement all collective operations.
For example, the inter module does not implement barrier on an intra
communicator. Conversely, the tuned module does not implement barrier on
an inter communicator.
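To see which priorities apply on a given installation, one option is to ask ompi_info for the coll parameters and filter for the priority entries. This is only a sketch: the helper name show_coll_priorities is made up here, and --level 9 assumes a release that has parameter levels (1.7 and later, as far as I know).

```shell
# Hypothetical helper: list the priority parameters of the coll components.
# It only reads information; it does not change any setting.
show_coll_priorities() {
    if command -v ompi_info >/dev/null 2>&1; then
        # --level 9 asks ompi_info to show all parameters, not only the basic ones.
        ompi_info --param coll all --level 9 | grep -i priority
    else
        echo "ompi_info not found in PATH" >&2
        return 1
    fi
}
```

Call show_coll_priorities after sourcing the snippet. As I understand it, for each operation the module with the highest priority among those that implement it is the one that gets used.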
In most cases (e.g. default config + intra communicator) the tuned
collective module is used.
Each operation has several implementations, and one is chosen based on
communicator size and message size. This can be overridden by environment
variables and the config file, as previously described by George.
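A minimal sketch of the two override mechanisms, reusing the parameter names and values from the original question (the value 3 is just the one quoted there, and ./my_app is a placeholder). Environment variables take the OMPI_MCA_ prefix, and the per-user config file is normally $HOME/.openmpi/mca-params.conf.

```shell
# Option 1: environment variables, read by Open MPI at launch.
export OMPI_MCA_coll_tuned_use_dynamic_rules=1
export OMPI_MCA_coll_tuned_allgatherv_algorithm=3

# Option 2 (alternative): the same settings in the per-user config file.
# mkdir -p "$HOME/.openmpi"
# printf '%s\n' 'coll_tuned_use_dynamic_rules = 1' \
#               'coll_tuned_allgatherv_algorithm = 3' \
#               >> "$HOME/.openmpi/mca-params.conf"

# Hypothetical launch line; replace ./my_app with your own binary.
# mpirun -np 24 ./my_app
```

Running ompi_info --param coll tuned --level 9 should list the values that coll_tuned_allgatherv_algorithm accepts on your build.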
Last but not least, some collective modules (hierarch, ml, and possibly
others) implement hierarchical collectives, which means they should be
optimized for multi-node runs with multiple tasks per node.
That being said, ml is not production ready, and I am not sure whether
hierarch is actively maintained.
I hope this helps.
Gilles
On 7/9/2015 5:37 AM, Saliya Ekanayake wrote:
Hi,
I see the same collective operation (say allgatherv) implemented in
different ways under the tuned, sm, and inter packages. I read in the
documentation [1] that these get picked up depending on the transport.
Say I run 12 procs per node on 2 nodes, totaling 24 procs. If I call
the allGatherv collective, will it pick the shared memory version to
communicate between procs in the same node and use another for
inter-node communication? If so, how can I know/control this?
Also, if I force the algorithm as,
coll_tuned_use_dynamic_rules = 1
coll_tuned_allgatherv_algorithm = 3
will it not get the advantage of shared memory?
[1] https://www.open-mpi.org/faq/?category=sm
Thank you,
Saliya
--
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/07/27265.php