Hi Khalid,
I checked the source code, and it turns out the rules must be ordered:
- first by communicator size
- second by message size
Attached is an updated version of ompi_tuned_file.conf that you
should use.
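The ordering requirement can be checked before a run. The following is a
minimal Python sketch and is not part of Open MPI. The function name
check_rule_order and the in-memory layout (a list of
(comm_size, [message sizes]) pairs in file order) are assumptions made
for illustration, as is the choice to require strictly ascending values.

```python
def check_rule_order(rules):
    """Return a list of ordering problems; an empty list means the order is fine.

    `rules` is a list of (comm_size, [msg_size, ...]) pairs in file order.
    Assumption: both levels must be strictly ascending.
    """
    problems = []
    prev_comm = None
    for comm_size, msg_sizes in rules:
        # First level: communicator sizes must ascend across the file.
        if prev_comm is not None and comm_size <= prev_comm:
            problems.append(f"comm size {comm_size} follows {prev_comm}")
        prev_comm = comm_size
        # Second level: message sizes must ascend within one communicator size.
        for earlier, later in zip(msg_sizes, msg_sizes[1:]):
            if later <= earlier:
                problems.append(
                    f"comm size {comm_size}: message size {later} follows {earlier}"
                )
    return problems


# Comm size 16 listed before comm size 5: one problem is reported.
print(check_rule_order([(16, [0, 1024]), (5, [0, 1024])]))
# Ascending layout: no problems.
print(check_rule_order([(5, [0, 1024]), (16, [0, 1024])]))
```

An empty result only says the file is sorted; it does not say that the
chosen algorithms are sensible for a given run.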
Cheers,
Gilles
On 5/20/2015 8:39 AM, Khalid Hasanov wrote:
Hello,
I am trying to use the coll_tuned_dynamic_rules_filename option.
I am not sure whether I am doing everything right, but my impression
is that the config file feature does not work as expected.
For example, suppose I specify the config file as in the attached
ompi_tuned_file.conf and run the attached simple broadcast example
as follows:
mpirun -n 16 --mca coll_tuned_use_dynamic_rules 1 --mca
coll_tuned_dynamic_rules_filename ompi_tuned_file.conf -mca
coll_base_verbose 1 bcast_example
I would expect the config file to be ignored at run time, since it
does not contain any configuration for communicator size 16.
However, the run uses the configuration for the last communicator
in the file, whose size is 5. I have attached the tuned_output file
for more information.
A similar problem occurs even if the configuration file contains a
config for communicator size 16. For example, I put communicator
size 16 first in the configuration file, followed by communicator
size 5, but the run still used the configuration for size 5.
Interestingly, if the second communicator size in the config file
is greater than the first, it seems to work correctly. At least,
that was the case when I tested with a first communicator size of
16 and a second of 55.
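One lookup strategy that would produce all three observations above is
sketched below in Python. It is a guess at the behavior, not code taken
from Open MPI: it assumes the lookup walks the communicator rules in
file order, keeps the last rule whose size does not exceed the actual
communicator size, and stops at the first rule that is larger. The
function name pick_comm_rule is hypothetical.

```python
def pick_comm_rule(rule_comm_sizes, actual_size):
    """Model of a lookup that silently assumes ascending comm sizes.

    Walks the rules in file order, remembers the last one that is not
    larger than `actual_size`, and stops at the first larger one.
    Returns the chosen rule's comm size, or None if no rule fits.
    """
    best = None
    for size in rule_comm_sizes:
        if size > actual_size:
            break          # assumes every later rule is larger as well
        best = size        # later fitting entries overwrite earlier ones
    return best


# Run with 16 processes against the three layouts described above.
print(pick_comm_rule([5], 16))       # only size 5 in the file -> 5
print(pick_comm_rule([16, 5], 16))   # 16 then 5 -> 5 (the 16 rule is overwritten)
print(pick_comm_rule([16, 55], 16))  # 16 then 55 -> 16
```

Under this model, an exact match is only honored when no smaller
communicator size appears after it in the file, which matches the
16-then-5 case.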
I used a development version of Open MPI (1.9.0a1), which I forked
into my own GitHub repository (https://github.com/khalid-hasanov/ompi).
I have attached the ompi_info outputs as well.
I added some printfs to the coll_tuned_decision_dynamic.c file to
double-check this:
if (alg) {
    /* debug print; "Men burdayam" means "I am here" */
    printf("Men burdayam: alg=%d\n", alg);
    /* we have found a valid choice from the file based rules for this
       message size */
    return ompi_coll_tuned_bcast_intra_do_this (buff, count, datatype,
                                                root,
                                                comm, module,
                                                alg, faninout, segsize);
} /* found a method */
Best regards,
Khalid
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/05/26882.php
1 # num of collectives
7 # ID = 7 Bcast collective (ID in coll_base_functions.h)
2 # number of comm sizes 2
5 # comm size 5
7 # number of msg sizes 7
0 1 0 0 # for message size 0, linear 1, topo 0, segmentation 0
1024 5 0 0 # for message size 1024, binary tree 5, topo 0, 0 segmentation
8192 6 0 0 # message size 8k, binomial tree 6, topo 0, 0 segmentation
16384 5 0 0 # message size 16k, binary tree 5, topo 0, 0 segmentation
32768 6 0 0 # 32k, binomial tree 6, no topo or segmentation
262144 3 0 0 # 256k, pipeline 3, no topo or segmentation
524288 4 0 0 # message size 512k+, split-binary 4, topo 0, 0 segmentation
8 # comm size 8
7 # number of msg sizes 7
0 1 0 0 # for message size 0, linear 1, topo 0, segmentation 0
1024 6 0 0 # for message size 1024, binomial tree 6, topo 0, 0 segmentation
8192 6 0 0 # message size 8k, binomial tree 6, topo 0, 0 segmentation
16384 5 0 0 # message size 16k, binary tree 5, topo 0, 0 segmentation
32768 2 0 0 # 32k, chain 2, no topo or segmentation
262144 3 0 0 # 256k, pipeline 3, no topo or segmentation
524288 4 0 0 # message size 512k+, split-binary 4, topo 0, 0 segmentation
# end of first collective
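
The layout of the file above can be read back with a short script, which
is handy for inspecting what a rules file actually contains. This is a
minimal Python sketch, not Open MPI's own parser. It assumes the layout
suggested by the comments in the file: a count of collectives, then for
each collective its ID and a count of communicator sizes, then for each
communicator size a count of message-size lines with four integers each.
It also assumes that everything after a '#' is a comment. The function
name parse_tuned_rules is hypothetical.

```python
def parse_tuned_rules(text):
    """Parse a rules file into {collective_id: [(comm_size, [rule, ...]), ...]}.

    Each rule is the 4-tuple of integers found on a message-size line.
    Truncated input raises StopIteration.
    """
    # Drop '#' comments and flatten the remainder into integer tokens.
    tokens = []
    for line in text.splitlines():
        tokens.extend(int(t) for t in line.split("#", 1)[0].split())
    it = iter(tokens)

    collectives = {}
    for _ in range(next(it)):              # number of collectives
        coll_id = next(it)
        comm_rules = []
        for _ in range(next(it)):          # number of comm sizes
            comm_size = next(it)
            msg_rules = []
            for _ in range(next(it)):      # number of message sizes
                msg_rules.append((next(it), next(it), next(it), next(it)))
            comm_rules.append((comm_size, msg_rules))
        collectives[coll_id] = comm_rules
    return collectives


sample = """\
1           # num of collectives
7           # collective ID
1           # number of comm sizes
5           # comm size
2           # number of msg sizes
0 1 0 0     # message size 0
1024 6 0 0  # message size 1024
"""
print(parse_tuned_rules(sample))
```

Because the parsed result keeps file order, the communicator sizes and
message sizes can be compared directly to confirm that they ascend.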