Folks,
There was a question about MTT on the mtt-users mailing list:
http://www.open-mpi.org/community/lists/mtt-users/2016/01/0840.php
After a few emails (some offline), it seems that it was a configuration issue.
The user is running PBSPro, and it seems Open MPI was not configured with
the tm module
(e.g. tm is not installed in a default location, and he did not
configure with --with-tm=/.../pbspro).
In this case, the tm module is not built, and when a job runs under
PBSPro without any hostfile,
it runs on one node only.
To make this easier to diagnose, what about always building the
tm module:
- if tm is found by configury, build the Open MPI tm modules as usual
- if tm is not found by configury, build a dumb module that will issue a
  warning or abort
  if a job is run under PBS/Torque
  (e.g. some PBS-specific environment variables are defined)
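
To make the detection step concrete, here is a rough sketch of what the
"dumb" module could check. It is written in Python only to show the logic;
a real module would be C inside Open MPI. The function names and the
"ok"/"warn"/"abort" return values are hypothetical, and using
PBS_ENVIRONMENT and PBS_JOBID as the detection signal is an assumption of
this sketch, not a settled design.

```python
import os

# Variables that PBS Pro / Torque export inside a job. Treating them as the
# detection signal is an assumption of this sketch.
PBS_JOB_VARS = ("PBS_ENVIRONMENT", "PBS_JOBID")


def running_under_pbs(env=None):
    """Return True if every PBS job variable is set and non-empty."""
    if env is None:
        env = os.environ
    return all(env.get(name) for name in PBS_JOB_VARS)


def stub_tm_check(env=None, abort=False):
    """Action of the hypothetical stub module: 'ok', 'warn', or 'abort'.

    'ok' means no PBS job was detected, so the stub stays silent.
    """
    if not running_under_pbs(env):
        return "ok"
    return "abort" if abort else "warn"
```

Passing the environment as a parameter keeps the check easy to exercise
outside of a real PBS job.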
Of course, the spec of this "dumb" module can be improved, for example:
- add an MCA parameter to disable the warning
- issue the warning only if there is more than one node in the job *and*
  neither a machinefile nor a host list was passed on the mpirun command line
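
The refined condition could look roughly like the sketch below (again
Python, for illustration only). All names are hypothetical, and
warning_enabled stands in for the proposed MCA parameter, which does not
exist today. The node count assumes the job's node file (the file
PBS_NODEFILE points to) may list a host several times, once per slot, so
hosts are de-duplicated.

```python
def count_job_nodes(nodefile_lines):
    """Count distinct hosts; a node file may repeat a host once per slot."""
    return len({line.strip() for line in nodefile_lines if line.strip()})


def should_warn(nodefile_lines, hostfile_given, host_list_given,
                warning_enabled=True):
    """Decide whether the hypothetical stub module should warn.

    Warn only if the warning is enabled, the user gave neither a
    machinefile nor a host list, and the job spans more than one node.
    """
    if not warning_enabled:
        return False
    if hostfile_given or host_list_given:
        return False
    return count_job_nodes(nodefile_lines) > 1
```

For example, a two-node job launched with a plain "mpirun ./a.out" would
trigger the warning, while a single-node job or one started with a
machinefile would not.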
Any thoughts ?
Cheers,
Gilles