Folks,

there was a question about mtt on the mtt mailing list http://www.open-mpi.org/community/lists/mtt-users/2016/01/0840.php

after a few emails (some offline), it seems that was a configuration issue.
the user is running PBSPro, and it seems OpenMPI was not configured with the tm module (i.e. tm is not installed in a default location, and he did not configure with --with-tm=/.../pbspro)

in this case, the tm module is not built, and when a job runs under PBSPro without any hostfile,
the job runs on one node only.
in order to make this easier to diagnose, what about always building the tm module:
- if tm is found by configury, build the OpenMPI tm modules as usual
- if tm is not found by configury, build a dumb module that will issue a warning or abort
  if a job is run under PBS/torque
  (detected, e.g., because some PBS-specific environment variables are defined)
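
to make the idea concrete, here is a rough sketch of the check such a "dumb" module could do. this is not existing OpenMPI code: the function names and return codes are made up for illustration, and I am assuming that testing PBS_JOBID and PBS_ENVIRONMENT (two variables PBS sets for a job) is good enough to detect a PBS/torque job.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Return true if the process appears to run inside a PBS/torque job.
 * Assumption: a non-empty PBS_JOBID or PBS_ENVIRONMENT is a sufficient hint. */
bool running_under_pbs(void)
{
    const char *jobid = getenv("PBS_JOBID");
    const char *env = getenv("PBS_ENVIRONMENT");

    return (jobid != NULL && jobid[0] != '\0') ||
           (env != NULL && env[0] != '\0');
}

/* Hypothetical entry point of the stub module.
 * Returns 0 if no PBS job is detected, 1 after printing a warning,
 * and -1 if the caller asked to abort instead of warn. */
int tm_stub_check(bool abort_on_detect)
{
    if (!running_under_pbs()) {
        return 0;
    }
    fprintf(stderr,
            "%s: a PBS/torque job was detected, but tm support was not "
            "built. Without a hostfile the job may run on one node only.\n",
            abort_on_detect ? "error" : "warning");
    return abort_on_detect ? -1 : 1;
}
```

whether the stub warns or aborts is left to the caller here, since that choice is still open.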

of course, the spec of this "dumb" module can be improved, for example
- add an MCA parameter to disable the warning
- issue the warning only if there is more than one node in the job *and* neither a machinefile nor a host list was passed on the mpirun command line
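
the two refinements above could look roughly like the sketch below. again, this is only an illustration and not OpenMPI code: the helper names are hypothetical, I am assuming the node count can be taken from the file PBS_NODEFILE points to (one host name per line, possibly repeated), and the fixed-size buffers are a simplification (a host name longer than the buffer would be miscounted).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_HOSTS 1024
#define MAX_NAME 256

/* Count the distinct host names listed in a PBS node file.
 * Returns -1 if the file cannot be opened. */
int count_distinct_hosts(const char *path)
{
    static char seen[MAX_HOSTS][MAX_NAME];
    char line[MAX_NAME];
    int count = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        return -1;
    }
    while (fgets(line, sizeof line, fp) != NULL) {
        bool duplicate = false;

        line[strcspn(line, "\r\n")] = '\0';   /* strip the end of line */
        if (line[0] == '\0') {
            continue;                         /* skip empty lines */
        }
        for (int i = 0; i < count; i++) {
            if (strcmp(seen[i], line) == 0) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate && count < MAX_HOSTS) {
            strcpy(seen[count++], line);
        }
    }
    fclose(fp);
    return count;
}

/* Warn only for a multi-node job started without a machinefile or host
 * list, and only if the (hypothetical) MCA parameter did not disable it. */
bool should_warn(int num_nodes, bool has_machinefile, bool has_host_list,
                 bool warning_disabled)
{
    return !warning_disabled && num_nodes > 1 &&
           !has_machinefile && !has_host_list;
}
```

with this, a single-node job, or a job where the user gave a machinefile, stays silent.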

Any thoughts ?

Cheers,

Gilles
