Jeff Squyres wrote:
Hmm -- puzzling -- the error file you sent shows the following:

bash: /opt/openmpi/orted: No such file or directory

But that shouldn't happen; according to your config.log, you installed with a prefix of /opt/openmpi, so Open MPI should be looking for orted in /opt/openmpi/bin/orted.

You said that the command was

command = mpirun --hostfile hostfile -np 2 echo `uname -a`

Is there any chance that you ran with mpirun's absolute filename, such as:

/opt/openmpi/bin/mpirun --hostfile hostfile -np 2 echo `uname -a`

Or do you have any aliases involved? I can't imagine how you're getting that error message -- Open MPI should never use a full path name for orted unless you specified --prefix on the mpirun command line (which you didn't), or you used a full path name for mpirun (which it looks like you didn't, and even if you did use /opt/openmpi/bin/mpirun, it should use that path to look for /opt/openmpi/bin/orted on the other node). Otherwise, Open MPI relies on the PATH set in your shell startup files on remote nodes to find the orted.

This is very odd -- can you look at the exact command that is being executed on the remote node?
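
Since the lookup described above depends on the PATH that the launching shell sees, one way to check it is a small script that prints that PATH and where orted resolves on it. This is only a sketch: the function name report_orted is made up for illustration, and it assumes a POSIX shell with the `command -v` builtin.

```shell
#!/bin/sh
# report_orted (hypothetical helper): print the PATH this shell sees
# and where, if anywhere, "orted" resolves on it.
report_orted() {
    echo "PATH=$PATH"
    if orted_path=$(command -v orted); then
        echo "orted found at: $orted_path"
    else
        echo "orted not found on PATH"
    fi
}

report_orted
```

To see what a remote node resolves, the same two checks could be run through a non-interactive shell, for example `ssh node01 'echo $PATH; command -v orted'`, where node01 stands in for a host from the hostfile and the use of ssh as the launcher is an assumption. With bash as the remote shell, that kind of session typically reads ~/.bashrc but not ~/.bash_profile, so a PATH set only in .bash_profile may not be visible there.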


On Mar 27, 2009, at 12:24 PM, Russell McQueeney wrote:

command = mpirun --hostfile hostfile -np 2 echo `uname -a`
PATH = ...:/opt/openmpi/bin
LD_LIBRARY_PATH = /opt/openmpi/lib
no MCA parameters used

I set the default shell to bash and put some echo statements in
.bash_profile and .bashrc. When I run the mpirun command, I see those
echoes, but then it stops and the job never completes.

Ralph Castain wrote:
> Could you please send the info shown here:
>
> http://www.open-mpi.org/community/help/
>
> If the ess is failing, then we don't recognize the environment.
> Probably an issue with how it is configured vs being run.
>
> Thanks
> Ralph
>
> On Mar 26, 2009, at 3:42 PM, Russell McQueeney wrote:
>
>> I installed OpenMPI 1.3.1, and whenever I or mpirun try to start
>> orted on any of the machines, it shows that message, and
>> --> Returned value Not found (-13) instead of ORTE_SUCCESS
>> Is there anything obvious that I missed?
>> My machines are Intel x86-32, running Fedora (10 and 2)
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org <mailto:us...@open-mpi.org>
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

<config.log.bz2><ompi_info.bz2><orted_errors.bz2><ifconfig.bz2><ATT7963694.txt>


Oops. I just did `/opt/openmpi/orted 2>orted_errors ; bzip2 orted_errors` and didn't check it before I attached it. What ends up happening is that I have to ^C to kill mpirun on the head node, and all the other nodes are left with a zombie, nonresponsive 'orted' process, which I have to kill manually. Interestingly enough, no matter what environment variables I set, and no matter which machine, when I try to run `orted` or `/opt/openmpi/bin/orted`, I get the exact same error. I have attached the real orted errors file here. The reason that bash was whining was an incorrect syntax on the stderr redirect, `orted 2> orted_errors` instead of the correct version, `orted 2>orted_errors`.
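
A note on the redirect: in bash and other POSIX shells, whitespace between `2>` and the file name is optional, so both spellings should behave the same. The earlier "No such file or directory" message matches the path `/opt/openmpi/orted`, which lacks the `bin/` component of the install prefix. A small sketch to confirm the redirect behavior (the file names and the sample message are made up for illustration):

```shell
#!/bin/sh
# Write the same stderr message using both spellings of the redirect.
workdir=$(mktemp -d)
sh -c 'echo "sample error" >&2' 2>"$workdir/no_space.txt"
sh -c 'echo "sample error" >&2' 2> "$workdir/with_space.txt"

# cmp -s exits 0 when the two files are byte-for-byte identical.
if cmp -s "$workdir/no_space.txt" "$workdir/with_space.txt"; then
    echo "both redirects produced the same file"
fi
```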

Attachment: orted_errors.bz2
Description: application/bzip
