You still have to set PATH and LD_LIBRARY_PATH on your remote nodes to 
include the location where you installed OMPI.

Alternatively, use the absolute path to mpirun in your command - we'll pick 
up the path and propagate it.
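
For example - a minimal sketch, assuming OMPI was installed under 
/opt/openmpi (substitute your actual install prefix) - add this to the shell 
startup file (e.g. ~/.bashrc) on each remote node:

  # adjust /opt/openmpi to wherever your OMPI install actually lives
  export PATH=/opt/openmpi/bin:$PATH
  export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH

Or invoke mpirun by its absolute path, which is what makes us pick up and 
propagate the prefix:

  /opt/openmpi/bin/mpirun -np 10 -hostfile /home/pmdtest/hostlist ./real.exe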


On May 3, 2011, at 9:14 PM, Ahsan Ali wrote:

> Dear Bart,
> 
> I think OpenMPI doesn't need to be installed on all machines, because they 
> share the master node's installation over NFS. I don't know how to check 
> the output of "which orted"; it is running only on the master node. I have 
> another application that runs the same way, but I am having this problem 
> only with WRF.
> 
> On Tue, May 3, 2011 at 9:06 PM, Bart Brashers <bbrash...@environcorp.com> 
> wrote:
> It looks like OpenMPI is not installed on all your execution machines. You 
> need to install at least the libraries on all machines, or in an NFS-shared 
> location. Check the output of "which orted" on the machine that works.
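> 
> As a quick check - a sketch only, the node name is an example, assuming you 
> can ssh to your compute nodes:
> 
>   # on the master node (where it works):
>   which orted
>   # on a compute node, through a non-interactive shell, the way mpirun
>   # launches its daemons:
>   ssh node02 which orted
> 
> If the second command says "command not found", the compute node's 
> non-interactive PATH does not include OMPI's bin directory.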
> 
> Bart
> 
> From: wrf-users-boun...@ucar.edu [mailto:wrf-users-boun...@ucar.edu] On 
> Behalf Of Ahsan Ali
> Sent: Tuesday, May 03, 2011 1:04 AM
> To: us...@open-mpi.org
> Subject: [Wrf-users] WRF Problem running in Parallel on multiple 
> nodes(cluster)
> 
> Hello,
> 
> I am able to run WRFV3.2.1 with mpirun on multiple cores of a single 
> machine, but when I try to run it across multiple nodes of the cluster 
> using a hostfile, I get an error. The compute nodes mount the master node 
> over NFS at boot. The error follows. Please help.
> 
> [root@pmd02 em_real]# mpirun -np 10 -hostfile /home/pmdtest/hostlist ./real.exe
> bash: orted: command not found
> bash: orted: command not found
> --------------------------------------------------------------------------
> A daemon (pid 22006) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
> 
> There may be more information reported by the environment (see above).
> 
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> mpirun: clean termination accomplished
> 
> -- 
> Syed Ahsan Ali Bokhari 
> Electronic Engineer (EE)
> 
> Research & Development Division
> Pakistan Meteorological Department H-8/4, Islamabad.
> Phone # off  +92518358714
> Cell # +923155145014
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
