Re: [OMPI users] orted: command not found

2007-01-02 Thread jcolmenares
I had configured the hostfile located at
~prefix/etc/openmpi-default-hostfile.

I copied the file to bernie-3, and it worked...
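
That copy step can be sketched as a small helper. This is only an illustration: the function name push_binary and the COPY_CMD override are made up here, and it assumes the same directory already exists on every remote node.

```shell
# Copy a binary to the same absolute path on every remote node.
# COPY_CMD is a hypothetical override (default: scp) so the loop can
# be dry-run with "echo" before touching any machine.
COPY_CMD=${COPY_CMD:-scp}

push_binary() {
    # $1: local file; remaining arguments: remote hosts
    file=$1
    shift
    # Resolve the directory of the file to an absolute path.
    dir=$(cd "$(dirname "$file")" && pwd)
    for host in "$@"; do
        # Left unquoted on purpose so COPY_CMD may carry options.
        $COPY_CMD "$file" "$host:$dir/"
    done
}
```

For example, "push_binary ~/proyecto/prueba.bin bernie-3" would place the binary where mpirun looks for it on bernie-3; setting COPY_CMD=echo first only prints what would be copied.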

Now, at the cluster I used at the Universidad de Los Andes (Venezuela), all I
had to do was compile and run my applications; I never copied any file to any
other machine. This time, on the three machines I put together to install MPI
as a personal project, I had to. I'm sorry if it was obvious and made you guys
lose some time, but why didn't I have to copy any files on the cluster, when
now I must?

Thanks for your patience!

Jose



Re: [OMPI users] orted: command not found

2007-01-02 Thread jcolmenares
It is executable:

bernie@bernie-1:~/proyecto$ ls -l prueba.bin
-rwxr-xr-x 1 bernie bernie 9619 2007-01-02 12:18 prueba.bin




Re: [OMPI users] orted: command not found

2007-01-02 Thread jcolmenares
> First you should make sure that PATH and LD_LIBRARY_PATH are defined
> in the section of your .bashrc file that gets parsed for
> non-interactive sessions. Run "mpirun -np 1 printenv" and check if
> PATH and LD_LIBRARY_PATH have the values you expect.

in fact they do:

bernie@bernie-1:~/proyecto$ mpirun -np 1 printenv
SHELL=/bin/bash
SSH_CLIENT=192.168.1.142 4109 22
USER=bernie
LD_LIBRARY_PATH=/usr/local/openmpi/lib:/usr/local/openmpi/lib:
MAIL=/var/mail/bernie
PATH=/usr/local/openmpi/bin:/usr/local/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/usr/games
PWD=/home/bernie
LANG=en_US.UTF-8
HISTCONTROL=ignoredups
SHLVL=1
HOME=/home/bernie
MPI_DIR=/usr/local/openmpi
LOGNAME=bernie
SSH_CONNECTION=192.168.1.142 4109 192.168.1.113 22
LESSOPEN=| /usr/bin/lesspipe %s
LESSCLOSE=/usr/bin/lesspipe %s %s
_=/usr/local/openmpi/bin/orted
OMPI_MCA_universe=bernie@bernie-1:default-universe
OMPI_MCA_ns_nds=env
OMPI_MCA_ns_nds_vpid_start=0
OMPI_MCA_ns_nds_num_procs=1
OMPI_MCA_mpi_paffinity_processor=0
OMPI_MCA_ns_replica_uri=0.0.0;tcp://192.168.1.142:4775
OMPI_MCA_gpr_replica_uri=0.0.0;tcp://192.168.1.142:4775
OMPI_MCA_orte_base_nodename=192.168.1.113
OMPI_MCA_ns_nds_cellid=0
OMPI_MCA_ns_nds_jobid=1
OMPI_MCA_ns_nds_vpid=0


> For your second question you should give the path to your prueba.bin
> executable. I'd do something like
> "mpirun --prefix /usr/local/openmpi -np 2 ./prueba.bin".
> The reason is that usually "." is not in the PATH.
>

bernie@bernie-1:~/proyecto$ mpirun --prefix /usr/local/openmpi -np 2 ./prueba.bin
--
Failed to find or execute the following executable:

Host:   bernie-3
Executable: ./prueba.bin

Cannot continue.
--

and the file IS there:

bernie@bernie-1:~/proyecto$ ls prueba*
prueba.bin  prueba.f90  prueba.f90~


I must be missing something pretty silly, but I have been looking around for
days to no avail!

Jose

thanks




[OMPI users] orted: command not found

2007-01-02 Thread jcolmenares
I installed Open MPI 1.1.2 on two 686 boxes running Ubuntu 6.10.
I followed the instructions given in the FAQ. Nevertheless, I get the
following message:

[bernie-1:05053] ERROR: A daemon on node 192.168.1.113 failed to start as expected.
[bernie-1:05053] ERROR: There may be more information available from
[bernie-1:05053] ERROR: the remote shell (see above).
[bernie-1:05053] ERROR: The daemon exited unexpectedly with status 127.
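
For reference, status 127 is what a POSIX shell returns when it cannot find the command it was asked to run, which fits the "orted: command not found" in the subject line. A minimal sketch that reproduces that status locally; the command name no_such_orted_command is made up for illustration and stands in for a missing orted:

```shell
# Ask a child shell to run a command that does not exist on PATH.
# The name below is hypothetical; any nonexistent command behaves the same.
status=0
sh -c 'no_such_orted_command' 2>/dev/null || status=$?
echo "exit status: $status"
```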

Now, I've been browsing the web, including the mailing lists, and it
appears that the cause should be that I have not declared the variables

export PATH="/usr/local/openmpi/bin:${PATH}"
export LD_LIBRARY_PATH="/usr/local/openmpi/lib:${LD_LIBRARY_PATH}"

on the node, which I have. I have even created all the possible folders
proposed in the FAQ for remote logins, although I'm using bash.

If I do ssh user@remote_node, I can connect without being asked for a
password, and if I type mpif90, I get "gfortran: no input files", which
should mean that PATH and LD_LIBRARY_PATH are indeed being set on the
remote login.
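
An interactive login is not quite the same test as a command passed directly to ssh, which runs in a non-interactive shell and may read different startup files. A minimal sketch of a non-interactive check, assuming passwordless ssh as described above; the function name check_orted and the REMOTE_SHELL override are made up for illustration:

```shell
# Check whether a non-interactive remote shell can find orted.
# REMOTE_SHELL is a hypothetical override so the check can be tried
# without a real cluster; by default it is plain ssh.
REMOTE_SHELL=${REMOTE_SHELL:-ssh}

check_orted() {
    # $1 is the remote host, e.g. bernie@192.168.1.113
    if "$REMOTE_SHELL" "$1" 'command -v orted' >/dev/null 2>&1; then
        echo "orted found on $1"
    else
        echo "orted NOT found on $1 (check PATH for non-interactive shells)"
    fi
}
```

It would be called as "check_orted bernie@192.168.1.113". If it reports that orted is not found while an interactive login finds mpif90, the PATH export is probably in a part of the startup files that non-interactive shells skip.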

But, if I do:

bash$  mpirun --prefix /usr/local/openmpi -np 2 prueba.bin

the result is:

--
Failed to find the following executable:

Host:   bernie-3
Executable: prueba.bin

Cannot continue.
--
mpirun noticed that job rank 0 with PID 0 on node "192.168.1.113" exited on signal 4.

I've been looking around, but have not been able to find out what
signal 4 means.
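
On Linux, signal 4 is SIGILL (illegal instruction), and the shell can translate the number into a name. A minimal sketch using the standard "kill -l" lookup; the variable name signame is just for illustration:

```shell
# Translate signal number 4 into its name (printed without the SIG prefix).
signame=$(kill -l 4)
echo "signal 4 is SIG$signame"
```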

Just in case: I was running an example program which runs fine on my
university cluster. Nevertheless, I decided to run an even simpler one, which
I include, since the error may be there (I definitely hope not!...)

program test

  use mpi

  implicit none

  integer :: myid,sizze,ierr

  call MPI_INIT(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD,sizze,ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD,myid,ierr)

  print *,"I'm using ",sizze," processors"
  print *,"of wich I'm the number ",myid

  call MPI_FINALIZE(ierr)

end program test


This is the first time I have installed, and used, any parallel programming
program or library, and I'm doing it as a personal project for a graduate
course, so any help will be greatly appreciated!

Best regards

Jose Colmenares