Thanks, Reuti. It works now. I just disabled the firewall on all machines, since Open MPI uses a random port each time.

Thanks again!

Danesh
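
The "random port" behaviour mentioned above can be illustrated outside of Open MPI. The sketch below is plain Python, not Open MPI code, and the helper name pick_ephemeral_ports is made up for this illustration. It binds a few TCP sockets to port 0, which asks the operating system to choose a free port, and returns whatever was assigned. A firewall that only allows a fixed list of ports will usually not cover ports chosen this way.

```python
import socket


def pick_ephemeral_ports(count=3):
    """Bind `count` TCP sockets to port 0 and return the ports the OS assigned."""
    sockets = []
    ports = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # Port 0 means "let the operating system pick a free port".
            s.bind(("127.0.0.1", 0))
            sockets.append(s)
            ports.append(s.getsockname()[1])
    finally:
        # Close everything only after all sockets are bound,
        # so the returned ports are distinct.
        for s in sockets:
            s.close()
    return ports


if __name__ == "__main__":
    print(pick_ephemeral_ports())
```

Running it several times typically prints different port numbers on each run.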



Reuti wrote:
Hi,

On 09.04.2008 at 22:17, Danesh Daroui wrote:
Mark Kosmowski wrote:
Danesh:

Have you tried "mpirun -np 4 --hostfile hosts hostname" to verify that
ompi is working?

When I run "mpirun -np 4 --hostfile hosts hostname", the same thing
happens: it just hangs. Could that be a clue?

Can you remote access from each node to each other node?

Yes, all nodes can access each other via SSH and can log in
without being prompted for a password.
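
One way to double-check the passwordless login is to run ssh in batch mode, which fails instead of prompting when a password would be needed. The following is a minimal sketch, assuming the OpenSSH client is installed; the function names are made up for this illustration, and host names are passed on the command line. It only tests logins from the machine it runs on, so it would have to be run on each node in turn.

```python
import subprocess
import sys


def ssh_batch_command(host):
    """Build an ssh command that fails instead of asking for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt for a password
        "-o", "ConnectTimeout=5",   # give up after 5 seconds
        host,
        "true",                     # trivial remote command
    ]


def can_login_without_password(host):
    """Return True if the batch-mode ssh login to `host` succeeds."""
    try:
        result = subprocess.run(
            ssh_batch_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The ssh client is not installed on this machine.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    for host in sys.argv[1:]:
        status = "ok" if can_login_without_password(host) else "FAILED"
        print(f"{host}: {status}")
```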

If any node has more than 1 network device, are you using the ompi
options to specify which device to use?

Each node has one network interface which works properly.

Do you have any firewall on the machines blocking certain ports?

-- Reuti
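
A quick way to probe for blocked ports between two nodes is to try a plain TCP connection. This is a minimal sketch in Python, independent of Open MPI; the function name port_reachable is made up for this illustration, and something must already be listening on the chosen port on the remote side.

```python
import socket
import sys


def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Usage: python probe.py <host> <port>
    host, port = sys.argv[1], int(sys.argv[2])
    print(f"{host}:{port} reachable: {port_reachable(host, port)}")
```

A False result only says that the connection did not succeed. On its own it cannot tell a firewall apart from a port on which nothing is listening.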


Regards,

Danesh


Good luck,

Mark


Message: 5
Date: Wed, 9 Apr 2008 14:15:34 +0200 (CEST)
From: "danes...@bredband.net" <danes...@bredband.net>
Subject: [OMPI users] Ang: Re:  submitted job stops
To: <us...@open-mpi.org>
Message-ID:
       <24351656.56761207743334738.JavaMail.defaultUser@defaultHost>
Content-Type: text/plain;charset="ISO-8859-15"


Actually, my program is a very simple MPI "Hello World" program which
just prints the rank of each process and then terminates. When I run
my program on a single-processor machine with, e.g., 4 processes
(oversubscribing), it shows:

Hello world from processor with rank 0
Hello world from processor with rank 3
Hello world from processor with rank 1
Hello world from processor with rank 2

but when I use my remote machines everything just stops when
I run the program.

No, I do not use any queuing system. I simply run it like this:

mpirun -np 4 --hostfile hosts ./hw

and then it just stops until I terminate it manually. As I said,
I monitored all machines (master + 2 slaves) and found that on all
machines the "orted" daemon starts when I run the program, but
after a few seconds the daemon terminates. What can be the reason?

Thanks,

Danesh




----Original message----
From: re...@staff.uni-marburg.de
Date: 09-04-2008 13:26
To: "Open MPI Users"<us...@open-mpi.org>
Subject: Re: [OMPI users] submitted job stops

Hi,

On 08.04.2008 at 21:58, Danesh Daroui wrote:

I had posted a message about my problem and tried all the suggested
solutions, but the problem is still not solved. I have installed
Open MPI on three machines (1 master + 2 slaves). When I submit a job
to the master, I can see that the "orted" daemon is launched on all
machines (by running "top" on each of them), but all "orted" daemons
terminate after a few seconds and nothing happens. At first I thought
the remote machines could not launch "orted", but now I am sure that
it can be run on all machines without problems. Any suggestions?

The question is rather: is your MPI program running successfully, or
is there simply no output from mpiexec/mpirun? And by "submit", do you
mean that you use a queuing system?

-- Reuti
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



------------------------------

End of users Digest, Vol 863, Issue 1
*************************************

