Danesh: Have you tried "mpirun -np 4 --hostfile hosts hostname" to verify that ompi is working?
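
A rough sketch of how the output of that test could be summarised, in case it helps. The function names `summarise_hosts` and `all_hosts_answered` are made up for illustration, and the expected host count of 3 assumes the master is listed in the hostfile together with the two slaves.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Open MPI) for reading the output of
#   mpirun -np 4 --hostfile hosts hostname

# Read hostnames on stdin and print "<count> <host>" for each distinct host.
summarise_hosts() {
    sort | uniq -c | awk '{ print $1, $2 }'
}

# Read hostnames on stdin; succeed only if the number of distinct,
# non-empty names equals the expected count given as $1.
all_hosts_answered() {
    expected=$1
    seen=$(sort -u | awk 'NF { n++ } END { print n + 0 }')
    [ "$seen" -eq "$expected" ]
}
```

Usage would look like `mpirun -np 4 --hostfile hosts hostname | summarise_hosts`. If a host is missing from the summary, or the command hangs the same way ./hw does, the problem is in the launch or the network rather than in the MPI program.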
Can you remotely access every node from every other node? If any node has more than one network device, are you using the ompi options to specify which device to use?

Good luck,
Mark

> Message: 5
> Date: Wed, 9 Apr 2008 14:15:34 +0200 (CEST)
> From: "[email protected]" <[email protected]>
> Subject: [OMPI users] Ang: Re: submitted job stops
> To: <[email protected]>
> Message-ID:
> <24351656.56761207743334738.JavaMail.defaultUser@defaultHost>
> Content-Type: text/plain;charset="ISO-8859-15"
>
> Actually my program is a very simple MPI "Hello World" program which
> just prints the rank of each process and then terminates. When I run
> my program on a single-processor machine with e.g. 4 processes
> (oversubscribing) it shows:
>
> Hello world from processor with rank 0
> Hello world from processor with rank 3
> Hello world from processor with rank 1
> Hello world from processor with rank 2
>
> but when I use my remote machines everything just stops when
> I run the program.
>
> No, I do not use any queuing system. I simply run it like this:
>
> mpirun -np 4 --hostfile hosts ./hw
>
> and then it just stops until I terminate it manually. As I said,
> I monitored all machines (master+2 slaves) and found that
> on all machines the "orted" daemon starts when I run the program, but
> after a few seconds the daemon terminates. What can be the reason?
>
> Thanks,
>
> Danesh
>
> >----Original message----
> >From: [email protected]
> >Date: 09-04-2008 13:26
> >To: "Open MPI Users"<[email protected]>
> >Subject: Re: [OMPI users] submitted job stops
> >
> >Hi,
> >
> >On 08.04.2008 at 21:58, Danesh Daroui wrote:
> >> I posted a message about my problem and tried all the suggested
> >> solutions, but the problem is still not solved. The problem is that
> >> I have installed Open-MPI on three machines (1 master+2 slaves).
> >> When I submit a job to the master I can see that the
> >> "orted" daemon is launched on all machines (by running "top" on all
> >> machines), but all "orted" daemons terminate after a
> >> few seconds and nothing happens. At first I thought it might be
> >> because the remote machines cannot launch "orted", but
> >> now I am sure that it can be run on all machines without problem. Any
> >> suggestions?
> >
> >The question is more: is your MPI program running successfully, or is
> >there simply no output from mpiexec/-run? And: by "submit", do you mean
> >you use any queuing system?
> >
> >-- Reuti
> >_______________________________________________
> >users mailing list
> >[email protected]
> >http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ------------------------------
>
> End of users Digest, Vol 863, Issue 1
> *************************************
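
For the node-to-node access question at the top of this reply, here is a minimal sketch that only prints the ssh commands to try; it does not run them. It assumes the launcher uses ssh with password-less keys and that `hosts` is an ordinary Open MPI hostfile (one host per line, optional `slots=N`, `#` comments). The function names `hostfile_hosts` and `pairwise_ssh_commands` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Open MPI).

# Print the host names in a hostfile: the first field of each line,
# ignoring blank lines and anything after a '#'.
hostfile_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# Print one command per ordered pair of distinct hosts. Each command logs
# in to the first host and from there to the second. BatchMode=yes makes
# ssh fail instead of prompting for a password.
pairwise_ssh_commands() {
    hosts=$(hostfile_hosts "$1")
    for src in $hosts; do
        for dst in $hosts; do
            if [ "$src" != "$dst" ]; then
                echo "ssh -o BatchMode=yes $src ssh -o BatchMode=yes $dst hostname"
            fi
        done
    done
}
```

Running `pairwise_ssh_commands hosts` lists the checks; any command that hangs or asks for a password points at the pair of nodes to fix. For the multiple-device case, Open MPI's TCP transport accepts an MCA parameter, for example `mpirun --mca btl_tcp_if_include eth0 -np 4 --hostfile hosts ./hw`, where `eth0` is only a placeholder for whichever interface the nodes share.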
