Actually, my program is a very simple MPI "Hello World" program that just prints the rank of each process and then terminates. When I run it on a single-processor machine with, e.g., 4 processes (oversubscribing), it prints:
Hello world from processor with rank 0
Hello world from processor with rank 3
Hello world from processor with rank 1
Hello world from processor with rank 2

But when I use my remote machines, everything just stops when I run the program. No, I do not use any queuing system. I simply run it like this:

    mpirun -np 4 --hostfile hosts ./hw

and then it just stops until I terminate it manually. As I said, I monitored all machines (master + 2 slaves) and found that on every machine the "orted" daemon starts when I run the program, but after a few seconds the daemon terminates. What could be the reason?

Thanks,

Danesh

>----Original message----
>From: re...@staff.uni-marburg.de
>Date: 09-04-2008 13:26
>To: "Open MPI Users"<us...@open-mpi.org>
>Subject: Re: [OMPI users] submitted job stops
>
>Hi,
>
>On 08.04.2008 at 21:58, Danesh Daroui wrote:
>> I had posted a message about my problem and I tried all the suggested
>> solutions, but the problem is still not solved. The problem is that
>> I have installed Open MPI on three machines (1 master + 2 slaves).
>> When I submit a job to the master, I can see that the "orted" daemon
>> is launched on all machines (by running "top" on all machines), but
>> all "orted" daemons terminate after a few seconds and nothing
>> happens. At first I thought it might be because the remote machines
>> could not launch "orted", but now I am sure that it can be run on all
>> machines without problems. Any suggestions?
>
>The question is more: is your MPI program running successfully, or is
>there simply no output from mpiexec/mpirun? And: by "submit", do you
>mean you use a queuing system?
>
>-- Reuti