Hi Andreas, thanks for the reply! I'm using openmpi-1.2.5. It was installed using my distro's (Gentoo) default package:
sys-cluster/openmpi-1.2.5 USE="fortran ipv6 -debug -heterogeneous -nocxx -pbs -romio -smp -threads"

I've tried setting the mpi_yield_when_idle parameter as you suggested. However, the program still hangs. Just in case, this is the command line I'm using to run it:

/usr/bin/mpirun --hostfile mpi-config.txt --mca mpi_yield_when_idle 1 -np 3 /home/gfaccin/desenvolvimento/Eclipse/mpiplay/Debug/mpiplay

where mpi-config.txt contains the following line:

localhost slots=1

Is there anything else I could try?

Thank you!

Giovani

Andreas Schäfer <gent...@gmx.de> wrote:

Hmm, strange. It doesn't hang for me and AFAICS it shouldn't hang at all. I'm using 1.2.5. Which version of Open MPI are you using?

Hanging with 100% CPU utilization often means that your processes are caught in a busy wait. You could try to set mpi_yield_when_idle:

> gentryx@hex ~ $ cat .openmpi/mca-params.conf
> mpi_yield_when_idle=1

But I don't think this should be necessary.

HTH
-Andreas

On 21:35 Mon 17 Mar, Giovani Faccin wrote:
> Hi there!
>
> I'm learning MPI, and got really puzzled... Please take a look at this very
> short code:
>
> #include <iostream>
> #include "mpicxx.h"
> using namespace std;
> int main(int argc, char *argv[])
> {
>     MPI::Init();
>
>     for (unsigned long t = 0; t < 10000000; t++)
>     {
>         //If we are process 0:
>         if ( MPI::COMM_WORLD.Get_rank() == 0 )
>         {
>             MPI::Status mpi_status;
>             unsigned long d = 0;
>             unsigned long d2 = 0;
>             MPI::COMM_WORLD.Recv(&d, 1, MPI::UNSIGNED_LONG, MPI::ANY_SOURCE, MPI::ANY_TAG, mpi_status );
>             MPI::COMM_WORLD.Recv(&d2, 1, MPI::UNSIGNED_LONG, MPI::ANY_SOURCE, MPI::ANY_TAG, mpi_status );
>             cout << "Time = " << t << "; Node 0 received: " << d << " and " << d2 << endl;
>         }
>         //Else:
>         else
>         {
>             unsigned long d = MPI::COMM_WORLD.Get_rank();
>             MPI::COMM_WORLD.Send( &d, 1, MPI::UNSIGNED_LONG, 0, 0);
>         };
>     };
>     MPI::Finalize();
> }
>
> Ok, so what I'm trying to do is to make a gather operation using point to
> point communication.
> In my real application, instead of sending an unsigned
> long I'd be calling an object's send and receive methods, which in turn would
> call their inner object's similar methods and so on until all data is
> synchronized. I'm using this loop because the number of objects to be sent to
> process rank 0 varies depending on the sender.
>
> When running this test with 3 processes on a dual core, oversubscribed node,
> I get this output:
> (skipped previous output)
> Time = 5873; Node 0 received: 1 and 2
> Time = 5874; Node 0 received: 1 and 2
> Time = 5875; Node 0 received: 1 and 2
> Time = 5876; Node 0 received: 1 and 2
>
> and then the application hangs, with processor usage at 100%. The exact time
> when this condition occurs varies on each run, but it usually happens quite
> fast.
>
> What would I have to modify, in this simple example, so that the application
> works as expected? Must I always use Gather, instead of point to point, to
> make a synchronization like this?
>
> Thank you very much!
>
> Giovani
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
============================================

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your signature to help him gain world domination!