Adding to that: can you also post how you compiled the program? Show
us the output too.

Thanks,

Bernard 
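For reference, a sketch of the sequence we'd expect to see (paths taken from later in this thread; "hello++" is just the example binary name):

```shell
# Compile with LAM's C++ wrapper (the program uses the MPI C++ bindings),
# then run from /home so NFS exports the binary to every compute node.
export PATH=/opt/lam-7.0.6/bin:$PATH

which mpiCC                 # should resolve under /opt/lam-7.0.6/bin
mpiCC -o hello++ Hello.cc   # plain gcc/g++ won't pull in the MPI libraries

cd ~                        # /home is the only NFS-exported directory
lamboot -v my_hostfile
mpirun N ./hello++          # N = one process per booted node
lamhalt
```

Posting the output of each of those commands would tell us a lot.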

> -----Original Message-----
> From: Michael Edwards [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, April 04, 2006 7:47
> To: Michelle Chu
> Cc: Bernard Li; [email protected]
> Subject: Re: [Oscar-users] MPI ranking and size problem
> 
> Did you run the program from your home directory or the /opt/
> directory where it lives?
> For LAM to work properly, the program needs to be available on all the
> compute nodes.
> By default, /home is the only directory exported by nfs.
> 
> Even if that is the problem, I am still confused by the MPI_INIT error.
> Did you compile the program using gcc or mpicc?
> 
> On 4/4/06, Michelle Chu <[EMAIL PROTECTED]> wrote:
> >
> > Bernard,
> > Here are some outputs...
> > [EMAIL PROTECTED] ~]$ lamnodes
> > n0      athena.cs.xxx.edu:1:origin,this_node
> > n1      oscarnode1.cs.xxx.edu:1:
> > n2      oscarnode2.cs.xxx.edu:1:
> > n3      oscarnode3.cs.xxx.edu:1:
> > n4      oscarnode4.cs.xxx.edu:1:
> > n5      oscarnode5.cs.xxx.edu:1:
> > n6      oscarnode6.cs.xxx.edu:1:
> > n7      oscarnode7.cs.xxx.edu:1:
> > n8      oscarnode8.cs.xxx.edu:1:
> > 
> > *******************************************************************************
> > [EMAIL PROTECTED] ~]$ vi lamtest.output
> >
> > wall clock time = 0.000074
> > Process 1 of 8 on oscarnode8.cs.xxx.edu
> >
> > --> MPI C++ bindings test:
> >
> > Hello World! I am 1 of 8
> > Hello World! I am 0 of 8
> > Hello World! I am 6 of 8
> > Hello World! I am 4 of 8
> > Hello World! I am 2 of 8
> > Hello World! I am 5 of 8
> > Hello World! I am 7 of 8
> > Hello World! I am 3 of 8
> >
> >
> >
> > --> MPI Fortran bindings test:
> >
> >  Hello World! I am  0 of  8
> >  Hello World! I am  1 of  8
> >  Hello World! I am  4 of  8
> >  Hello World! I am  6 of  8
> >  Hello World! I am  2 of  8
> >  Hello World! I am  7 of  8
> >  Hello World! I am  5 of  8
> >   Hello World! I am  3 of  8
> >
> >
> > LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
> >
> >
> > LAM/MPI test complete
> > Unless there are errors above, test completed successfully.
> >
> > 
> > *******************************************************************************
> > [EMAIL PROTECTED] examples]$ mpirun N hello++
> >
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > 
> > -----------------------------------------------------------------------------
> > It seems that [at least] one of the processes that was started with
> > mpirun did not invoke MPI_INIT before quitting (it is possible that
> > more than one process did not invoke MPI_INIT -- mpirun was only
> > notified of the first one, which was on node n0).
> >
> > mpirun can *only* be used with MPI programs (i.e., programs that
> > invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> > to run non-MPI programs over the lambooted nodes.
> > 
> > *******************************************************************************
> > Attached is the output file from: lamboot -d hostfile
> >
> > What I did was:
> > 1) lamboot -d hostfile
> > 2) mpirun N hello++
> > The hostfile lists the host names of the headnode and all eight cluster nodes:
> >
> > athena.cs.xxx.edu
> > oscarnode1.cs.xxx.edu
> >
> > .....
> > oscarnode8.cs.xxx.edu
> >
> > Thanks,
> >
> >
> > Michelle
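One common cause of the repeated "0 of 1" output above is a binary linked against a different MPI implementation than the mpirun being used, so each process runs as a standalone singleton. A quick way to check (a sketch; the paths and the hello++ binary name are assumed from this thread):

```shell
# Make sure mpirun, mpicc, and the binary all come from the same LAM tree.
which mpirun mpicc           # both should live under /opt/lam-7.0.6/bin
laminfo | head -5            # shows the LAM/MPI version actually in use
ldd ./hello++ | grep -i mpi  # any shared MPI libraries should be LAM's
```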
> >
> > On 4/4/06, Bernard Li <[EMAIL PROTECTED]> wrote:
> > >
> > >
> > >
> > > Hi Michelle:
> > >
> > > I just tested your code on a 2-node cluster (including the headnode)
> > > and got the following result:
> > >
> > > $ mpirun N a.out
> > > Hello World! I am 1 of 2
> > > Hello World! I am 0 of 2
> > >
> > > So it seems fine (you had a space between the " and 'mpi.h', but I
> > > fixed that).
> > >
> > > Can you show us the output of "lamnodes" after you have successfully
> > > booted the nodes?  Also, post the output of your LAM/MPI OSCAR tests
> > > (/home/oscartst/lam/lamtest.out).
> > >
> > > Cheers,
> > >
> > > Bernard
> > >
> > > ________________________________
> >  From: [EMAIL PROTECTED] on behalf of
> > Michelle Chu
> > > Sent: Mon 03/04/2006 21:02
> > > To: Michael Edwards
> > > Cc: [email protected]
> > > Subject: Re: [Oscar-users] MPI ranking and size problem
> > >
> > >
> > >
> > >
> > > Michael,
> > > OSCAR version is 4.2. The OS on the client nodes is Red Hat, installed
> > > from the client image file generated during the OSCAR installation. The
> > > cluster testing step at the end of the installation passed except for
> > > the Ganglia part. Thank you very much for your help.
> > >
> > > Michelle
> > >
> > > Here is the code for Hello.cc.
> > >
> > > *******************************************************************************
> > > #include <iostream.h>
> > > // modified to reference the master mpi.h file, to meet the MPI standard spec.
> > > #include " mpi.h"
> > > int
> > > main(int argc, char *argv[])
> > > {
> > >   MPI::Init(argc, argv);
> > >
> > >   int rank = MPI::COMM_WORLD.Get_rank();
> > >   int size = MPI::COMM_WORLD.Get_size();
> > >
> > >   cout << "Hello World! I am " << rank << " of " << size << endl;
> > >
> > >   MPI::Finalize();
> > > }
> > >
> > >
> > > *******************************************************************************
> > >
> > >
> > > On 4/3/06, Michael Edwards <[EMAIL PROTECTED] > wrote:
> > > > What version of OSCAR are you using, and on what platform?
> > > >
> > > > Also, could you send us a copy of hello++.cpp? It looks like there
> > > > are some errors there.  Also, did all the OSCAR tests pass?
> > > >
> > > > LAM appears to be working correctly, on the surface anyway.
> > > >
> > > >
> > > >
> > > > On 4/3/06, Michelle Chu < [EMAIL PROTECTED] > wrote:
> > > > > Hello, there,
> > > > >
> > > > > When I mpirun a simple hello MPI program on all eight of my nodes
> > > > > as shown below, I get a sequence of "Hello World! I am 0 of 1"
> > > > > instead of "1 of 8", "2 of 8", "3 of 8", and so on. There is also
> > > > > a problem with MPI_INIT. Thank you for your help.
> > > > >
> > > > >  which mpicc
> > > > >  /opt/lam-7.0.6/bin/mpicc
> > > > >  lamboot -v my_hostfile
> > > > >  my_hostfile is:
> > > > >  **************************************
> > > > >  athena.cs.xxx.edu
> > > > >  oscarnode1.cs.xxx.edu
> > > > >  oscarnode2.cs.xxx.edu
> > > > >  oscarnode3.cs.xxx.edu
> > > > >  oscarnode4.cs.xxx.edu
> > > > >  oscarnode5.cs.xxx.edu
> > > > >  oscarnode6.cs.xxx.edu
> > > > >  oscarnode7.cs.xxx.edu
> > > > >  oscarnode8.cs.xxx.edu
> > > > >
> > > > >
> > > > >  *******************************************************************************
> > > > >  LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
> > > > >
> > > > >  n-1<16365> ssi:boot:base:linear: booting n0 (athena.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: booting n1 (oscarnode1.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: booting n2 (oscarnode2.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: booting n3 (oscarnode3.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: booting n4 (oscarnode4.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: booting n5 (oscarnode5.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: booting n6 (oscarnode6.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: booting n7 (oscarnode7.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: booting n8 (oscarnode8.cs.xxx.edu)
> > > > >  n-1<16365> ssi:boot:base:linear: finished
> > > > >
> > > > >  mpirun N hello++
> > > > >
> > > > >  *****************************************************************
> > > > >  Hello World! I am 0 of 1
> > > > >  Hello World! I am 0 of 1
> > > > >  Hello World! I am 0 of 1
> > > > >  Hello World! I am 0 of 1
> > > > >  Hello World! I am 0 of 1
> > > > >  Hello World! I am 0 of 1
> > > > >
> > > > >  -----------------------------------------------------------------------------
> > > > >  It seems that [at least] one of the processes that was started with
> > > > >  mpirun did not invoke MPI_INIT before quitting (it is possible that
> > > > >  more than one process did not invoke MPI_INIT -- mpirun was only
> > > > >  notified of the first one, which was on node n0).
> > > > >
> > > > >  mpirun can *only* be used with MPI programs (i.e., programs that
> > > > >  invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> > > > >  to run non-MPI programs over the lambooted nodes.
> > > > >
> > > > >  -----------------------------------------------------------------------------
> > > > >  Hello World! I am 0 of 1
> > > > >
> > > > >  *******************************************************************************
> > > > >
> > > >
> > >
> > >
> >
> >
> 


_______________________________________________
Oscar-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-users
