OSCAR by default ships two _different_ implementations of MPI: LAM/MPI and MPICH.  Although both are installed, only one implementation can be active at a time; LAM/MPI is the default.
 
Mixing implementations of mpicc and mpirun will not work.  You compiled the code with MPICH's compiler and then used LAM/MPI's mpirun to run it; the two are not compatible.
 
By default if you just type "mpicc", this will use the LAM/MPI version.
 
So, presuming you used the LAM/MPI version of mpicc to compile your code: despite the warnings, did you get an executable?  If not, perhaps there is an error in your code, in the line:

#include " mpi.h"
There is a space between the opening quote and "mpi.h"; you need to remove it:
 
#include "mpi.h"
 
With that edit I was able to compile and run the code.  It is not necessary to link with -lmpich; I simply ran the following and it generated an executable for me:
 
$ mpicc hello.cc
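If mpicc ever does fail with undefined references to the C++ symbols (MPI::COMM_WORLD and friends), the usual cause is linking with the C wrapper instead of the C++ one.  A sketch, assuming the conventional wrapper names (LAM ships mpiCC, MPICH 1.2.x ships mpicxx; check which is first in your PATH):

```shell
# The C++ wrapper links the C++ binding library for you,
# which plain mpicc may not do on every MPI build.
mpiCC -o hello hello.cc

# Run across all booted nodes (LAM syntax):
mpirun N hello
```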
 
You can switch between different implementations of MPI by using switcher (see man switcher).  The following shows the current default:
 
$ switcher mpi --show
user:default=lam-7.0.6
user:global_exists=1
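If you want to make MPICH the default instead, switcher can do that too.  A sketch; the exact package tag will differ on your cluster, so take it from the --list output rather than copying mine:

```shell
# See which MPI implementations switcher knows about
switcher mpi --list

# Set the per-user default (tag here is an example taken
# from the MPICH path earlier in this thread)
switcher mpi = mpich-ch_p4-gcc-1.2.7

# Changes only affect NEW shells -- log out and back in, then verify:
switcher mpi --show
which mpicc
```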
 
And I'm positive that I'm using the LAM/MPI implementation because:
 
$ which mpicc
/opt/lam-7.0/bin/mpicc
 
Cheers,
 
Bernard


From: Michelle Chu [mailto:[EMAIL PROTECTED]]
Sent: Tue 04/04/2006 10:15
To: Michael Edwards
Cc: Bernard Li; [email protected]
Subject: Re: [Oscar-users] MPI ranking and size problem

Michael,
I think I have now found the problem. It is the compiler I was using.
I used:
/opt/mpich-ch_p4-gcc-1.2.7/bin/mpicxx -c hello++.cc
/opt/mpich-ch_p4-gcc-1.2.7/bin/mpicxx -o hello++ hello++.o
The resulting executable gave the problem with ranking and size.

However, if I use:
mpicc -c hello++.cc
that step is fine, but
mpicc -o hello++ hello++.o
reports a bunch of error messages such as: undefined reference to MPI::COMM_WORLD, MPI::Init, etc.

I tried running mpicc -o hello++ hello++.o -lmpich, but it still does not solve the undefined reference problem.
 
Thanks,
Michelle
 


 
On 4/4/06, Michael Edwards <[EMAIL PROTECTED]> wrote:
Did you run the program from your home directory or the /opt/
directory where it lives?
For LAM to work properly the program needs to be available on all the
compute nodes.
By default, /home is the only directory exported by nfs.

I am still confused by the MPI_INIT error.  Even if that is the problem,
did you compile the program using gcc or mpicc?

On 4/4/06, Michelle Chu < [EMAIL PROTECTED]> wrote:
>
> Bernard,
> Here are some outputs...
> [EMAIL PROTECTED] ~]$ lamnodes
> n0      athena.cs.xxx.edu:1:origin,this_node
> n1      oscarnode1.cs.xxx.edu:1:
> n2      oscarnode2.cs.xxx.edu:1:
> n3      oscarnode3.cs.xxx.edu:1:
> n4      oscarnode4.cs.xxx.edu:1:
> n5      oscarnode5.cs.xxx.edu:1:
> n6      oscarnode6.cs.xxx.edu:1:
> n7      oscarnode7.cs.xxx.edu:1:
> n8      oscarnode8.cs.xxx.edu:1:
> ***********************************************************************************************
> [EMAIL PROTECTED] ~]$ vi lamtest.output
>
> wall clock time = 0.000074
> Process 1 of 8 on oscarnode8.cs.xxx.edu
>
> --> MPI C++ bindings test:
>
> Hello World! I am 1 of 8
> Hello World! I am 0 of 8
> Hello World! I am 6 of 8
> Hello World! I am 4 of 8
> Hello World! I am 2 of 8
> Hello World! I am 5 of 8
> Hello World! I am 7 of 8
> Hello World! I am 3 of 8
>
>
>
> --> MPI Fortran bindings test:
>
>  Hello World! I am  0 of  8
>  Hello World! I am  1 of  8
>  Hello World! I am  4 of  8
>  Hello World! I am  6 of  8
>  Hello World! I am  2 of  8
>  Hello World! I am  7 of  8
>  Hello World! I am  5 of  8
>  Hello World! I am  3 of  8
>
>
> LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
>
>
> LAM/MPI test complete
> Unless there are errors above, test completed successfully.
>
> ********************************************************************************************************************
> [EMAIL PROTECTED] examples]$ mpirun N hello++
>
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> *********************************************************************************************************************
> Attached is the output file file: lamboot -d hostfile
>
> What I did was:
> 1). lamboot -d hostfile
> 2). mpirun N hello++
> The hostfile lists the host names of the headnode and all eight cluster nodes:
>
> athena.cs.xxx.edu
> oscarnode1.cs.xxx.edu
>
> .....
> oscarnode8.cs.xxx.edu
>
> Thanks,
>
>
> Michelle
>
> On 4/4/06, Bernard Li <[EMAIL PROTECTED]> wrote:
> >
> >
> >
> > Hi Michelle:
> >
> > I just tested your code on a 2 node cluster (including headnode) and got
> the following result:
> >
> > $ mpirun N a.out
> > Hello World! I am 1 of 2
> > Hello World! I am 0 of 2
> >
> > So it seems fine (you had a space between the opening quote and mpi.h in
> the #include line, but I fixed that).
> >
> > Can you show us the output of "lamnodes" after you have successfully
> booted the nodes?  Also, post the output of your LAM/MPI OSCAR tests
> (/home/oscartst/lam/lamtest.out).
> >
> > Cheers,
> >
> > Bernard
> >
> > ________________________________
>  From: [EMAIL PROTECTED] on behalf of
> Michelle Chu
> > Sent: Mon 03/04/2006 21:02
> > To: Michael Edwards
> > Cc: [email protected]
> > Subject: Re: [Oscar-users] MPI ranking and size problem
> >
> >
> >
> >
> > Michael,
> > OSCAR version is 4.2. The OS on the client nodes is Red Hat, installed
> from the client image file generated during the OSCAR installation. The
> cluster testing step at the end of the installation passed, except for the
> Ganglia part. Thank you very much for your help.
> >
> > Michelle
> >
> > Here is the code for Hello.cc.
> >
> *******************************************************************************************
> > #include <iostream.h>
> > // modified to reference the master mpi.h file, to meet the MPI standard
> spec.
> > #include " mpi.h"
> > int
> > main(int argc, char *argv[])
> > {
> >   MPI::Init(argc, argv);
> >
> >   int rank = MPI::COMM_WORLD.Get_rank();
> >   int size = MPI::COMM_WORLD.Get_size();
> >
> >   cout << "Hello World! I am " << rank << " of " << size << endl;
> >
> >   MPI::Finalize();
> > }
> >
> >
> *****************************************************************************************
> >
> >
> > On 4/3/06, Michael Edwards < [EMAIL PROTECTED] > wrote:
> > > What version of OSCAR are you using, and on what platform?
> > >
> > > Also, could you send us a copy of hello++.cpp?  It looks like there are
> > > some errors in it.  And did all the OSCAR tests pass?
> > >
> > > LAM appears to be working correctly, on the surface anyway.
> > >
> > >
> > >
> > > On 4/3/06, Michelle Chu < [EMAIL PROTECTED] > wrote:
> > > > Hello, there,
> > > >
> > > >  When I mpirun a simple hello MPI program on all eight of my nodes as
> > > > follows, I get a sequence of "Hello World! I am 0 of 1" instead of 1 of 8,
> > > > 2 of 8, 3 of 8, and so on. There is also a problem with MPI_INIT. Thank
> > > > you for your help.
> > > >
> > > >  which mpicc
> > > >  /opt/lam-7.0.6/bin/mpicc
> > > >  lamboot -v my_hostfile
> > > >  my_hostfile is:
> > > >  **************************************
> > > >  athena.cs.xxx.edu
> > > >  oscarnode1.cs.xxx.edu
> > > >  oscarnode2.cs.xxx.edu
> > > >  oscarnode3.cs.xxx.edu
> > > >  oscarnode4.cs.xxx.edu
> > > >  oscarnode5.cs.xxx.edu
> > > >  oscarnode6.cs.xxx.edu
> > > >  oscarnode7.cs.xxx.edu
> > > >  oscarnode8.cs.xxx.edu
> > > >
> > > >
> *****************************************************************************
> > > >  LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
> > > >
> > > >  n-1<16365> ssi:boot:base:linear: booting n0 (athena.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: booting n1 (oscarnode1.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: booting n2 (oscarnode2.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: booting n3 (oscarnode3.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: booting n4 (oscarnode4.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: booting n5 (oscarnode5.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: booting n6 (oscarnode6.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: booting n7 (oscarnode7.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: booting n8 (oscarnode8.cs.xxx.edu)
> > > >  n-1<16365> ssi:boot:base:linear: finished
> > > >
> > > >  mpirun N hello++
> > > >
> *****************************************************************
> > > >  Hello World! I am 0 of 1
> > > >  Hello World! I am 0 of 1
> > > >  Hello World! I am 0 of 1
> > > >  Hello World! I am 0 of 1
> > > >  Hello World! I am 0 of 1
> > > >  Hello World! I am 0 of 1
> > > >
> -----------------------------------------------------------------------------
> > > >  It seems that [at least] one of the processes that was started with
> > > >  mpirun did not invoke MPI_INIT before quitting (it is possible that
> > > >  more than one process did not invoke MPI_INIT -- mpirun was only
> > > >  notified of the first one, which was on node n0).
> > > >
> > > >  mpirun can *only* be used with MPI programs (i.e., programs that
> > > >  invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> > > >  to run non-MPI programs over the lambooted nodes.
> > > >
> -----------------------------------------------------------------------------
> > > >  Hello World! I am 0 of 1
> > > >
> ****************************************************************************************************
> > > >
> > >
> >
> >
>
>
