OSCAR by default ships two
_different_ implementations of MPI: LAM/MPI and MPICH. Although both are
installed, only one implementation can be used at a time - LAM/MPI is
the default. Mixing different versions of mpirun and mpicc will not
work: you used MPICH's compiler to compile the code and then used
LAM/MPI's mpirun to run it, which will not work. By default, if you just
type "mpicc", you get the LAM/MPI version.
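One quick way to catch that kind of mismatch is to check that the wrapper compiler and the launcher resolve to the same installation prefix. A small sketch (the /opt/lam-7.0 fallback paths are only examples; adjust them to your install):

```shell
# Check that mpicc and mpirun come from the same MPI installation.
# The fallback paths are illustrative defaults, not guaranteed locations.
cc_path=$(command -v mpicc || echo /opt/lam-7.0/bin/mpicc)
run_path=$(command -v mpirun || echo /opt/lam-7.0/bin/mpirun)
if [ "$(dirname "$cc_path")" = "$(dirname "$run_path")" ]; then
    echo "toolchain consistent: $(dirname "$cc_path")"
else
    echo "MIXED toolchain: $cc_path vs $run_path"
fi
```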
So presuming you used the
LAM/MPI version of mpicc to compile your code, despite the warnings, did
you get an executable? If not, then perhaps there is an error in your
code, in this line:
#include " mpi.h"
There is a space between the
first quote and "mpi.h" - you need to remove it:
#include "mpi.h"
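The leading space matters because the preprocessor takes everything between the quotes, space included, as the literal file name. A self-contained demonstration with a hypothetical local header (any C compiler behaves the same way; gcc is used here for illustration):

```shell
# Create a tiny header and two test programs in a scratch directory.
mkdir -p /tmp/include-space-demo && cd /tmp/include-space-demo
echo '#define GREETING "hello"' > hello.h
printf '#include " hello.h"\nint main(void){return 0;}\n' > bad.c
printf '#include "hello.h"\nint main(void){return 0;}\n'  > good.c
# The space makes the compiler look for a file literally named " hello.h":
gcc -c bad.c  2>/dev/null || echo 'bad.c fails: " hello.h" not found'
gcc -c good.c              && echo 'good.c compiles'
```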
With that edit I was able to
compile and run the code. It is not necessary to add -lmpich; I simply
ran the following and it generated an executable for me:
$ mpicc hello.cc
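For reference, the full compile-and-run cycle under LAM/MPI looks like this. This is a sketch, guarded so it is a no-op on machines without the LAM tools installed; hello.cc and hostfile stand in for your own files:

```shell
# No-op on machines where LAM/MPI is not installed.
if command -v mpicc >/dev/null && command -v lamboot >/dev/null; then
    mpicc hello.cc -o hello   # LAM's wrapper compiler; mpiCC is the
                              # dedicated wrapper for the C++ bindings
    lamboot hostfile          # start the LAM daemons on every listed node
    mpirun N hello            # launch one copy of hello per booted node
    lamhalt                   # shut the daemons down afterwards
else
    echo "LAM/MPI not installed; skipping"
fi
```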
You can switch between
different implementations of MPI by using switcher (see man switcher). The
following shows the current default:
$ switcher mpi --show
user:default=lam-7.0.6
user:global_exists=1
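To make MPICH the default instead, the pattern is as follows. This is a sketch, guarded so it is a no-op where switcher is absent, and the mpich tag name is only an example - use whatever "switcher mpi --list" reports on your system:

```shell
if command -v switcher >/dev/null; then
    switcher mpi --list                  # show the installed MPI tags
    switcher mpi = mpich-ch_p4-gcc-1.2.7 # set the per-user default (example tag)
    switcher mpi --show                  # confirm the new default
    # log out and back in (or start a new shell) so PATH picks up the change
else
    echo "switcher not installed; skipping"
fi
```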
And I'm positive that I'm
using the LAM/MPI implementation because:
$ which mpicc
/opt/lam-7.0/bin/mpicc
Cheers,
Bernard
From: Michelle Chu [mailto:[EMAIL PROTECTED]
Sent: Tue 04/04/2006 10:15
To: Michael Edwards
Cc: Bernard Li; [email protected]
Subject: Re: [Oscar-users] MPI ranking and size problem
Michael,
I think I now know the problem. It is because of the compiler I was using.
I used:
/opt/mpich-ch_p4-gcc-1.2.7/bin/mpicxx -c hello++.cc
/opt/mpich-ch_p4-gcc-1.2.7/bin/mpicxx -o hello++ hello++.o
The executable gave the problem with ranking and size.
However, if I use
mpicc -c hello++.cc
it is OK, but
mpicc -o hello++ hello++.o
reports a bunch of error messages such as: undefined reference to
MPI::COMM_WORLD, MPI::Init, etc.
I tried running mpicc -o hello++ hello++.o -lmpich, but that still
doesn't solve the undefined reference problem.
Thanks,
Michelle
On 4/4/06, Michael
Edwards <[EMAIL PROTECTED]>
wrote:
Did you run the program from your home directory or the /opt/
directory where it lives?
For LAM to work properly the program needs to be available on all the
compute nodes.
By default, /home is the only directory exported by NFS.
I am still confused by the MPI_INIT error even if that is the problem -
did you compile the program using gcc or mpicc?
On 4/4/06, Michelle Chu < [EMAIL PROTECTED]> wrote:
>
> Bernard,
> Here are some outputs...
> [EMAIL PROTECTED] ~]$ lamnodes
> n0 athena.cs.xxx.edu:1:origin,this_node
> n1 oscarnode1.cs.xxx.edu:1:
> n2 oscarnode2.cs.xxx.edu:1:
> n3 oscarnode3.cs.xxx.edu:1:
> n4 oscarnode4.cs.xxx.edu:1:
> n5 oscarnode5.cs.xxx.edu:1:
> n6 oscarnode6.cs.xxx.edu:1:
> n7 oscarnode7.cs.xxx.edu:1:
> n8 oscarnode8.cs.xxx.edu:1:
> ***********************************************************************************************
> [EMAIL PROTECTED] ~]$ vi lamtest.output
>
> wall clock time = 0.000074
> Process 1 of 8 on oscarnode8.cs.xxx.edu
>
> --> MPI C++ bindings test:
>
> Hello World! I am 1 of 8
> Hello World! I am 0 of 8
> Hello World! I am 6 of 8
> Hello World! I am 4 of 8
> Hello World! I am 2 of 8
> Hello World! I am 5 of 8
> Hello World! I am 7 of 8
> Hello World! I am 3 of 8
>
>
>
> --> MPI Fortran bindings test:
>
> Hello World! I am 0 of 8
> Hello World! I am 1 of 8
> Hello World! I am 4 of 8
> Hello World! I am 6 of 8
> Hello World! I am 2 of 8
> Hello World! I am 7 of 8
> Hello World! I am 5 of 8
> Hello World! I am 3 of 8
>
>
> LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
>
>
> LAM/MPI test complete
> Unless there are errors above, test completed successfully.
>
> ********************************************************************************************************************
> [EMAIL PROTECTED] examples]$ mpirun N hello++
>
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> Hello World! I am 0 of 1
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> *********************************************************************************************************************
> Attached is the output file from: lamboot -d hostfile
>
> What I did was:
> 1) lamboot -d hostfile
> 2) mpirun N hello++
> hostfile lists the host names of the headnode and all eight cluster nodes:
>
> athena.cs.xxx.edu
> oscarnode1.cs.xxx.edu
>
> .....
> oscarnode8.cs.xxx.edu
>
> Thanks,
>
>
> Michelle
>
> On 4/4/06, Bernard Li <[EMAIL PROTECTED]> wrote:
> >
> >
> >
> > Hi Michelle:
> >
> > I just tested your code on a 2 node cluster (including headnode) and got
> the following result:
> >
> > $ mpirun N a.out
> > Hello World! I am 1 of 2
> > Hello World! I am 0 of 2
> >
> > So it seems fine (you had a space between the opening quote and mpi.h,
> but I fixed that).
> >
> > Can you show us the output of "lamnodes" after you have successfully
> booted the nodes? Also, post the output of your LAM/MPI OSCAR tests
> (/home/oscartst/lam/lamtest.out).
> >
> > Cheers,
> >
> > Bernard
> >
> > ________________________________
> From: [EMAIL PROTECTED] on behalf of
> Michelle Chu
> > Sent: Mon 03/04/2006 21:02
> > To: Michael Edwards
> > Cc: [email protected]
> > Subject: Re: [Oscar-users] MPI ranking and size problem
> >
> >
> >
> >
> > Michael,
> > OSCAR version is 4.2. The OS on the client node is Red Hat and is
> installed from the client image file generated during the OSCAR
> installation. The cluster testing step at the end of installation passed
> except the ganglia part. Thank you very much for your help. Michelle
> >
> > Here is the code for Hello.cc.
> >
> *******************************************************************************************
> > #include <iostream.h>
> > // modified to reference the master mpi.h file, to meet the MPI standard
> spec.
> > #include " mpi.h"
> > int
> > main(int argc, char *argv[])
> > {
> > MPI::Init(argc, argv);
> >
> > int rank = MPI::COMM_WORLD.Get_rank();
> > int size = MPI::COMM_WORLD.Get_size();
> >
> > cout << "Hello World! I am " << rank << " of " << size << endl;
> >
> > MPI::Finalize();
> > }
> >
> >
> *****************************************************************************************
> >
> >
> > On 4/3/06, Michael Edwards < [EMAIL PROTECTED] > wrote:
> > > What version of OSCAR are you using, and on what platform?
> > >
> > > Also, could you send us a copy of hello++.cpp? It looks like there are
> > > some errors there. Also, did all the OSCAR tests pass?
> > >
> > > LAM appears to be working correctly, on the surface anyway.
> > >
> > >
> > >
> > > On 4/3/06, Michelle Chu < [EMAIL PROTECTED] > wrote:
> > > > Hello there,
> > > >
> > > > When I mpirun a simple hello MPI program on all eight of my nodes as
> > > > follows, I get a sequence of "Hello World! I am 0 of 1" instead of
> > > > 1 of 8, 2 of 8, 3 of 8, etc. There is also a problem with MPI_INIT.
> > > > Thank you for your help.
> > > >
> > > > which mpicc
> > > > /opt/lam-7.0.6/bin/mpicc
> > > > lamboot -v my_hostfile
> > > > my_hostfile is:
> > > > **************************************
> > > > athena.cs.xxx.edu
> > > > oscarnode1.cs.xxx.edu
> > > > oscarnode2.cs.xxx.edu
> > > > oscarnode3.cs.xxx.edu
> > > > oscarnode4.cs.xxx.edu
> > > > oscarnode5.cs.xxx.edu
> > > > oscarnode6.cs.xxx.edu
> > > > oscarnode7.cs.xxx.edu
> > > > oscarnode8.cs.xxx.edu
> > > >
> > > >
> *****************************************************************************
> > > > LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
> > > >
> > > > n-1<16365> ssi:boot:base:linear: booting n0 (athena.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: booting n1 (oscarnode1.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: booting n2 (oscarnode2.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: booting n3 (oscarnode3.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: booting n4 (oscarnode4.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: booting n5 (oscarnode5.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: booting n6 (oscarnode6.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: booting n7 (oscarnode7.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: booting n8 (oscarnode8.cs.xxx.edu)
> > > > n-1<16365> ssi:boot:base:linear: finished
> > > >
> > > > mpirun N hello++
> > > >
> *****************************************************************
> > > > Hello World! I am 0 of 1
> > > > Hello World! I am 0 of 1
> > > > Hello World! I am 0 of 1
> > > > Hello World! I am 0 of 1
> > > > Hello World! I am 0 of 1
> > > > Hello World! I am 0 of 1
> > > >
> -----------------------------------------------------------------------------
> > > > It seems that [at least] one of the processes that was started with
> > > > mpirun did not invoke MPI_INIT before quitting (it is possible that
> > > > more than one process did not invoke MPI_INIT -- mpirun was only
> > > > notified of the first one, which was on node n0).
> > > >
> > > > mpirun can *only* be used with MPI programs (i.e., programs that
> > > > invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> > > > to run non-MPI programs over the lambooted nodes.
> > > >
> -----------------------------------------------------------------------------
> > > > Hello World! I am 0 of 1
> > > >
> ****************************************************************************************************
> > > >
> > >
> >
> >
>
>
