Re: [OMPI users] Problem with running openMPI program

2009-04-29 Thread Gus Correa

Hi Ankush

You can run the MITgcm ocean model test cases and the CAM3 atmospheric
model test with two processors only, but the codes scale well to
any number of processors.
They are "real life" applications, but not too hard to get to work.
It will take some reading of their README and INSTALL files,
and perhaps of their User Guides to understand how they work, though.

You can even run them on a single processor, but if you want to make
the point that OpenMPI works on your cluster, you will also want to use
more than one processor.
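
For example, once a test case is built, a two-processor run is just this
(the executable name mitgcmuv is the MITgcm default; CAM3's binary name
will differ):

  $ mpirun -np 2 ./mitgcmuv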


You can download the tarballs from these links:

http://mitgcm.org/download/
http://www.ccsm.ucar.edu/models/atm-cam/download/

CAM3 will require the NetCDF package, which is easy to install also:
http://www.unidata.ucar.edu/downloads/netcdf/netcdf-3_6_3/index.jsp

You can even get the NetCDF package with yum, if you prefer.
(Try "yum info netcdf".)

However, the MITgcm can work even without NetCDF (although it can
benefit from NetCDF also).

Of course there are simpler MPI programs out there, but they may be
what you called "mathematical computations" as opposed to "real life 
applications".  :)


Somebody already sent you this link before.
It has some simpler MPI programs:

http://www.pdc.kth.se/training/Tutor/MPI/Templates/index-frame.html

These (online) books may have some MPI program examples:

Ian Foster's (online) book (Ch. 8 is on MPI):

http://www.wotug.org/parallel/books/addison-wesley/dbpp/text/book.html

Peter Pacheco's book (a short version is online):

http://www.cs.usfca.edu/mpi/

Here are other MPI program examples (not all are guaranteed to work):

http://www2.cs.uh.edu/~johnson2/labs.html
http://www.redbooks.ibm.com/redbooks/SG245380.html

See more links to MPI tutorials, etc, here:
http://fats-raid.ldeo.columbia.edu/pages/parallel_programming.html#mpi

I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-

Ankush Kaul wrote:

@Gus

the applications in the links you have sent are really high level, and I
believe really expensive too, as I will have to have a physical apparatus
for various measurements along with the cluster. Am I right?










Re: [OMPI users] Problem with running openMPI program

2009-04-29 Thread Ankush Kaul
@Gus

the applications in the links you have sent are really high level, and I
believe really expensive too, as I will have to have a physical apparatus
for various measurements along with the cluster. Am I right?


Re: [OMPI users] Problem with running openMPI program

2009-04-29 Thread Ankush Kaul
Are there any applications that I can implement on a small scale, in a lab
or something?

Also, what can I use for clustering web servers?


Re: [OMPI users] Problem with running openMPI program

2009-04-28 Thread Gus Correa

Hi Ankush

Glad to hear that your MPI and cluster project were successful.

I don't know if you would call these "mathematical computation"
or "real life applications" of MPI and clusters, but here are a
few samples I am familiar with (Earth Science):

Weather forecast:
http://www.wrf-model.org/index.php
http://www.mmm.ucar.edu/mm5/

Climate, Atmosphere and Ocean circulation modeling:
http://www.ccsm.ucar.edu/models/ccsm3.0/
http://www.jamstec.go.jp/esc/index.en.html
http://www.metoffice.gov.uk/climatechange/
http://www.gfdl.noaa.gov/fms
http://www.nemo-ocean.eu/

Earthquakes, computational seismology, and solid Earth dynamics:
http://www.gps.caltech.edu/~jtromp/research/index.html
http://www-esd.lbl.gov/GG/CCS/

A couple of other areas:

Computational Fluid Dynamics, Finite Element Method, etc:
http://www.foamcfd.org/
http://www.cimec.org.ar/twiki/bin/view/Cimec/PETScFEM

Computational Chemistry, molecular dynamics, etc:
http://www.tddft.org/programs/octopus/wiki/index.php/Main_Page
http://classic.chem.msu.su/gran/gamess/
http://ambermd.org/
http://www.gromacs.org/
http://www.charmm.org/

Gus Correa


Ankush Kaul wrote:
Thanks everyone (especially Gus and Jeff) for the support and guidance. We
are on the verge of completing our project, which would not have been
possible without all of you.

I would like to know one more thing: what are real-life applications
that I can use the cluster for (other than mathematical computation)? Can I
use it for my web server, and if yes, then how?





Re: [OMPI users] Problem with running openMPI program

2009-04-28 Thread Jeff Squyres

On Apr 28, 2009, at 1:29 PM, Ankush Kaul wrote:

I would like to know one more thing: what are real-life applications
that I can use the cluster for (other than mathematical computation)?
Can I use it for my web server, and if yes, then how?



Not really.  MPI is just about message passing -- it's frequently used
for parallel computations, but its main purpose in life is message
passing.  Hence, applications have to be explicitly written to use
MPI's API.  Apache doesn't use MPI for communication; there are
other technologies for clustering web servers, etc.


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Jeff Squyres
Excellent answer.  One addendum -- we had a really nice FAQ entry  
about this kind of stuff on the LAM/MPI web site, which I was  
horrified to see that we had not copied to the Open MPI site.  So I  
copied it over this morning.  :-)


Have a look at these 3 FAQ (brand new) entries:

http://www.open-mpi.org/faq/?category=building#overwrite-pre-installed-ompi
http://www.open-mpi.org/faq/?category=building#where-to-install
http://www.open-mpi.org/faq/?category=running#do-i-need-a-common-filesystem

Hope that helps.






--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Gus Correa

Hi Ankush

Jeff already sent clarifications about image processing,
and the portable API nature of OpenMPI (and other MPI implementations).

As for "mpicc: command not found" this is again a problem with your
PATH.
Remember the "locate" command?  :)
Find where mpicc is installed, and put that directory on your PATH.
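
For example, if mpicc turns out to live in /usr/lib/openmpi/bin (a
placeholder; use whatever "locate mpicc" actually reports on your system):

  $ export PATH=/usr/lib/openmpi/bin:$PATH

Put that line in your ~/.bashrc so it takes effect in every shell,
on every node.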

In any case, I would suggest that you choose a central NFS mounted
file system on your cluster master node, and install OpenMPI there,
configuring and building it from source (not from yum).
If this directory is mounted on all nodes, the same OpenMPI will be
available on all nodes.
This will give you a single standard version of OpenMPI across the board.

Clustering can become a very confusing and tricky business if you
have heterogeneous nodes, with different OS/Linux versions,
different MPI versions etc, software installed in different locations
on each node, etc, regardless of whether you use mpiselector or
you set the PATH variable on each node, or you use environment modules
package, or any other technique to setup your environment.
Installing less software, rather than more software,
and doing so in a standardized homogeneous way across all cluster nodes,
will give you a cleaner environment, which is easier to understand,
control, upgrade, and update.

A relatively simple way to install a homogeneous cluster is
to use the Rocks Clusters "rolls" suite,
which is free and based on CentOS.
It will probably give you some extra work in the beginning,
but may be worthwhile in the long run.
See this:
http://www.rocksclusters.org/wordpress/


My two cents.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-





Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Jeff Squyres
MPI is an API specification, meaning that correctly-written MPI  
applications are source-code compatible across all MPI  
implementations.  Hence, these tutorial examples will compile and run  
with Open MPI just as well as they will with LAM/MPI (I actually  
helped write those tutorial examples several years ago :-) ).


You just need to be sure to recompile/re-link each MPI application  
with the MPI implementation that you want to use.  For example, you  
can't compile an MPI application with LAM and use Open MPI's "mpirun"  
to run it -- you need to compile with Open MPI and run with Open MPI's  
mpirun.


Make sense?
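
For instance, if Open MPI is installed under /opt/openmpi (a placeholder
path), compile and launch with that same installation's tools:

  $ /opt/openmpi/bin/mpicc hello.c -o hello
  $ /opt/openmpi/bin/mpirun -np 4 ./hello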





--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Ankush Kaul
I have gone through that course, but I am still not at a stage where I can
develop an MPI program, so I was looking for some image processing programs
on the net.

I will try the imageproc.c program which I found at
http://lam-mpi.lzu.edu.cn/tutorials/nd/part1/

I hope it runs on openmpi.





Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Jeff Squyres
Yes, they will run.  Note that these are toy image processing  
examples; they are no substitute for a real image processing  
application.


You might want to look at a full MPI tutorial to get an understanding  
of MPI itself:


  http://ci-tutor.ncsa.uiuc.edu/login.php

Register (it's free), login, and look for the Introduction to MPI  
tutorial.  It's quite good.







--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Jeff Squyres

On Apr 23, 2009, at 3:40 AM, Ankush Kaul wrote:

I am again stuck on a problem. I connected a new node to my cluster
and installed CentOS 5.2 on it. After that I used yum to install
openmpi, openmpi-libs and openmpi-devel successfully.


Be sure that you have the same version of Open MPI installed on all  
your nodes.



But still when I run the mpicc command it gives me an error:
bash: mpicc: command not found

I found out there is a command mpi-selector but don't know how to use
it.

Is this a new version of openmpi? How do I configure it?



No, it's not a version of Open MPI; it's a mechanism for switching
between multiple different MPI implementations installed on the same
machine.


See the man page or "mpi-selector --help" for details.
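
A typical session might look like this (options as documented in the
mpi-selector man page; the names that --list reports will be whatever is
registered on your machine):

  # mpi-selector --list
  openmpi-1.2.5-gcc
  mpich2-1.0.x-gcc
  # mpi-selector --set openmpi-1.2.5-gcc --system
  # mpi-selector --query

Then open a new shell so the change takes effect.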

--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Ankush Kaul
I found some programs on this link:
http://lam-mpi.lzu.edu.cn/tutorials/nd/part1/

Will these programs run on my openmpi cluster?

Actually, I want to run some image processing programs on my cluster. As I
cannot write the entire code of the programs myself, can anyone tell me
where I can get image processing programs?

I know this is the wrong place to ask, but I thought I would give it a try
as I cannot find anything on the net.


Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Ankush Kaul
@Gus, Eugene
I read all your mails and even followed the same procedure; it was BLAS
that was causing the problem.

Thanks

I am again stuck on a problem. I connected a new node to my cluster and
installed CentOS 5.2 on it. After that I used yum to install
openmpi, openmpi-libs and openmpi-devel successfully.

But still when I run the mpicc command it gives me an error:
*bash: mpicc: command not found*

I found out there is a command *mpi-selector* but don't know how to use it.
Is this a new version of openmpi? How do I configure it?


Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Gus Correa

Hi

Do "yum list | grep mpi" to find the correct package names.
Then uninstall them with "yum remove" using the correct package name.

Don't use yum to install different flavors of MPI.
Things like mpicc, mpirun, MPI libraries, man pages, etc,
will get overwritten in /usr or /usr/local.
If you want to use yum, install only one MPI flavor.
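
For example (the package names below are only illustrative; remove whatever
the grep actually shows on your system):

  # yum list | grep mpi
  # yum remove mpich2 mpich2-devel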

OR (after removing the old packages with yum):

Reinstall OpenMPI from source.
Use --prefix=/your/target/OpenMPI/dir during configure.

Reinstall MPICH2 from source.
Use --prefix=/your/target/MPICH2/dir during configure.

Use different directories for OpenMPI and MPICH2.
Or install only one MPI flavor.
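
A sketch of the two source builds (the target directories are placeholders):

  $ cd openmpi-1.3 && ./configure --prefix=/opt/openmpi && make all install
  $ cd mpich2-1.0.8 && ./configure --prefix=/opt/mpich2 && make && make install

Then pick one by putting its bin directory first on your PATH.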

I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-





Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Gus Correa

Hi

This is an MPICH2 error, not an OpenMPI one.
I saw you sent the same message to the MPICH list.
It looks like you have mixed the two MPI flavors.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-





Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Gus Correa

Hi Ankush

I second Eugene's comments.

I already sent you, in previous emails to this thread,
all the relevant information on where to
get HPL from Netlib (http://netlib.org/benchmark/hpl/),
Goto BLAS from TACC (http://www.tacc.utexas.edu/resources/software/),
and the standard BLAS from Netlib (http://www.netlib.org/blas/).
OK, there go the links once again!

I also sent you instructions on how to install HPL,
exactly as I installed it here two days ago.
Did you read those instructions?
It is one of the messages on this thread.
Check the mailing list archive, if you don't have that email anymore.

Eugene just sent you a gotcha on how to build Netlib BLAS.
Goto BLAS is also easy to install with its "quickbuild" scripts,
and it comes with Readme, QuickInstall, and FAQ files, which you should 
read.


However, somehow you are repeating the same questions
that I and others already answered.
There isn't much more I can say to help you out.

Good luck!

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-





Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Eugene Loh

Ankush Kaul wrote:


@gus
we are not able to make HPL successfully.

I think it has something to do with BLAS.

I cannot find a BLAS tar file on the net; I found an rpm, but the
installation steps are written for a tar file.


First of all, this mail list is for Open MPI issues.  On this list are 
people who are helpful and know about lots of stuff (including things 
having anything at all to do with MPI), but HPL and HPCC have their own 
support mechanisms and you should probably pursue those for HPL questions.


Anyhow, if I google "blas", I immediately come up with netlib.org, which 
is where you can get a BLAS source tar file.  I've had to go through the 
HPL experience myself in the last 0-2 days, and it seems to me that the 
netlib.org site is not responding.  So, one can google "netlib mirror" 
to find mirror sites, poke around a little, and end up getting BLAS from 
the Sandia mirror site.


Short version:  try http://netlib.sandia.gov/blas/blas.tgz

I found a gotcha.  I changed the "g77" in the BLAS/make.inc file to 
become mpif77.  Also, in the HPL hpl/Make.$ARCH file, I used mpif77 for 
the linker.  This way, some Fortran I/O routines used by blas (xerbla.f) 
will be found at link time.  (I was using HPL from HPCC.  Not sure if 
your HPL is the same.)
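
Concretely, the edits were of this form (the variable names FORTRAN, LOADER,
and LINKER are how they appear in the make files on my system; yours may
differ slightly):

  In BLAS/make.inc:    FORTRAN = mpif77    (was: g77)
                       LOADER  = mpif77
  In hpl/Make.$ARCH:   LINKER  = mpif77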


Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Ankush Kaul
@gus
we are not able to make HPL successfully.

I think it has something to do with BLAS.

I cannot find a BLAS tar file on the net; I found an rpm, but the
installation steps are written for a tar file.

#*locate blas* gave us the following result

*[root@ccomp1 hpl]# locate blas
/hpl/include/hpl_blas.h
/hpl/makes/Make.blas
/hpl/src/blas
/hpl/src/blas/HPL_daxpy.c
/hpl/src/blas/HPL_dcopy.c
/hpl/src/blas/HPL_dgemm.c
/hpl/src/blas/HPL_dgemv.c
/hpl/src/blas/HPL_dger.c
/hpl/src/blas/HPL_dscal.c
/hpl/src/blas/HPL_dswap.c
/hpl/src/blas/HPL_dtrsm.c
/hpl/src/blas/HPL_dtrsv.c
/hpl/src/blas/HPL_idamax.c
/hpl/src/blas/ccomp
/hpl/src/blas/i386
/hpl/src/blas/ccomp/Make.inc
/hpl/src/blas/ccomp/Makefile
/hpl/src/blas/i386/Make.inc
/hpl/src/blas/i386/Makefile
/usr/include/boost/numeric/ublas
/usr/include/boost/numeric/ublas/banded.hpp
/usr/include/boost/numeric/ublas/blas.hpp
/usr/include/boost/numeric/ublas/detail
/usr/include/boost/numeric/ublas/exception.hpp
/usr/include/boost/numeric/ublas/expression_types.hpp
/usr/include/boost/numeric/ublas/functional.hpp
/usr/include/boost/numeric/ublas/fwd.hpp
/usr/include/boost/numeric/ublas/hermitian.hpp
/usr/include/boost/numeric/ublas/io.hpp
/usr/include/boost/numeric/ublas/lu.hpp
/usr/include/boost/numeric/ublas/matrix.hpp
/usr/include/boost/numeric/ublas/matrix_expression.hpp
/usr/include/boost/numeric/ublas/matrix_proxy.hpp
/usr/include/boost/numeric/ublas/matrix_sparse.hpp
/usr/include/boost/numeric/ublas/operation.hpp
/usr/include/boost/numeric/ublas/operation_blocked.hpp
/usr/include/boost/numeric/ublas/operation_sparse.hpp
/usr/include/boost/numeric/ublas/storage.hpp
/usr/include/boost/numeric/ublas/storage_sparse.hpp
/usr/include/boost/numeric/ublas/symmetric.hpp
/usr/include/boost/numeric/ublas/traits.hpp
/usr/include/boost/numeric/ublas/triangular.hpp
/usr/include/boost/numeric/ublas/vector.hpp
/usr/include/boost/numeric/ublas/vector_expression.hpp
/usr/include/boost/numeric/ublas/vector_of_vector.hpp
/usr/include/boost/numeric/ublas/vector_proxy.hpp
/usr/include/boost/numeric/ublas/vector_sparse.hpp
/usr/include/boost/numeric/ublas/detail/concepts.hpp
/usr/include/boost/numeric/ublas/detail/config.hpp
/usr/include/boost/numeric/ublas/detail/definitions.hpp
/usr/include/boost/numeric/ublas/detail/documentation.hpp
/usr/include/boost/numeric/ublas/detail/duff.hpp
/usr/include/boost/numeric/ublas/detail/iterator.hpp
/usr/include/boost/numeric/ublas/detail/matrix_assign.hpp
/usr/include/boost/numeric/ublas/detail/raw.hpp
/usr/include/boost/numeric/ublas/detail/returntype_deduction.hpp
/usr/include/boost/numeric/ublas/detail/temporary.hpp
/usr/include/boost/numeric/ublas/detail/vector_assign.hpp
/usr/lib/libblas.so.3
/usr/lib/libblas.so.3.1
/usr/lib/libblas.so.3.1.1
/usr/lib/openoffice.org/basis3.0/share/gallery/htmlexpo/cublast.gif
/usr/lib/openoffice.org/basis3.0/share/gallery/htmlexpo/cublast_.gif
/usr/share/backgrounds/images/tiny_blast_of_red.jpg
/usr/share/doc/blas-3.1.1
/usr/share/doc/blas-3.1.1/blasqr.ps
/usr/share/man/manl/intro_blas1.l.gz*

When we try to make using the following command
*# make arch=ccomp*

it gives the error:
*Makefile:47: Make.inc: No such file or directory
make[2]: *** No rule to make target `Make.inc'.  Stop.
make[2]: Leaving directory `/hpl/src/auxil/ccomp'
make[1]: *** [build_src] Error 2
make[1]: Leaving directory `/hpl'
make: *** [build] Error 2*

The *ccomp* folder is created but the *xhpl* file is not created.
Is it some problem with the config file?





Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Ankush Kaul
I feel the above problem occurred due to installing the mpich package; now
even normal MPI programs are not running.
What should we do? We even tried *yum remove mpich* but it says there are no
packages to remove.
Please help!



Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Ankush Kaul
We are facing another problem. We were trying to install different
benchmarking packages.

Now whenever we try to run the *mpirun* command (which was working perfectly
before) we get this error:
*/usr/local/bin/mpdroot: open failed for root's mpd conf file
mpdtrace (__init__ 1190): forked process failed; status=255*

What's the problem here?





Re: [OMPI users] Problem with running openMPI program

2009-04-21 Thread Gus Correa

Hi Ankush

Ankush Kaul wrote:

@Eugene
they are OK, but we wanted something better, which would more clearly
show the difference between using a single PC and the cluster.

@Prakash
I had problems running the programs, as they were compiling using mpcc
and not mpicc.

@gus
we are trying to figure out the HPL config; it's quite complicated,

I sent you some sketchy instructions to build HPL,
on my last message to this thread.
I built HPL and run it here yesterday that way.
Did you try my suggestions?
Where did you get stuck?

also the locate command lists lots of confusing results.




I would say the list is just long, not really confusing.
You can  find what you need if you want.
Pipe the output of locate through "more", and search carefully.
If you are talking about BLAS try "locate libblas.a" and
"locate libgoto.a".
Those are the libraries you need, and if they are not there
you need to install one of them.
Read my previous email for details.
I hope it will help you get HPL working, if you are interested on HPL.

I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


@jeff
I think you are correct; we may have installed openmpi without VT support,
but is there anything we can do now?

One more thing: I found this program but don't know how to run it:
http://www.cis.udel.edu/~pollock/367/manual/node35.html

Thanks to all of you for putting in so much effort to help us out.








Re: [OMPI users] Problem with running openMPI program

2009-04-21 Thread Eugene Loh




Ankush Kaul wrote:

@Eugene
they are OK, but we wanted something better, which would more clearly
show the difference between using a single PC and the cluster.

Another option is the NAS Parallel Benchmarks.  They are older, but
well known, self-verifying, report performance, and relatively small
and accessible.

@Prakash
I had problems running the programs, as they were compiling using mpcc
and not mpicc.

@gus
we are trying to figure out the HPL config; it's quite complicated, also the
locate command lists lots of confusing results.

@jeff
I think you are correct; we may have installed openmpi without VT support,
but is there anything we can do now?

Reinstall OMPI?

One more thing: I found this program but don't know how to
run it: http://www.cis.udel.edu/~pollock/367/manual/node35.html

That may depend on more than just MPI.  You need some graphics.  You
might need the MPICH MPE environment.

If I understand where you're at on this, you might also try writing
your own MPI programs.  Run something simple.  Then something a little
more complicated.  And so on.  Build something bit by bit.  Good luck.
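
A minimal first program might look like this (plain MPI C; compile with
mpicc and launch with mpirun):

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int rank, size;

      MPI_Init(&argc, &argv);                /* start the MPI runtime  */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I?    */
      MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many processes?    */
      printf("Hello from rank %d of %d\n", rank, size);
      MPI_Finalize();                        /* shut the runtime down  */
      return 0;
  }

  $ mpicc hello.c -o hello
  $ mpirun -np 4 ./hello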




Re: [OMPI users] Problem with running openMPI program

2009-04-21 Thread Ankush Kaul
@Eugene
they are OK, but we wanted something better, which would more clearly show
the difference between using a single PC and the cluster.

@Prakash
I had problems running the programs, as they were compiling using mpcc and
not mpicc.

@gus
we are trying to figure out the HPL config; it's quite complicated, also the
locate command lists lots of confusing results.

@jeff
I think you are correct; we may have installed openmpi without VT support,
but is there anything we can do now?

One more thing: I found this program but don't know how to run it:
http://www.cis.udel.edu/~pollock/367/manual/node35.html

Thanks to all of you for putting in so much effort to help us out.


Re: [OMPI users] Problem with running openMPI program

2009-04-21 Thread Jeff Squyres

On Apr 20, 2009, at 11:08 AM, Ankush Kaul wrote:


I try to run mpicc-vt -c hello.c -o hello

but it gives an error:
bash: mpicc-vt: command not found



It sounds like your Open MPI installation was not built with  
VampirTrace support.  Note that OMPI only included VT in Open MPI v1.3  
and later.  When Open MPI is installed with VT support, mpicc-vt  
should be in $prefix/bin.


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Gus Correa

Hi Amjad, list

HPL has some quirks to install, as I just found out.
It can be done, though.
I had used a precompiled version of HPL on my Rocks cluster before,
but that version is no longer being distributed, unfortunately.

Go to the HPL "setup" directory,
and run the script "make_generic".
This will give you a Make.<arch> template file named Make.UNKNOWN.
You can rename this file "Make.whatever_arch_you_want",
copy it to the HPL top directory,
and edit it,
adjusting the important variable definitions to your system.

For instance, where it says:
CC   = mpicc
replace by:
CC   = /full/path/to/OpenMPI/bin/mpicc
and so on for ARCH, TOPdir, etc.
Some 4-6 variables only need to be changed.
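
For illustration, the edited lines might end up looking like this (every
path below is a placeholder for your system; the variable names are the
ones in the generated template):

  ARCH   = whatever_arch_you_want
  TOPdir = /home/you/hpl
  CC     = /full/path/to/OpenMPI/bin/mpicc
  LINKER = /full/path/to/OpenMPI/bin/mpif77
  LAdir  = /full/path/to/your/BLAS/lib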

These threads show two examples:

http://marc.info/?l=npaci-rocks-discussion&m=123264688212088&w=2
http://marc.info/?l=npaci-rocks-discussion&m=123163114922058&w=2

You will need also a BLAS (basic linear algebra subprograms) library.
You may have one already on your computer.
Do "locate libblas" and "locate libgoto" to search for it.

If you don't have BLAS, you can download the Goto BLAS library
and install it, which is what I did:

http://www.tacc.utexas.edu/resources/software/

The Goto BLAS is probably the fastest version of BLAS.
However, you can try also the more traditional BLAS from Netlib:

http://www.netlib.org/blas/

I found it easier to work with gcc and gfortran (i.e. both BLAS
and OpenMPI compiled with gcc and gfortran), than to use PGI or Intel
compilers.  However, I didn't try hard with PGI and Intel.

Read the HPL TUNING file to learn how to change/adjust
the HPL.dat parameters.
The PxQ value gives you the number of processes for mpiexec.
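
For example, with P = 2 and Q = 4 in HPL.dat, HPL expects an 8-process run:

  $ mpiexec -np 8 ./xhpl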

***

The goal of benchmarking is to measure performance under heavy use
(on a parallel computer using MPI, in the HPL case).
However, other than performance measurements,
benchmark programs in general don't produce additional results.
For instance, HPL does LU factorization of matrices and solves
linear systems with an efficient parallel algorithm.
This by itself is great, and is one reason why it is the
Top500 benchmark:
http://en.wikipedia.org/wiki/TOP500 and 
http://www.top500.org/project/linpack .


However, within HPL the LU decomposition and the
linear system solution are not applied to any particular
concrete problem.
Only the time it takes to run each part of HPL really matters.
The matrices are made up of random numbers, if I remember right,
are totally synthetic, and don't mean anything physical.
Of course LU factorization has tons of applications, but the goal
of HPL is not to explore applications, it is just to measure performance
during the number crunching linear algebra operations using MPI.

HPL will make the case that your cluster is working,
and you can tell your professors that it works with
a performance that you can measure, some X Gflops (see the xhpl output).

However, if you want also to show to your professors
that your cluster can be used for applications,
you may want to run a real world MPI program, say,
in a research area of your college, be it computational chemistry,
weather forecast, electrical engineering, structural engineering,
fluid mechanics, genome research, seismology, etc.
Depending on which area it is,
you may find free MPI programs on the Internet.

My two cents,
Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-





Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Prakash Velayutham

Hi Ankush,

You can get some example MPI programs from
http://www.pdc.kth.se/training/Tutor/MPI/Templates/index-frame.html

You can compare the performance of these in an MPI (single-processor,
multiple-processor) setting and in a non-MPI (serial) setting to show how
it can help their research.


Hope that helps,
Prakash





Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Eugene Loh



When you download Open MPI software, I guess there is an examples 
subdirectory.  Maybe the example codes there would suffice to illustrate 
message passing to someone who is not familiar with it.


HPL (and the HPC Challenge test suite, which includes HPL) may involve a 
little bit of wrestling, but you're probably best off there wading 
through their README files, including following their advice about what 
to do if you encounter problems.


Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Ankush Kaul
Let me describe what I want to do.

I had taken Linux clustering as my final-year engineering project, as I am
really interested in networking.

To tell the truth, our college does not have any professor with knowledge of
clustering.

The aim of our project was just to make a cluster, which we did. Now we have
to show and explain our project to the professors, so I want something to
show them how the cluster works... some program or benchmarking software.

Hope you got the problem.
And thanks again, we really appreciate your patience.


Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Gus Correa

Hi Ankush

Ankush Kaul wrote:

Thanks a lot, I am implementing the passwordless cluster.

I am also trying different benchmarking software and got fed up with all the
problems in all the software I try. I will list a few:


*1) VampirTrace*

I extracted the tar in /vt, then followed these steps:



I never used it.


*$ ./configure --prefix=/vti*
 [...lots of output...]
*$ make all install*

After this, the FAQ on open-mpi.org asks to
'Simply replace the compiler wrappers to activate vampir trace' but
does not tell how I replace the compiler wrappers.

I try to run *mpicc-vt -c hello.c -o hello*

but it gives an error:
bash: mpicc-vt: command not found




It is not on your path.
Use the full path name, which should be /vti/bin/...  or similar
if you did install it in /vti.

Remember, "locate" is your friend!


*2) HPL*

For this I didn't understand the installation steps.




I extracted the tar in /hpl

Then it asks to 'create a file Make.<arch> in the top-level
directory'. I created a file Make.i386.


Are you talking about the Netlib HPL-2.0?
http://netlib.org/benchmark/hpl/

Are your computers i386 (32-bit) or x86_64/em64t (64-bit)?
"uname -a" will tell.

Anyway, read their INSTALL file for where to find
template Make.arch files!


Then it says 'This file essentially contains the compilers
and libraries with their paths to be used'. How do I put that?

After that it asks to run the command *make arch=i386*
but it gives the error:
make[3]: Entering directory `/hpl'
make -f Make.top startup_dir arch=i386
make[4]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
include/i386
make[4]: include/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
lib
make[4]: lib: Command not found
make[4]: [startup_dir] Error 127 (ignored)
lib/i386
make[4]: lib/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
bin
make[4]: bin: Command not found
make[4]: [startup_dir] Error 127 (ignored)
bin/i386
make[4]: bin/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
make[4]: Leaving directory `/hpl'
make -f Make.top startup_src arch=i386
make[4]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
make -f Make.top leaf le=src/auxil   arch=i386
make[5]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
(  src/auxil ;  i386 )
/bin/sh: src/auxil: is a directory

Then it returns to the shell prompt.

Please help, is there a simpler benchmarking software?
I don't want to give up at this point :(


I may have sent you the Intel MPI Benchmark link already.
Google will find it for you.

I wouldn't spend too much time benchmarking
on standard Ethernet TCP/IP.
Did you try your own programs?
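
(For the record, once the Intel suite is built, a run is a one-liner
along these lines; IMB-MPI1 is the binary the suite produces, and the
hostfile name here is an assumption:

  $ mpirun -np 2 -hostfile myhosts ./IMB-MPI1 PingPong

PingPong between two ranks on different nodes gives you latency and
bandwidth numbers for your interconnect.)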

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-










Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Ankush Kaul
Thanks a lot, I am implementing the passwordless cluster.

I am also trying different benchmarking software and got fed up with all
the problems in all the packages I tried. I will list a few:

1) VampirTrace

I extracted the tar in /vt, then followed these steps:

$ ./configure --prefix=/vti
 [...lots of output...]
$ make all install

After this, the FAQ on open-mpi.org asks to 'Simply replace the
compiler wrappers to activate VampirTrace', but does not tell how I
replace the compiler wrappers.

I try to run mpicc-vt -c hello.c -o hello
but it gives an error:
bash: mpicc-vt: command not found


2) HPL

For this I didn't understand the installation steps.

I extracted the tar in /hpl

Then it asks to 'create a file Make.<arch> in the top-level directory';
I created a file Make.i386.
Then it says 'This file essentially contains the compilers
and libraries with their paths to be used'; how do I put that?

After that it asks to run the command make arch=i386,
but it gives this error:
make[3]: Entering directory `/hpl'
make -f Make.top startup_dir arch=i386
make[4]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
include/i386
make[4]: include/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
lib
make[4]: lib: Command not found
make[4]: [startup_dir] Error 127 (ignored)
lib/i386
make[4]: lib/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
bin
make[4]: bin: Command not found
make[4]: [startup_dir] Error 127 (ignored)
bin/i386
make[4]: bin/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
make[4]: Leaving directory `/hpl'
make -f Make.top startup_src arch=i386
make[4]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
make -f Make.top leaf le=src/auxil   arch=i386
make[5]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
(  src/auxil ;  i386 )
/bin/sh: src/auxil: is a directory

Then it returns to the shell prompt.

Please help, is there a simpler benchmarking software?
I don't want to give up at this point :(


Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Gus Correa

Hi Ankush

Please read the FAQ I sent you in the previous message.
That is the answer to your repeated question.
OpenMPI (and all MPIs that I know of) requires passwordless connections.
Your program fails because you didn't setup that.

If it worked with a single compute node,
that was most likely fortuitous,
not by design.
What you see on the screen are the ssh password messages
from your two compute nodes,
but OpenMPI (or any MPI)
won't wait for your typing passwords.
Imagine if you were running your program on 1000 nodes ...,
and, say, running the program 1000 times ...
would you really like to type all those one million passwords?
The design must be scalable.

Here is one recipe for passwordless ssh on clusters:

http://agenda.clustermonkey.net/index.php/Passwordless_SSH_Logins
http://agenda.clustermonkey.net/index.php/Passwordless_SSH_(and_RSH)_Logins

Read it carefully,
the comments about MPI(ch) 1.2 and PVM are somewhat out of date,
however, the ssh recipe is fine, detailed, and clear.
Note also the nuanced difference for NFS mounted home directories
versus separate home directories on each node.

Pay a visit to OpenSSH site also, for more information:
http://www.openssh.com/
http://en.wikipedia.org/wiki/OpenSSH
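
In outline, the recipe boils down to something like this (a sketch
assuming DSA keys, an empty passphrase, and home directories shared
over NFS; see the links above for the careful version):

  $ ssh-keygen -t dsa                                # accept the defaults
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys  # authorize your own key
  $ chmod 600 ~/.ssh/authorized_keys                 # sshd insists on this
  $ ssh 192.168.45.65 hostname                       # should need no password

If home directories are not NFS-shared, the authorized_keys file has to
be copied into ~/.ssh/ on every compute node instead.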

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-

Ankush Kaul wrote:

Let me explain in detail.

When we had only 2 nodes, 1 master (192.168.67.18) + 1 compute node
(192.168.45.65), my openmpi-default-hostfile looked like:
192.168.67.18 slots=2
192.168.45.65 slots=2

After this, on running the command mpirun /work/Pi on the master node, we got:

# root@192.168.45.65's password:

After entering the password, the program ran on both the nodes.

Now, after connecting a second compute node and editing the hostfile:

192.168.67.18 slots=2
192.168.45.65 slots=2
192.168.67.241 slots=2

and then running the command mpirun /work/Pi on the master node, we got:

# root@192.168.45.65's password: root@192.168.67.241's password:

which does not accept the password.

Although we are trying to implement the passwordless cluster, I would
like to know why this problem is occurring.



On Sat, Apr 18, 2009 at 3:40 AM, Gus Correa wrote:


Ankush

You need to setup passwordless connections with ssh to the node you just
added.  You (or somebody else) probably did this already on the
first compute node, otherwise the MPI programs wouldn't run
across the network.

See the very last sentence on this FAQ:

http://www.open-mpi.org/faq/?category=running#run-prereqs

And try this recipe (if you use RSA keys instead of DSA, replace all
"dsa" by "rsa"):


http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3


I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


Ankush Kaul wrote:

Thank you, I am reading up on the tools you suggested.

I am facing another problem: my cluster is working fine with 2
hosts (1 master + 1 compute node), but when I tried to add another
node (1 master + 2 compute nodes) it's not working. It works fine
when I give the command mpirun -host <host> /work/Pi

but when I try to run
mpirun /work/Pi it gives the following error:

root@192.168.45.65's password: root@192.168.67.241's password:


Permission denied, please try again.

root@192.168.45.65's password:

Permission denied, please try again.

root@192.168.45.65's password:

Permission denied (publickey,gssapi-with-mic,password).

Permission denied, please try again.

root@192.168.67.241's password: [ccomp1.cluster:03503] [0,0,0]
ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275


[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1166

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
  

Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Gus Correa

Hi Ankush

Ankush Kaul wrote:

Also, how can I find out where my MPI libraries and include directories are?


If you configured OpenMPI with --prefix=/some/dir they are
in /some/dir/lib and /some/dir/include,
whereas the executables (mpicc, mpiexec, etc) are in /some/dir/bin.
Otherwise OpenMPI defaults to /usr/local.

However, the preferred way to compile OpenMPI programs is to use the
OpenMPI wrappers (e.g. mpicc), and in this case you don't need to
specify the lib and include directories at all.

If you have many MPI flavors in your computers, use full path names
to avoid confusion (or carefully set the OpenMPI bin path ahead of any 
other).
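
For example (a sketch; the hand-rolled line is only to show what the
wrapper does for you, assuming a /usr/local install):

  $ mpicc pi.c -o pi       # wrapper supplies -I, -L, -l automatically
  $ mpicc --showme         # prints the underlying compiler command line

versus spelling it out by hand with
gcc pi.c -o pi -I/usr/local/include -L/usr/local/lib -lmpi.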


The Linux command "locate" helps find things (e.g. "locate mpi.h").
You may need to update the location database
before using it with "updatedb".

I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-




On Sat, Apr 18, 2009 at 2:29 PM, Ankush Kaul wrote:


Let me explain in detail.

When we had only 2 nodes, 1 master (192.168.67.18) + 1 compute node
(192.168.45.65),
my openmpi-default-hostfile looked like:
192.168.67.18 slots=2
192.168.45.65 slots=2

After this, on running the command mpirun /work/Pi on the master node,
we got:

# root@192.168.45.65's password:

After entering the password, the program ran on both the nodes.

Now, after connecting a second compute node and editing the hostfile:

192.168.67.18 slots=2
192.168.45.65 slots=2
192.168.67.241 slots=2

and then running the command mpirun /work/Pi on the master node, we got:

# root@192.168.45.65's password:
root@192.168.67.241's password:

which does not accept the password.

Although we are trying to implement the passwordless cluster, I would
like to know why this problem is occurring.


On Sat, Apr 18, 2009 at 3:40 AM, Gus Correa wrote:

Ankush

You need to setup passwordless connections with ssh to the node
you just
added.  You (or somebody else) probably did this already on the
first compute node, otherwise the MPI programs wouldn't run
across the network.

See the very last sentence on this FAQ:

http://www.open-mpi.org/faq/?category=running#run-prereqs

And try this recipe (if you use RSA keys instead of DSA, replace
all "dsa" by "rsa"):


http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3


I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


Ankush Kaul wrote:

Thank you, I am reading up on the tools you suggested.

I am facing another problem: my cluster is working fine with
2 hosts (1 master + 1 compute node), but when I tried to add
another node (1 master + 2 compute nodes) it's not working. It
works fine when I give the command mpirun -host <host>
/work/Pi

but when I try to run
mpirun /work/Pi it gives the following error:

root@192.168.45.65's password: root@192.168.67.241's password:


Permission denied, please try again.

root@192.168.45.65's password:

Permission denied, please try again.

root@192.168.45.65's password:

Permission denied (publickey,gssapi-with-mic,password).

Permission denied, please try again.

root@192.168.67.241's password: [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG:
Timeout in file base/pls_base_orted_cmds.c at line 275


[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in
file pls_rsh_module.c at line 1166

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in
file 

Re: [OMPI users] Problem with running openMPI program

2009-04-19 Thread Ankush Kaul
Also, how can I find out where my MPI libraries and include directories are?

On Sat, Apr 18, 2009 at 2:29 PM, Ankush Kaul  wrote:

> Let me explain in detail.
>
> When we had only 2 nodes, 1 master (192.168.67.18) + 1 compute node
> (192.168.45.65),
> my openmpi-default-hostfile looked like:
> 192.168.67.18 slots=2
> 192.168.45.65 slots=2
>
> After this, on running the command mpirun /work/Pi on the master node, we got:
>
> # root@192.168.45.65's password:
>
> After entering the password, the program ran on both the nodes.
>
> Now, after connecting a second compute node and editing the hostfile:
>
> 192.168.67.18 slots=2
> 192.168.45.65 slots=2
> 192.168.67.241 slots=2
>
> and then running the command mpirun /work/Pi on the master node, we got:
>
> # root@192.168.45.65's password: root@192.168.67.241's password:
>
> which does not accept the password.
>
> Although we are trying to implement the passwordless cluster, I would like to
> know why this problem is occurring.
>
>
> On Sat, Apr 18, 2009 at 3:40 AM, Gus Correa  wrote:
>
>> Ankush
>>
>> You need to setup passwordless connections with ssh to the node you just
>> added.  You (or somebody else) probably did this already on the first
>> compute node, otherwise the MPI programs wouldn't run
>> across the network.
>>
>> See the very last sentence on this FAQ:
>>
>> http://www.open-mpi.org/faq/?category=running#run-prereqs
>>
>> And try this recipe (if you use RSA keys instead of DSA, replace all "dsa"
>> by "rsa"):
>>
>>
>> http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3
>>
>> I hope this helps.
>>
>> Gus Correa
>> -
>> Gustavo Correa
>> Lamont-Doherty Earth Observatory - Columbia University
>> Palisades, NY, 10964-8000 - USA
>> -
>>
>>
>> Ankush Kaul wrote:
>>
>>> Thank you, I am reading up on the tools you suggested.
>>>
>>> I am facing another problem: my cluster is working fine with 2 hosts (1
>>> master + 1 compute node), but when I tried to add another node (1 master + 2
>>> compute nodes) it's not working. It works fine when I give the command mpirun
>>> -host <host> /work/Pi
>>>
>>> but when I try to run
>>> mpirun /work/Pi it gives the following error:
>>>
>>> root@192.168.45.65's password:
>>> root@192.168.67.241's password:
>>>
>>> Permission denied, please try again. 
>>>
>>> root@192.168.45.65's password:
>>>
>>> Permission denied, please try again.
>>>
>>> root@192.168.45.65's password:
>>>
>>> Permission denied (publickey,gssapi-with-mic,password).
>>>
>>>
>>> Permission denied, please try again.
>>>
>>> root@192.168.67.241's password:
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> base/pls_base_orted_cmds.c at line 275
>>>
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> pls_rsh_module.c at line 1166
>>>
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> errmgr_hnp.c at line 90
>>>
>>> [ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to
>>> start as expected.
>>>
>>> [ccomp1.cluster:03503] ERROR: There may be more information available
>>> from
>>>
>>> [ccomp1.cluster:03503] ERROR: the remote shell (see above).
>>>
>>> [ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status
>>> 255.
>>>
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> base/pls_base_orted_cmds.c at line 188
>>>
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> pls_rsh_module.c at line 1198
>>>
>>>
>>> What is the problem here?
>>>
>>>
>>> --
>>>
>>> mpirun was unable to cleanly terminate the daemons for this job. Returned
>>> value Timeout instead of ORTE_SUCCESS
>>>
>>>
>>> On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh <eugene@sun.com> wrote:
>>>
>>>Ankush Kaul wrote:
>>>
>>> Finally, after mentioning the hostfiles, the cluster is working
>>> fine. We downloaded a few benchmarking packages, but I would like
>>> to know if there is any GUI-based benchmarking software, so that
>>> it's easier to demonstrate the working of our cluster while
>>> displaying our cluster.
>>>
>>>
>>>I'm confused what you're looking for here, but thought I'd venture a
>>>suggestion.
>>>
>>>There are GUI-based performance analysis and tracing tools.  E.g.,
>>>run a program, [[semi-]automatically] collect performance data, run
>>>a GUI-based analysis tool on the data, visualize what happened on
>>>your cluster.  Would this suit your purposes?
>>>
>>>If so, there are a variety of tools out there you could try.  Some
>>>are platform-specific or cost money.  Some 

Re: [OMPI users] Problem with running openMPI program

2009-04-18 Thread Ankush Kaul
Let me explain in detail.

When we had only 2 nodes, 1 master (192.168.67.18) + 1 compute node
(192.168.45.65),
my openmpi-default-hostfile looked like:
192.168.67.18 slots=2
192.168.45.65 slots=2

After this, on running the command mpirun /work/Pi on the master node, we got:

# root@192.168.45.65's password:

After entering the password, the program ran on both the nodes.

Now, after connecting a second compute node and editing the hostfile:

192.168.67.18 slots=2
192.168.45.65 slots=2
192.168.67.241 slots=2

and then running the command mpirun /work/Pi on the master node, we got:

# root@192.168.45.65's password: root@192.168.67.241's password:

which does not accept the password.

Although we are trying to implement the passwordless cluster, I would like to
know why this problem is occurring.


On Sat, Apr 18, 2009 at 3:40 AM, Gus Correa  wrote:

> Ankush
>
> You need to setup passwordless connections with ssh to the node you just
> added.  You (or somebody else) probably did this already on the first
> compute node, otherwise the MPI programs wouldn't run
> across the network.
>
> See the very last sentence on this FAQ:
>
> http://www.open-mpi.org/faq/?category=running#run-prereqs
>
> And try this recipe (if you use RSA keys instead of DSA, replace all "dsa"
> by "rsa"):
>
>
> http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3
>
> I hope this helps.
>
> Gus Correa
> -
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> -
>
>
> Ankush Kaul wrote:
>
>> Thank you, I am reading up on the tools you suggested.
>>
>> I am facing another problem: my cluster is working fine with 2 hosts (1
>> master + 1 compute node), but when I tried to add another node (1 master + 2
>> compute nodes) it's not working. It works fine when I give the command mpirun
>> -host <host> /work/Pi
>>
>> but when I try to run
>> mpirun /work/Pi it gives the following error:
>>
>> root@192.168.45.65's password:
>> root@192.168.67.241's password:
>>
>> Permission denied, please try again.
>>
>> root@192.168.45.65's password:
>>
>> Permission denied, please try again.
>>
>> root@192.168.45.65's password:
>>
>> Permission denied (publickey,gssapi-with-mic,password).
>>
>>
>> Permission denied, please try again.
>>
>> root@192.168.67.241's password:
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> base/pls_base_orted_cmds.c at line 275
>>
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> pls_rsh_module.c at line 1166
>>
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> errmgr_hnp.c at line 90
>>
>> [ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to
>> start as expected.
>>
>> [ccomp1.cluster:03503] ERROR: There may be more information available from
>>
>> [ccomp1.cluster:03503] ERROR: the remote shell (see above).
>>
>> [ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status
>> 255.
>>
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> base/pls_base_orted_cmds.c at line 188
>>
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> pls_rsh_module.c at line 1198
>>
>>
>> What is the problem here?
>>
>> --
>>
>> mpirun was unable to cleanly terminate the daemons for this job. Returned
>> value Timeout instead of ORTE_SUCCESS
>>
>>
>> On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh <eugene@sun.com> wrote:
>>
>>Ankush Kaul wrote:
>>
>> Finally, after mentioning the hostfiles, the cluster is working
>> fine. We downloaded a few benchmarking packages, but I would like
>> to know if there is any GUI-based benchmarking software, so that
>> it's easier to demonstrate the working of our cluster while
>> displaying our cluster.
>>
>>
>>I'm confused what you're looking for here, but thought I'd venture a
>>suggestion.
>>
>>There are GUI-based performance analysis and tracing tools.  E.g.,
>>run a program, [[semi-]automatically] collect performance data, run
>>a GUI-based analysis tool on the data, visualize what happened on
>>your cluster.  Would this suit your purposes?
>>
>>If so, there are a variety of tools out there you could try.  Some
>>are platform-specific or cost money.  Some are widely/freely
>>available.  Examples of these tools include Intel Trace Analyzer,
>>Jumpshot, Vampir, TAU, etc.  I do know that Sun Studio (Performance
>>Analyzer) is available via free download on x86 and SPARC and Linux
>>and Solaris and works with OMPI.  Possibly the same with Jumpshot.
>> 

Re: [OMPI users] Problem with running openMPI program

2009-04-17 Thread Gus Correa

Ankush

You need to setup passwordless connections with ssh to the node you just
added.  You (or somebody else) probably did this already on the first 
compute node, otherwise the MPI programs wouldn't run

across the network.

See the very last sentence on this FAQ:

http://www.open-mpi.org/faq/?category=running#run-prereqs

And try this recipe (if you use RSA keys instead of DSA, replace all 
"dsa" by "rsa"):


http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3

I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


Ankush Kaul wrote:

Thank you, I am reading up on the tools you suggested.

I am facing another problem: my cluster is working fine with 2 hosts (1
master + 1 compute node), but when I tried to add another node (1 master +
2 compute nodes) it's not working. It works fine when I give the command
mpirun -host <host> /work/Pi

but when I try to run
mpirun /work/Pi it gives the following error:

root@192.168.45.65's password:
root@192.168.67.241's password:


Permission denied, please try again. 

root@192.168.45.65's password:

Permission denied, please try again.

root@192.168.45.65's password:

Permission denied (publickey,gssapi-with-mic,password).

Permission denied, please try again.

root@192.168.67.241's password:
[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
base/pls_base_orted_cmds.c at line 275


[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
pls_rsh_module.c at line 1166


[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
errmgr_hnp.c at line 90


[ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to 
start as expected.


[ccomp1.cluster:03503] ERROR: There may be more information available from

[ccomp1.cluster:03503] ERROR: the remote shell (see above).

[ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status 
255.


[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
base/pls_base_orted_cmds.c at line 188


[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
pls_rsh_module.c at line 1198



What is the problem here?

--

mpirun was unable to cleanly terminate the daemons for this job. 
Returned value Timeout instead of ORTE_SUCCESS



On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh wrote:


Ankush Kaul wrote:

Finally, after mentioning the hostfiles, the cluster is working
fine. We downloaded a few benchmarking packages, but I would like
to know if there is any GUI-based benchmarking software, so that
it's easier to demonstrate the working of our cluster while
displaying our cluster.


I'm confused what you're looking for here, but thought I'd venture a
suggestion.

There are GUI-based performance analysis and tracing tools.  E.g.,
run a program, [[semi-]automatically] collect performance data, run
a GUI-based analysis tool on the data, visualize what happened on
your cluster.  Would this suit your purposes?

If so, there are a variety of tools out there you could try.  Some
are platform-specific or cost money.  Some are widely/freely
available.  Examples of these tools include Intel Trace Analyzer,
Jumpshot, Vampir, TAU, etc.  I do know that Sun Studio (Performance
Analyzer) is available via free download on x86 and SPARC and Linux
and Solaris and works with OMPI.  Possibly the same with Jumpshot.
 VampirTrace instrumentation is already in OMPI, but then you need
to figure out the analysis-tool part.  (I think the Vampir GUI tool
requires a license, but I'm not sure.  Maybe you can convert to TAU,
which is probably available for free download.)

Anyhow, I don't even know if that sort of thing fits your
requirements.  Just an idea.










Re: [OMPI users] Problem with running openMPI program

2009-04-17 Thread Ankush Kaul
Thank you, I am reading up on the tools you suggested.
I am facing another problem: my cluster is working fine with 2 hosts (1
master + 1 compute node), but when I tried to add another node (1 master + 2
compute nodes) it's not working. It works fine when I give the command
mpirun -host <host> /work/Pi

but when I try to run
mpirun /work/Pi it gives the following error:

root@192.168.45.65's password: root@192.168.67.241's password:

Permission denied, please try again. 

root@192.168.45.65's password:

Permission denied, please try again.

root@192.168.45.65's password:

Permission denied (publickey,gssapi-with-mic,password).



Permission denied, please try again.

root@192.168.67.241's password: [ccomp1.cluster:03503] [0,0,0]
ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1166

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c
at line 90

[ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to start
as expected.

[ccomp1.cluster:03503] ERROR: There may be more information available from

[ccomp1.cluster:03503] ERROR: the remote shell (see above).

[ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status
255.

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
base/pls_base_orted_cmds.c at line 188

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1198


What is the problem here?

--

mpirun was unable to cleanly terminate the daemons for this job. Returned
value Timeout instead of ORTE_SUCCESS

On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh  wrote:

> Ankush Kaul wrote:
>
>  Finally, after mentioning the hostfiles, the cluster is working fine. We
>> downloaded a few benchmarking packages, but I would like to know if there is
>> any GUI-based benchmarking software, so that it's easier to demonstrate the
>> working of our cluster while displaying our cluster.
>>
>
> I'm confused what you're looking for here, but thought I'd venture a
> suggestion.
>
> There are GUI-based performance analysis and tracing tools.  E.g., run a
> program, [[semi-]automatically] collect performance data, run a GUI-based
> analysis tool on the data, visualize what happened on your cluster.  Would
> this suit your purposes?
>
> If so, there are a variety of tools out there you could try.  Some are
> platform-specific or cost money.  Some are widely/freely available.
>  Examples of these tools include Intel Trace Analyzer, Jumpshot, Vampir,
> TAU, etc.  I do know that Sun Studio (Performance Analyzer) is available via
> free download on x86 and SPARC and Linux and Solaris and works with OMPI.
>  Possibly the same with Jumpshot.  VampirTrace instrumentation is already in
> OMPI, but then you need to figure out the analysis-tool part.  (I think the
> Vampir GUI tool requires a license, but I'm not sure.  Maybe you can convert
> to TAU, which is probably available for free download.)
>
> Anyhow, I don't even know if that sort of thing fits your requirements.
>  Just an idea.
>
>


Re: [OMPI users] Problem with running openMPI program

2009-04-14 Thread Eugene Loh

Ankush Kaul wrote:

Finally, after mentioning the hostfiles, the cluster is working fine.
We downloaded a few benchmarking packages, but I would like to know if
there is any GUI-based benchmarking software, so that it's easier to
demonstrate the working of our cluster while displaying our cluster.


I'm confused what you're looking for here, but thought I'd venture a 
suggestion.


There are GUI-based performance analysis and tracing tools.  E.g., run a 
program, [[semi-]automatically] collect performance data, run a 
GUI-based analysis tool on the data, visualize what happened on your 
cluster.  Would this suit your purposes?


If so, there are a variety of tools out there you could try.  Some are 
platform-specific or cost money.  Some are widely/freely available.  
Examples of these tools include Intel Trace Analyzer, Jumpshot, Vampir, 
TAU, etc.  I do know that Sun Studio (Performance Analyzer) is available 
via free download on x86 and SPARC and Linux and Solaris and works with 
OMPI.  Possibly the same with Jumpshot.  VampirTrace instrumentation is 
already in OMPI, but then you need to figure out the analysis-tool 
part.  (I think the Vampir GUI tool requires a license, but I'm not 
sure.  Maybe you can convert to TAU, which is probably available for 
free download.)


Anyhow, I don't even know if that sort of thing fits your requirements.  
Just an idea.


Re: [OMPI users] Problem with running openMPI program

2009-04-14 Thread Jeff Squyres

On Apr 14, 2009, at 2:57 AM, Ankush Kaul wrote:

Finally, after mentioning the hostfiles, the cluster is working fine.
We downloaded a few benchmarking packages, but I would like to know if
there is any GUI-based benchmarking software, so that it's easier to
demonstrate the working of our cluster while displaying our cluster.



There are a few, but most dump out data that can either be directly  
plotted and/or parsed and then plotted using your favorite plotting  
software.


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Problem with running openMPI program

2009-04-14 Thread Ankush Kaul
Finally, after mentioning the hostfiles, the cluster is working fine. We
downloaded a few benchmarking packages, but I would like to know if there is
any GUI-based benchmarking software, so that it's easier to demonstrate the
working of our cluster while displaying our cluster.
Regards
Ankush


Re: [OMPI users] Problem with running openMPI program

2009-04-13 Thread Gus Correa

Ankush Kaul wrote:


I am able to run the program on the server node, but on the compute node
the program only runs in the directory on which /work is mounted (/work
on the server contains the Pi program).

Also, while running Pi, it shows the process running only on the server,
not the compute node (using top).



Hi Ankush, list

I am not sure I understand your machine setup,
but maybe it is a "server" machine and a "compute node"
somehow connected through a network
(or directly by an Ethernet cable), right?

If that is the case, yes you will be able to launch a program with 
mpirun on the server machine, but it will only run in the compute node

if the work directory is mounted by the compute node.
This is the preferred way to run MPI programs.

If you want to run on a directory that is not exported to and mounted on
the compute node, you have to copy over all files (executable, input 
files, etc) to that directory on the compute node.

This is not as comfortable a way to run MPI programs as the alternative
above.

Moreover, you need to tell mpiexec where you want the processes to run.
There are two basic ways to do this.
You can specify the nodes on the command line with the -host option,
or you can specify them in a file with the -hostfile option.
Do "mpiexec --help" to learn the details.

I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


On Sat, Apr 11, 2009 at 1:34 PM, Ankush Kaul wrote:


Can you please suggest a simple benchmarking software? Are there any
GUI benchmarking tools available?


On Tue, Apr 7, 2009 at 2:29 PM, Ankush Kaul wrote:

Thank you sir, thanks a lot.

The information you provided helped us a lot. I am currently going
through the OpenMPI FAQ and will contact you in case of any doubts.

Regards,
Ankush Kaul










Re: [OMPI users] Problem with running openMPI program

2009-04-13 Thread Gus Correa

Hi Ankush

To test if OpenMPI works, compile and run the examples (hello_c, etc)
in the examples/ directory (in the directory where you decompressed
the OpenMPI tarball, not where you installed OpenMPI).
Compile them with mpicc, etc, and run them with mpiexec,
all from OpenMPI.
Using full path names helps avoid confusion with other
MPI flavors.
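
For example, assuming OpenMPI was installed under /usr/local/openmpi
(substitute your own prefix):

  $ cd examples
  $ /usr/local/openmpi/bin/mpicc hello_c.c -o hello_c
  $ /usr/local/openmpi/bin/mpiexec -np 2 ./hello_c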

One MPI benchmark available free from Intel:

http://www.intel.com/cd/software/products/asmo-na/eng/219848.htm

There may be others though.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


Ankush Kaul wrote:
Can you please suggest a simple benchmarking software? Are there any GUI
benchmarking tools available?


On Tue, Apr 7, 2009 at 2:29 PM, Ankush Kaul wrote:


Thank you sir, thanks a lot.

The information you provided helped us a lot. I am currently going
through the OpenMPI FAQ and will contact you in case of any doubts.

Regards,
Ankush Kaul









Re: [OMPI users] Problem with running openMPI program

2009-04-11 Thread Ankush Kaul
I am able to run the program on the server node, but on the compute node the
program only runs in the directory on which /work is mounted (/work
on the server contains the Pi program).

Also, while running Pi, it shows the process running only on the server, not
the compute node (using top).

On Sat, Apr 11, 2009 at 1:34 PM, Ankush Kaul  wrote:

> Can you please suggest a simple benchmarking software? Are there any GUI
> benchmarking tools available?
>
>
> On Tue, Apr 7, 2009 at 2:29 PM, Ankush Kaul wrote:
>
>> Thank you sir, thanks a lot.
>>
>> The information you provided helped us a lot. I am currently going through
>> the OpenMPI FAQ and will contact you in case of any doubts.
>>
>> Regards,
>> Ankush Kaul
>>
>
>


Re: [OMPI users] Problem with running openMPI program

2009-04-11 Thread Ankush Kaul
Can you please suggest a simple benchmarking software? Are there any GUI
benchmarking tools available?

On Tue, Apr 7, 2009 at 2:29 PM, Ankush Kaul  wrote:

> Thank you sir, thanks a lot.
>
> The information you provided helped us a lot. I am currently going through
> the OpenMPI FAQ and will contact you in case of any doubts.
>
> Regards,
> Ankush Kaul
>


Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Gus Correa

Hi Ankush

Ankush Kaul wrote:

I am not able to check if the NFS export/mount of /tmp is working;
when I give the command ssh 192.168.45.65 192.168.67.18 I get the
error: bash: 192.168.67.18: command not found




The ssh command syntax above is wrong.
Use only one IP address, which should be your remote machine's IP.

Assuming you are logged in to 192.168.67.18 (is this the master ?),
and want to ssh to 192.168.45.65 (is this the slave ?),
and run the command 'my_command' there, do:

ssh 192.168.45.65 'my_command'

If you already set up the passwordless ssh connection,
this should work.


Let me explain what I understood using an example.

First, I make a folder '/work directory' on my master node.


Yes ...
... but don't use spaces in Linux/Unix names! Never!
It is either "/work"
or "/work_directory".
Using "/work directory" with a blank space in-between
is to ask for real trouble!
This is OK in Windows, but raises hell on Linux/Unix.
In Linux/Unix blank space is a separator for everything,
so it will interpret only the first chunk of your directory name,
and think that what comes after the blank is another directory name,
or a command option, or whatever else.

You can create subdirectories there also, to put your own
programs.
Or maybe one subdirectory
for each user, and change the ownership of each subdirectory
to the corresponding user.

As root, on the master node, do:

cd /work
whoami  (this will give you your own user-name)
mkdir user-name
chown  user-name:user-name  user-name  (pay attention to the : and blanks!)

Then I mount this directory on a folder named '/work directory/mnt' on
the slave node.

Is this correct?


No.
The easy thing to do is to use the same name for the mountpoint
as the original directory, say, /work only, if you called
it /work on the master node.
Again, don't use white space on Linux/Unix names!

Create a mountpoint directory called /work on the slave node:

mkdir /work

Don't populate the slave node /work directory,
as it is just a mountpoint.
Leave it empty.
Then use it to mount the actual /work directory that you
want to export from the master node.
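
The export/mount itself is short (a sketch, assuming the master is
192.168.67.18 and both nodes sit somewhere in 192.168.0.0/16; adjust to
your network):

  # on the master node, add to /etc/exports:
  /work 192.168.0.0/16(rw,sync)
  # then reload the export table:
  $ exportfs -ra
  # on the slave node:
  $ mkdir /work
  $ mount 192.168.67.18:/work /work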



Also, how and where (is it on the master node) do I give the list of
hosts?


On the master node, in the mpirun command line.

As I said, do "/full/path/to/openmpi/bin/mpirun --help" to get
a lot of information about the mpirun command options.



And by hosts you mean the compute nodes?




By hosts I mean whatever computers you want to run your MPI program on.
It can be the master only, the slave only, or both.

The (excellent) OpenMPI FAQ may also help you:

http://www.open-mpi.org/faq/

Many of your questions may have been answered there already.
I encourage you to read them, particularly the General Information,
Building, and Running Jobs ones.

Please bear with me, as this is the first time I am doing a project on Linux
clustering.




Welcome, and good luck!

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-

On Mon, Apr 6, 2009 at 9:27 PM, Gus Correa wrote:


Hi Ankush

If I remember right,
mpirun will put you on your home directory, not on /tmp,
when it starts your ssh session.
To run on /tmp (or on /mnt/nfs)
you may need to use "-path" option.

Likewise, you may want to give mpirun a list of hosts (-host option)
or a hostfile (-hostfile option), to specify where you want the
program to run.

Do
"/full/path/to/openmpi/mpriun -help"
for details.

Make sure your NFS export/mount of /tmp is working,
say, by doing:

ssh slave_node 'hostname; ls /tmp; ls /mnt/nfs'

or similar, and see if your  program "pi" is really there (and where).

Actually, it may be confusing to export /tmp, as it is part
of the basic Linux directory tree,
which is the reason why you mounted it on /mnt/nfs.
You may want to choose to export/mount
a directory that is not so generic as /tmp,
so that you can use a consistent name on both computers.
For instance, you can create a /my_export or /work directory
(or whatever name you prefer) on the master node,
export it to the slave node, mount it on the slave node
with the same name/mountpoint, and use it for your MPI work.

I hope this helps.
Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-

Ankush Kaul wrote:

Thank you sir,
one more thing I am confused about: suppose I have to run a 'pi'
program using Open MPI, where do I place the program?


Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Ankush Kaul
I am not able to check if the NFS export/mount of /tmp is working;
when I give the command ssh 192.168.45.65 192.168.67.18 I get the error:
bash: 192.168.67.18: command not found

Let me explain what I understood using an example.

First, I make a folder '/work directory' on my master node.

Then I mount this directory on a folder named '/work directory/mnt' on the
slave node.

Is this correct?

Also, how and where (is it on the master node) do I give the list of hosts?
And by hosts you mean the compute nodes?

Please bear with me, as this is the first time I am doing a project on Linux
clustering.

On Mon, Apr 6, 2009 at 9:27 PM, Gus Correa  wrote:

> Hi Ankush
>
> If I remember right,
> mpirun will put you on your home directory, not on /tmp,
> when it starts your ssh session.
> To run on /tmp (or on /mnt/nfs)
> you may need to use "-path" option.
>
> Likewise, you may want to give mpirun a list of hosts (-host option)
> or a hostfile (-hostfile option), to specify where you want the
> program to run.
>
> Do
> "/full/path/to/openmpi/mpriun -help"
> for details.
>
> Make sure your NFS export/mount of /tmp is working,
> say, by doing:
>
> ssh slave_node 'hostname; ls /tmp; ls /mnt/nfs'
>
> or similar, and see if your  program "pi" is really there (and where).
>
> Actually, it may be confusing to export /tmp, as it is part
> of the basic Linux directory tree,
> which is the reason why you mounted it on /mnt/nfs.
> You may want to choose to export/mount
> a directory that is not so generic as /tmp,
> so that you can use a consistent name on both computers.
> For instance, you can create a /my_export or /work directory
> (or whatever name you prefer) on the master node,
> export it to the slave node, mount it on the slave node
> with the same name/mountpoint, and use it for your MPI work.
>
> I hope this helps.
> Gus Correa
> -
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> -
>
> Ankush Kaul wrote:
>
>> Thank you sir,
>> one more thing I am confused about: suppose I have to run a 'pi' program
>> using Open MPI, where do I place the program?
>>
>> Currently I have placed it in the /tmp folder on the master node. This /tmp
>> folder is mounted on /mnt/nfs of the compute node.
>>
>> I run the program from the /tmp folder on the master node; is this correct?
>>
>> I am a newbie and really need some help. Thanks in advance.
>>
>> On Mon, Apr 6, 2009 at 8:43 PM, John Hearns <hear...@googlemail.com> wrote:
>>
>>    2009/4/6 Ankush Kaul:
>>    > Also how do I come to know that the program is using resources of both
>>    > the nodes?
>>
>>Log into the second node before you start the program.
>>Run 'top'
>>Seriously - top is a very, very useful utility.
>>
>>
>>
>> 
>>
>>
>
>


Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Ankush Kaul
Thank you sir,
one more thing I am confused about: suppose I have to run a 'pi' program
using Open MPI, where do I place the program?

Currently I have placed it in the /tmp folder on the master node. This /tmp
folder is mounted on /mnt/nfs of the compute node.

I run the program from the /tmp folder on the master node; is this correct?

I am a newbie and really need some help. Thanks in advance.

On Mon, Apr 6, 2009 at 8:43 PM, John Hearns  wrote:

> 2009/4/6 Ankush Kaul:
> > Also how do I come to know that the program is using resources of both
> > the nodes?
>
> Log into the second node before you start the program.
> Run 'top'
> Seriously - top is a very, very useful utility.
>


Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread John Hearns
2009/4/6 Ankush Kaul:
> Also how do I come to know that the program is using resources of both the
> nodes?

Log into the second node before you start the program.
Run 'top'
Seriously - top is a very, very useful utility.


Re: [OMPI users] Problem with running openMPI program

2009-04-04 Thread Jeff Squyres

It might be best to:

1. Setup a non-root user to run MPI applications
2. Setup SSH keys between the hosts for this non-root user so that you
can "ssh <remotehost> uptime" and not be prompted for a
password/passphrase


This should help.
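
In practice that is just (a sketch; 'mpiuser' is an assumed name, and
the key setup mirrors the recipes quoted earlier in this thread):

  # as root, on every node:
  $ useradd -m mpiuser
  $ passwd mpiuser
  # then, logged in as mpiuser, generate a key and authorize it:
  $ ssh-keygen -t dsa
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
  $ chmod 600 ~/.ssh/authorized_keys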


On Apr 4, 2009, at 5:51 AM, Ankush Kaul wrote:


I followed the steps given here to set up an OpenMPI cluster:
http://www.ps3cluster.umassd.edu/step3mpi.html

My cluster consists of two nodes, master (192.168.67.18) and
slave (192.168.45.65), connected directly through a crossover cable.


After setting up the cluster and configuring the master node, I
mounted the /tmp folder of the master node on the slave node (I had some
problems with NFS at first, but I worked my way out of it).


Then I copied the 'pi.c' program into the /tmp folder and successfully
compiled it, giving me a binary file 'pi'.


Now when I try to run the binary file using the following command:

# mpirun -np 2 ./Pi

root@192.168.45.65's password:

After entering the password it gives the following error:

bash: orted: command not found
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/ 
pls_base_orted_cmds.c at line 275
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file  
pls_rsh_module.c at line 1166
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file  
errmgr_hnp.c at line 90
[ccomp.cluster:18963] ERROR: A daemon on node 192.168.45.65 failed  
to start as expected.
[ccomp.cluster:18963] ERROR: There may be more information available  
from

[ccomp.cluster:18963] ERROR: the remote shell (see above).
[ccomp.cluster:18963] ERROR: The daemon exited unexpectedly with  
status 127.
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/ 
pls_base_orted_cmds.c at line 188
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file  
pls_rsh_module.c at line 1198

--
mpirun was unable to cleanly terminate the daemons for this job.  
Returned value Timeout instead of ORTE_SUCCESS.

--

I am totally lost now, as this is the first time I am working on a
cluster project, and I need some help.


Thank you
Ankush




--
Jeff Squyres
Cisco Systems