Re: [OMPI users] Problem with running openMPI program

2009-04-29 Thread Ankush Kaul
@Gus

The applications in the links you have sent are really high-level, and I believe
really expensive too, since I would need physical apparatus for various
measurements along with the cluster. Am I right?


Re: [OMPI users] Problem with running openMPI program

2009-04-29 Thread Ankush Kaul
Are there any applications that I can implement on a small scale, in a lab or
something?

Also, what should I do for clustering web servers?


On Wed, Apr 29, 2009 at 2:46 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Hi Ankush
>
> Glad to hear that your MPI and cluster project were successful.
>
> I don't know if you would call these "mathematical computation"
> or "real life applications" of MPI and clusters, but here are a
> few samples I am familiar with (Earth Science):
>
> Weather forecast:
> http://www.wrf-model.org/index.php
> http://www.mmm.ucar.edu/mm5/
>
> Climate, Atmosphere and Ocean circulation modeling:
> http://www.ccsm.ucar.edu/models/ccsm3.0/
> http://www.jamstec.go.jp/esc/index.en.html
> http://www.metoffice.gov.uk/climatechange/
> http://www.gfdl.noaa.gov/fms
> http://www.nemo-ocean.eu/
>
> Earthquakes, computational seismology, and solid Earth dynamics:
> http://www.gps.caltech.edu/~jtromp/research/index.html
> http://www-esd.lbl.gov/GG/CCS/
>
> A couple of other areas:
>
> Computational Fluid Dynamics, Finite Element Method, etc:
> http://www.foamcfd.org/
> http://www.cimec.org.ar/twiki/bin/view/Cimec/PETScFEM
>
> Computational Chemistry, molecular dynamics, etc:
> http://www.tddft.org/programs/octopus/wiki/index.php/Main_Page
> http://classic.chem.msu.su/gran/gamess/
> http://ambermd.org/
> http://www.gromacs.org/
> http://www.charmm.org/
>
> Gus Correa
>
>
> Ankush Kaul wrote:
>
>> Thanks, everyone (especially Gus and Jeff), for the support and guidance. We are
>> almost at the verge of completing our project, which would not have been
>> possible without all of you.
>>
>> I would like to know one more thing: what are real-life applications that
>> I can use the cluster for (other than mathematical computation)? Can I use it
>> for my web server, and if yes, then how?
>>
>>
>>
>> On Fri, Apr 24, 2009 at 12:01 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>>Excellent answer.  One addendum -- we had a really nice FAQ entry
>>about this kind of stuff on the LAM/MPI web site, which I was
>>horrified to see that we had not copied to the Open MPI site.  So I
>>copied it over this morning.  :-)
>>
>>Have a look at these 3 FAQ (brand new) entries:
>>
>>
>> http://www.open-mpi.org/faq/?category=building#overwrite-pre-installed-ompi
>>   http://www.open-mpi.org/faq/?category=building#where-to-install
>>
>> http://www.open-mpi.org/faq/?category=running#do-i-need-a-common-filesystem
>>
>>Hope that helps.
>>
>>
>>
>>
>>On Apr 23, 2009, at 10:34 AM, Gus Correa wrote:
>>
>>Hi Ankush
>>
>>Jeff already sent clarifications about image processing,
>>and the portable API nature of OpenMPI (and other MPI
>>implementations).
>>
>>As for "mpicc: command not found" this is again a problem with your
>>PATH.
>>Remember the "locate" command?  :)
>>Find where mpicc is installed, and put that directory on your PATH.
>>
>>In any case, I would suggest that you choose a central NFS mounted
>>file system on your cluster master node, and install OpenMPI there,
>>configuring and building it from source (not from yum).
>>If this directory is mounted on all nodes, the same OpenMPI will be
>>available on all nodes.
>>This will give you a single standard version of OpenMPI across
>>the board.
>>
>>Clustering can become a very confusing and tricky business if you
>>have heterogeneous nodes, with different OS/Linux versions,
>>different MPI versions etc, software installed in different
>>locations
>>on each node, etc., regardless of whether you use mpi-selector or
>>you set the PATH variable on each node, or you use environment
>>modules
>>package, or any other technique to setup your environment.
>>Installing less software, rather than more software,
>>and doing so in a standardized homogeneous way across all
>>cluster nodes,
>>will give you a cleaner environment, which is easier to understand,
>>control, upgrade, and update.
>>
>>A relatively simple way to install a homogeneous cluster is
>>to use the Rocks Clusters "rolls" suite,
>>which is free and based on CentOS.
>>It will 

Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Ankush Kaul
I have gone through that course, but I am still not at a stage where I can
develop an MPI program, so I was looking for some image-processing programs on the net.

I will try the imageproc.c program
(http://lam-mpi.lzu.edu.cn/tutorials/nd/part1/imageproc.c), which I found at
http://lam-mpi.lzu.edu.cn/tutorials/nd/part1/

I hope it runs on Open MPI.
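
A quick way to check is simply to build and launch it with the Open MPI wrappers; a minimal sketch, assuming mpicc and mpirun are on the PATH and a hostfile is already set up (the LAM-era example should compile against any MPI implementation, though it may expect its own input files):

  # Compile the tutorial example with the Open MPI wrapper compiler
  mpicc imageproc.c -o imageproc
  # Launch it across the cluster; -np sets the number of processes
  mpirun -np 4 ./imageproc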



On Thu, Apr 23, 2009 at 5:07 PM, Jeff Squyres <jsquy...@cisco.com> wrote:

> Yes, they will run.  Note that these are toy image processing examples;
> they are no substitute for a real image processing application.
>
> You might want to look at a full MPI tutorial to get an understanding of
> MPI itself:
>
>  http://ci-tutor.ncsa.uiuc.edu/login.php
>
> Register (it's free), login, and look for the Introduction to MPI tutorial.
>  It's quite good.
>
>
>
>
> On Apr 23, 2009, at 6:59 AM, Ankush Kaul wrote:
>
>  I found some programs at this link:
>> http://lam-mpi.lzu.edu.cn/tutorials/nd/part1/
>>
>> Will these programs run on my Open MPI cluster?
>>
>> Actually, I want to run an image-processing program on my cluster. As I
>> cannot write the entire code of the program myself, can anyone tell me where I
>> can get image-processing programs?
>>
>> I know this is the wrong place to ask, but I thought I would give it a try, as I
>> cannot find anything on the net.
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Ankush Kaul
I found some programs at this link:
http://lam-mpi.lzu.edu.cn/tutorials/nd/part1/

Will these programs run on my Open MPI cluster?

Actually, I want to run an image-processing program on my cluster. As I
cannot write the entire code of the program myself, can anyone tell me where I
can get image-processing programs?

I know this is the wrong place to ask, but I thought I would give it a try, as I
cannot find anything on the net.


Re: [OMPI users] Problem with running openMPI program

2009-04-23 Thread Ankush Kaul
@Gus, Eugene
I read all your mails and followed the same procedure; it was BLAS that
was causing the problem.

Thanks.

I am again stuck on a problem. I connected a new node to my cluster and
installed CentOS 5.2 on it. After that I used yum to install
openmpi, openmpi-libs and openmpi-devel successfully.

But still, when I run the mpicc command it gives me the error:
*bash: mpicc: command not found*

I found out there is a command *mpi-selector* but don't know how to use it.
Is this a new version of Open MPI? How do I configure it?
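
mpi-selector is not a new version of Open MPI; it is a small utility shipped with some MPI RPMs for choosing whose wrapper compilers and libraries go on the PATH. A hedged sketch of using it (the installation name must be taken from the --list output, and a fresh login shell is needed afterwards):

  # Show the MPI installations mpi-selector knows about
  mpi-selector --list
  # Make one of them the system-wide default (the name below is a placeholder)
  mpi-selector --system --set openmpi-1.2.5
  # Alternatively, find where the openmpi-devel RPM put mpicc and add that directory to PATH
  rpm -ql openmpi-devel | grep 'bin/mpicc$'
  export PATH=/directory/reported/above:$PATH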


Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Ankush Kaul
@Gus
We are not able to make HPL successfully.

I think it has something to do with BLAS.

I cannot find a BLAS tar file on the net; I found an RPM, but the installation
steps are written for a tar file.

Running *locate blas* gave us the following result:

*[root@ccomp1 hpl]# locate blas
/hpl/include/hpl_blas.h
/hpl/makes/Make.blas
/hpl/src/blas
/hpl/src/blas/HPL_daxpy.c
/hpl/src/blas/HPL_dcopy.c
/hpl/src/blas/HPL_dgemm.c
/hpl/src/blas/HPL_dgemv.c
/hpl/src/blas/HPL_dger.c
/hpl/src/blas/HPL_dscal.c
/hpl/src/blas/HPL_dswap.c
/hpl/src/blas/HPL_dtrsm.c
/hpl/src/blas/HPL_dtrsv.c
/hpl/src/blas/HPL_idamax.c
/hpl/src/blas/ccomp
/hpl/src/blas/i386
/hpl/src/blas/ccomp/Make.inc
/hpl/src/blas/ccomp/Makefile
/hpl/src/blas/i386/Make.inc
/hpl/src/blas/i386/Makefile
/usr/include/boost/numeric/ublas
/usr/include/boost/numeric/ublas/banded.hpp
/usr/include/boost/numeric/ublas/blas.hpp
/usr/include/boost/numeric/ublas/detail
/usr/include/boost/numeric/ublas/exception.hpp
/usr/include/boost/numeric/ublas/expression_types.hpp
/usr/include/boost/numeric/ublas/functional.hpp
/usr/include/boost/numeric/ublas/fwd.hpp
/usr/include/boost/numeric/ublas/hermitian.hpp
/usr/include/boost/numeric/ublas/io.hpp
/usr/include/boost/numeric/ublas/lu.hpp
/usr/include/boost/numeric/ublas/matrix.hpp
/usr/include/boost/numeric/ublas/matrix_expression.hpp
/usr/include/boost/numeric/ublas/matrix_proxy.hpp
/usr/include/boost/numeric/ublas/matrix_sparse.hpp
/usr/include/boost/numeric/ublas/operation.hpp
/usr/include/boost/numeric/ublas/operation_blocked.hpp
/usr/include/boost/numeric/ublas/operation_sparse.hpp
/usr/include/boost/numeric/ublas/storage.hpp
/usr/include/boost/numeric/ublas/storage_sparse.hpp
/usr/include/boost/numeric/ublas/symmetric.hpp
/usr/include/boost/numeric/ublas/traits.hpp
/usr/include/boost/numeric/ublas/triangular.hpp
/usr/include/boost/numeric/ublas/vector.hpp
/usr/include/boost/numeric/ublas/vector_expression.hpp
/usr/include/boost/numeric/ublas/vector_of_vector.hpp
/usr/include/boost/numeric/ublas/vector_proxy.hpp
/usr/include/boost/numeric/ublas/vector_sparse.hpp
/usr/include/boost/numeric/ublas/detail/concepts.hpp
/usr/include/boost/numeric/ublas/detail/config.hpp
/usr/include/boost/numeric/ublas/detail/definitions.hpp
/usr/include/boost/numeric/ublas/detail/documentation.hpp
/usr/include/boost/numeric/ublas/detail/duff.hpp
/usr/include/boost/numeric/ublas/detail/iterator.hpp
/usr/include/boost/numeric/ublas/detail/matrix_assign.hpp
/usr/include/boost/numeric/ublas/detail/raw.hpp
/usr/include/boost/numeric/ublas/detail/returntype_deduction.hpp
/usr/include/boost/numeric/ublas/detail/temporary.hpp
/usr/include/boost/numeric/ublas/detail/vector_assign.hpp
/usr/lib/libblas.so.3
/usr/lib/libblas.so.3.1
/usr/lib/libblas.so.3.1.1
/usr/lib/openoffice.org/basis3.0/share/gallery/htmlexpo/cublast.gif
/usr/lib/openoffice.org/basis3.0/share/gallery/htmlexpo/cublast_.gif
/usr/share/backgrounds/images/tiny_blast_of_red.jpg
/usr/share/doc/blas-3.1.1
/usr/share/doc/blas-3.1.1/blasqr.ps
/usr/share/man/manl/intro_blas1.l.gz*

When we try to make using the following command:
*# make arch=ccomp*

it gives this error:
*Makefile:47: Make.inc: No such file or directory
make[2]: *** No rule to make target `Make.inc'.  Stop.
make[2]: Leaving directory `/hpl/src/auxil/ccomp'
make[1]: *** [build_src] Error 2
make[1]: Leaving directory `/hpl'
make: *** [build] Error 2*

The *ccomp* folder is created, but the *xhpl* binary is not.
Is it some problem with the config file?
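
The "Make.inc: No such file or directory" message usually means the per-directory Make.inc links were never created properly, which happens when the top-level Make.ccomp is missing or its TOPdir entry does not point at the real top-level directory. A hedged sketch of the usual procedure, with placeholder paths that have to be adapted (the locate output above suggests the reference BLAS shared library is already in /usr/lib):

  cd /hpl
  # Start from one of the templates shipped with HPL instead of a hand-written file
  cp setup/Make.Linux_PII_FBLAS Make.ccomp
  # Edit Make.ccomp and set at least these variables (values below are placeholders):
  #   ARCH   = ccomp
  #   TOPdir = /hpl
  #   MPdir  = /usr/lib/openmpi          (wherever Open MPI is installed)
  #   LAlib  = /usr/lib/libblas.so.3     (or a tuned BLAS such as GotoBLAS)
  make arch=ccomp
  # On success the benchmark binary should appear as bin/ccomp/xhpl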




On Wed, Apr 22, 2009 at 11:40 AM, Ankush Kaul <ankush.rk...@gmail.com> wrote:

> I feel the above problem occurred due to installing the MPICH package; now even
> normal MPI programs are not running.
> What should we do? We even tried *yum remove mpich*, but it says there are no
> packages to remove.
> Please help!
>
>   On Wed, Apr 22, 2009 at 11:34 AM, Ankush Kaul <ankush.rk...@gmail.com> wrote:
>
>> We are facing another problem. We were trying to install different
>> benchmarking packages,
>>
>> and now whenever we try to run the *mpirun* command (which was working perfectly
>> before) we get this error:
>> *usr/local/bin/mpdroot: open failed for root's mpd conf file
>> mpdtrace (__init__ 1190): forked process failed; status=255*
>>
>> What is the problem here?
>>
>>
>>
>> On Tue, Apr 21, 2009 at 11:45 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>
>>> Hi Ankush
>>>
>>> Ankush Kaul wrote:
>>>
>>>> @Eugene
>>>> They are OK, but we wanted something better, which would more clearly
>>>> show the difference between using a single PC and the cluster.
>>>>
>>>> @Prakash
>>>> I had problems running the programs, as they were compiled using mpcc and
>>>> not mpicc.
>>>>
>>>> @Gus
>>>> We are trying to figure out the HPL config; it's quite complicated.
>>>>
>>>

Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Ankush Kaul
I feel the above problem occurred due to installing the MPICH package; now even
normal MPI programs are not running.
What should we do? We even tried *yum remove mpich*, but it says there are no packages
to remove.
Please help!
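
The mpdroot/mpdtrace messages come from MPICH2's mpd process manager, not from Open MPI, so the mpirun being found first on the PATH is probably the MPICH one; and since the error mentions /usr/local/bin, it may have been installed from source, which would explain why yum finds no package to remove. A hedged way to check (package names are guesses, e.g. it may be mpich2 rather than mpich):

  # See which mpirun is being picked up and, if any, which package owns it
  which mpirun
  rpm -qf "$(which mpirun)"
  # List any MPICH variants that really are installed as packages
  rpm -qa | grep -i mpich
  # Remove whatever the previous command reported, for example:
  yum remove mpich2
  # If it was installed from source under /usr/local, remove those files by hand
  # (or at least take /usr/local/bin off the front of the PATH) so that
  # Open MPI's mpirun is found first again.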

On Wed, Apr 22, 2009 at 11:34 AM, Ankush Kaul <ankush.rk...@gmail.com> wrote:

> We are facing another problem. We were trying to install different
> benchmarking packages,
>
> and now whenever we try to run the *mpirun* command (which was working perfectly
> before) we get this error:
> *usr/local/bin/mpdroot: open failed for root's mpd conf file
> mpdtrace (__init__ 1190): forked process failed; status=255*
>
> What is the problem here?
>
>
>
>> On Tue, Apr 21, 2009 at 11:45 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>
>> Hi Ankush
>>
>> Ankush Kaul wrote:
>>
>>> @Eugene
>>> They are OK, but we wanted something better, which would more clearly show
>>> the difference between using a single PC and the cluster.
>>>
>>> @Prakash
>>> I had problems running the programs, as they were compiled using mpcc and
>>> not mpicc.
>>>
>>> @Gus
>>> We are trying to figure out the HPL config; it's quite complicated.
>>>
>>
>> I sent you some sketchy instructions to build HPL,
>> on my last message to this thread.
>> I built HPL and run it here yesterday that way.
>> Did you try my suggestions?
>> Where did you get stuck?
>>
>> Also, the locate command lists lots of confusing results.
>>>
>>>
>> I would say the list is just long, not really confusing.
>> You can  find what you need if you want.
>> Pipe the output of locate through "more", and search carefully.
>> If you are talking about BLAS try "locate libblas.a" and
>> "locate libgoto.a".
>> Those are the libraries you need, and if they are not there
>> you need to install one of them.
>> Read my previous email for details.
>> I hope it will help you get HPL working, if you are interested on HPL.
>>
>> I hope this helps.
>>
>> Gus Correa
>> -
>> Gustavo Correa
>> Lamont-Doherty Earth Observatory - Columbia University
>> Palisades, NY, 10964-8000 - USA
>> -
>>
>>  @Jeff
>>> I think you are correct; we may have installed Open MPI without VT support,
>>> but is there anything we can do now?
>>>
>>> One more thing: I found this program but don't know how to run it:
>>> http://www.cis.udel.edu/~pollock/367/manual/node35.html
>>>
>>> Thanks to all of you for putting in so much effort to help us out.
>>>
>>>
>>> 
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>


Re: [OMPI users] Problem with running openMPI program

2009-04-22 Thread Ankush Kaul
We are facing another problem. We were trying to install different
benchmarking packages,

and now whenever we try to run the *mpirun* command (which was working perfectly
before) we get this error:
*usr/local/bin/mpdroot: open failed for root's mpd conf file
mpdtrace (__init__ 1190): forked process failed; status=255*

What is the problem here?



On Tue, Apr 21, 2009 at 11:45 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Hi Ankush
>
> Ankush Kaul wrote:
>
>> @Eugene
>> They are OK, but we wanted something better, which would more clearly show
>> the difference between using a single PC and the cluster.
>>
>> @Prakash
>> I had problems running the programs, as they were compiled using mpcc and
>> not mpicc.
>>
>> @Gus
>> We are trying to figure out the HPL config; it's quite complicated.
>>
>
> I sent you some sketchy instructions to build HPL,
> on my last message to this thread.
> I built HPL and run it here yesterday that way.
> Did you try my suggestions?
> Where did you get stuck?
>
> Also, the locate command lists lots of confusing results.
>>
>>
> I would say the list is just long, not really confusing.
> You can  find what you need if you want.
> Pipe the output of locate through "more", and search carefully.
> If you are talking about BLAS try "locate libblas.a" and
> "locate libgoto.a".
> Those are the libraries you need, and if they are not there
> you need to install one of them.
> Read my previous email for details.
> I hope it will help you get HPL working, if you are interested on HPL.
>
> I hope this helps.
>
> Gus Correa
> -
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> -
>
>  @Jeff
>> I think you are correct; we may have installed Open MPI without VT support,
>> but is there anything we can do now?
>>
>> One more thing: I found this program but don't know how to run it:
>> http://www.cis.udel.edu/~pollock/367/manual/node35.html
>>
>> Thanks to all of you for putting in so much effort to help us out.
>>
>>
>> 
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Problem with running openMPI program

2009-04-21 Thread Ankush Kaul
@Eugene
They are OK, but we wanted something better, which would more clearly show the
difference between using a single PC and the cluster.

@Prakash
I had problems running the programs, as they were compiled using mpcc and not
mpicc.

@Gus
We are trying to figure out the HPL config; it's quite complicated. Also, the
locate command lists lots of confusing results.

@Jeff
I think you are correct; we may have installed Open MPI without VT support, but
is there anything we can do now?
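
Whether the installed Open MPI was built with VampirTrace support can be checked directly; a hedged sketch, assuming a version recent enough to bundle VT (1.3 or later):

  # The VT-enabled wrapper sits next to the ordinary one when support was built in
  ls "$(dirname "$(which mpicc)")" | grep -i vt
  # If mpicc-vt is listed, compile with it instead of mpicc; if nothing is listed,
  # Open MPI would need to be rebuilt, or VampirTrace installed separately.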

One more thing: I found this program but don't know how to run it:
http://www.cis.udel.edu/~pollock/367/manual/node35.html

Thanks to all of you for putting in so much effort to help us out.


Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Ankush Kaul
Let me describe what I want to do.

I had taken Linux clustering as my final-year engineering project, as I am
really interested in networking.

To tell the truth, our college does not have any professor with knowledge of
clustering.

The aim of our project was just to make a cluster, which we did. Now we have
to show and explain our project to the professors, so I want something to
show them how the cluster works... some program or benchmarking software.

Hope you got the problem.
And thanks again, we really appreciate your patience.


Re: [OMPI users] Problem with running openMPI program

2009-04-20 Thread Ankush Kaul
Thanks a lot, I am implementing the passwordless cluster.

I am also trying different benchmarking software and have got fed up with all the
problems in all the packages I try. I will list a few:

*1) VampirTrace*

I extracted the tar in /vt, then followed these steps:

*$ ./configure --prefix=/vti*
 [...lots of output...]
*$ make all install*

After this, the FAQ on open-mpi.org asks to '*Simply replace the compiler
wrappers to activate vampir trace*', but it does not tell how I replace the
compiler wrappers.

I try to run *mpicc-vt -c hello.c -o hello*

but it gives an error:
*bash: mpicc-vt: command not found*
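
Since this VampirTrace build was configured with --prefix=/vti, its tools end up under that prefix rather than on the default PATH. Note that mpicc-vt is the name of the wrapper bundled with Open MPI itself, so it only exists when Open MPI was built with VT support; a standalone VampirTrace install typically provides its own wrappers (commonly named vtcc, vtcxx, vtf90). A hedged check:

  # See what the VampirTrace build actually installed
  ls /vti/bin
  # Put those wrappers ahead of the plain compilers on the PATH
  export PATH=/vti/bin:$PATH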


*2) HPL*

For this I didn't understand the installation steps.

I extracted the tar in /hpl.

Then it asks to '*create a file Make.<arch> in the top-level directory*'; I
created a file Make.i386.
Then it says '*This file essentially contains the compilers
and libraries with their paths to be used*'. How do I put that in?

After that it asks to run the command *make arch=i386*,
but it gives this error:
make[3]: Entering directory `/hpl'
make -f Make.top startup_dir arch=i386
make[4]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
include/i386
make[4]: include/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
lib
make[4]: lib: Command not found
make[4]: [startup_dir] Error 127 (ignored)
lib/i386
make[4]: lib/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
bin
make[4]: bin: Command not found
make[4]: [startup_dir] Error 127 (ignored)
bin/i386
make[4]: bin/i386: Command not found
make[4]: [startup_dir] Error 127 (ignored)
make[4]: Leaving directory `/hpl'
make -f Make.top startup_src arch=i386
make[4]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
make -f Make.top leaf le=src/auxil   arch=i386
make[5]: Entering directory `/hpl'
Make.top:161: warning: overriding commands for target `clean_arch_all'
Make.i386:84: warning: ignoring old commands for target `clean_arch_all'
(  src/auxil ;  i386 )
/bin/sh: src/auxil: is a directory

Then it returns to the shell prompt.

Please help. Is there simpler benchmarking software?
I don't want to give up at this point :(


Re: [OMPI users] Problem with running openMPI program

2009-04-19 Thread Ankush Kaul
Also, how can I find out where my MPI libraries and include directories are?
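
Open MPI's wrapper compilers can report exactly which include and library directories they use; a short sketch:

  # Print the full underlying compile/link command, including the -I and -L paths
  mpicc --showme
  # Or just the compile flags, or just the link flags
  mpicc --showme:compile
  mpicc --showme:link
  # ompi_info also prints the installation prefix near the top of its output
  ompi_info | head -20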

On Sat, Apr 18, 2009 at 2:29 PM, Ankush Kaul <ankush.rk...@gmail.com> wrote:

> Let me explain in detail.
>
> When we had only 2 nodes, 1 master (192.168.67.18) + 1 compute node
> (192.168.45.65),
> my openmpi-default-hostfile looked like:
> *192.168.67.18 slots=2
> 192.168.45.65 slots=2*
>
> After this, on running the command *mpirun /work/Pi* on the master node, we got:
> *# root@192.168.45.65's password:*
>
> After entering the password, the program ran on both nodes.
>
> Now, after connecting a second compute node and editing the hostfile to:
>
> *192.168.67.18 slots=2
> 192.168.45.65 slots=2
> 192.168.67.241 slots=2*
>
> and then running the command *mpirun /work/Pi* on the master node, we got:
>
> # root@192.168.45.65's password: root@192.168.67.241's password:
>
> which does not accept the password.
>
> Although we are trying to implement the passwordless cluster, I would like to
> know why this problem is occurring.
>
>
> On Sat, Apr 18, 2009 at 3:40 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>
>> Ankush
>>
>> You need to setup passwordless connections with ssh to the node you just
>> added.  You (or somebody else) probably did this already on the first
>> compute node, otherwise the MPI programs wouldn't run
>> across the network.
>>
>> See the very last sentence on this FAQ:
>>
>> http://www.open-mpi.org/faq/?category=running#run-prereqs
>>
>> And try this recipe (if you use RSA keys instead of DSA, replace all "dsa"
>> by "rsa"):
>>
>>
>> http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3
>>
>> I hope this helps.
>>
>> Gus Correa
>> -----
>> Gustavo Correa
>> Lamont-Doherty Earth Observatory - Columbia University
>> Palisades, NY, 10964-8000 - USA
>> -
>>
>>
>> Ankush Kaul wrote:
>>
>>> Thank you, I am reading up on the tools you suggested.
>>>
>>> I am facing another problem. My cluster is working fine with 2 hosts (1
>>> master + 1 compute node), but when I tried to add another node (1 master + 2
>>> compute nodes) it is not working. It works fine when I give the command mpirun
>>> -host  /work/Pi
>>>
>>> but when I try to run
>>> mpirun  /work/Pi it gives the following error:
>>>
>>> root@192.168.45.65's password:
>>> root@192.168.67.241's password:
>>>
>>> Permission denied, please try again.
>>>
>>> root@192.168.45.65's password:
>>>
>>> Permission denied, please try again.
>>>
>>> root@192.168.45.65's password:
>>>
>>> Permission denied (publickey,gssapi-with-mic,password).
>>>
>>>
>>> Permission denied, please try again.
>>>
>>> root@192.168.67.241's password:
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> base/pls_base_orted_cmds.c at line 275
>>>
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> pls_rsh_module.c at line 1166
>>>
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> errmgr_hnp.c at line 90
>>>
>>> [ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to
>>> start as expected.
>>>
>>> [ccomp1.cluster:03503] ERROR: There may be more information available
>>> from
>>>
>>> [ccomp1.cluster:03503] ERROR: the remote shell (see above).
>>>
>>> [ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status
>>> 255.
>>>
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> base/pls_base_orted_cmds.c at line 188
>>>
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> pls_rsh_module.c at line 1198
>>>
>>>
>>> What is the problem here?
>>>
>>>
>>> --
>>>
>>> mpirun was unable to cleanly terminate the daemons for this job. Returned
>>> value Timeout instead of ORTE_SUCCESS
>>>
>>>
>>> On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh <eugene@sun.com> wrote:

Re: [OMPI users] Problem with running openMPI program

2009-04-18 Thread Ankush Kaul
Let me explain in detail.

When we had only 2 nodes, 1 master (192.168.67.18) + 1 compute node
(192.168.45.65),
my openmpi-default-hostfile looked like:
*192.168.67.18 slots=2
192.168.45.65 slots=2*

After this, on running the command *mpirun /work/Pi* on the master node, we got:
*# root@192.168.45.65's password:*

After entering the password, the program ran on both nodes.

Now, after connecting a second compute node and editing the hostfile to:

*192.168.67.18 slots=2
192.168.45.65 slots=2
192.168.67.241 slots=2*

and then running the command *mpirun /work/Pi* on the master node, we got:

# root@192.168.45.65's password: root@192.168.67.241's password:

which does not accept the password.

Although we are trying to implement the passwordless cluster, I would like to
know why this problem is occurring.
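
The cure is the key-based ssh login Gus describes below; a minimal sketch, assuming DSA keys, the root account shown in the password prompts, and that ssh-copy-id is available (otherwise the public key can be appended to ~/.ssh/authorized_keys on each node by hand):

  # On the master node: generate a key pair once, accepting an empty passphrase
  ssh-keygen -t dsa
  # Copy the public key to every compute node (repeat for each node address)
  ssh-copy-id root@192.168.45.65
  ssh-copy-id root@192.168.67.241
  # Test: these should now log in without asking for a password
  ssh root@192.168.45.65 hostname
  ssh root@192.168.67.241 hostname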


On Sat, Apr 18, 2009 at 3:40 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Ankush
>
> You need to setup passwordless connections with ssh to the node you just
> added.  You (or somebody else) probably did this already on the first
> compute node, otherwise the MPI programs wouldn't run
> across the network.
>
> See the very last sentence on this FAQ:
>
> http://www.open-mpi.org/faq/?category=running#run-prereqs
>
> And try this recipe (if you use RSA keys instead of DSA, replace all "dsa"
> by "rsa"):
>
>
> http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3
>
> I hope this helps.
>
> Gus Correa
> -
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> -
>
>
> Ankush Kaul wrote:
>
>> Thank you, I am reading up on the tools you suggested.
>>
>> I am facing another problem. My cluster is working fine with 2 hosts (1
>> master + 1 compute node), but when I tried to add another node (1 master + 2
>> compute nodes) it is not working. It works fine when I give the command mpirun
>> -host  /work/Pi
>>
>> but when I try to run
>> mpirun  /work/Pi it gives the following error:
>>
>> root@192.168.45.65's password:
>> root@192.168.67.241's password:
>>
>> Permission denied, please try again.
>>
>> root@192.168.45.65's password:
>>
>> Permission denied, please try again.
>>
>> root@192.168.45.65's password:
>>
>> Permission denied (publickey,gssapi-with-mic,password).
>>
>>
>> Permission denied, please try again.
>>
>> root@192.168.67.241's password:
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> base/pls_base_orted_cmds.c at line 275
>>
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> pls_rsh_module.c at line 1166
>>
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> errmgr_hnp.c at line 90
>>
>> [ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to
>> start as expected.
>>
>> [ccomp1.cluster:03503] ERROR: There may be more information available from
>>
>> [ccomp1.cluster:03503] ERROR: the remote shell (see above).
>>
>> [ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status
>> 255.
>>
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> base/pls_base_orted_cmds.c at line 188
>>
>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> pls_rsh_module.c at line 1198
>>
>>
>> What is the problem here?
>>
>> --
>>
>> mpirun was unable to cleanly terminate the daemons for this job. Returned
>> value Timeout instead of ORTE_SUCCESS
>>
>>
>> On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh <eugene@sun.com> wrote:
>>
>>Ankush Kaul wrote:
>>
>>Finally, after mentioning the hostfiles the cluster is working
>>fine. We downloaded a few benchmarking packages, but I would like
>>to know if there is any GUI-based benchmarking software, so that
>>it is easier to demonstrate the working of our cluster while
>>presenting it.
>>
>>
>>I'm confused what you're looking for here, but thought I'd venture a
>>suggestion.
>>
>>There are GUI-based performance analysis and tracing tools.  E.g.,
>>run a pro

Re: [OMPI users] Problem with running openMPI program

2009-04-17 Thread Ankush Kaul
Thank you, I am reading up on the tools you suggested.
I am facing another problem. My cluster is working fine with 2 hosts (1
master + 1 compute node), but when I tried to add another node (1 master + 2
compute nodes) it is not working. It works fine when I give the command
mpirun -host  /work/Pi

but when I try to run
mpirun  /work/Pi it gives the following error:

root@192.168.45.65's password: root@192.168.67.241's password:

Permission denied, please try again. 

root@192.168.45.65's password:

Permission denied, please try again.

root@192.168.45.65's password:

Permission denied (publickey,gssapi-with-mic,password).



Permission denied, please try again.

root@192.168.67.241's password: [ccomp1.cluster:03503] [0,0,0]
ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1166

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c
at line 90

[ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to start
as expected.

[ccomp1.cluster:03503] ERROR: There may be more information available from

[ccomp1.cluster:03503] ERROR: the remote shell (see above).

[ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status
255.

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
base/pls_base_orted_cmds.c at line 188

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1198


What is the problem here?

--

mpirun was unable to cleanly terminate the daemons for this job. Returned
value Timeout instead of ORTE_SUCCESS

On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh <eugene@sun.com> wrote:

> Ankush Kaul wrote:
>
>  Finally, after mentioning the hostfiles the cluster is working fine. We
>> downloaded a few benchmarking packages, but I would like to know if there is
>> any GUI-based benchmarking software, so that it is easier to demonstrate the
>> working of our cluster while presenting it.
>>
>
> I'm confused what you're looking for here, but thought I'd venture a
> suggestion.
>
> There are GUI-based performance analysis and tracing tools.  E.g., run a
> program, [[semi-]automatically] collect performance data, run a GUI-based
> analysis tool on the data, visualize what happened on your cluster.  Would
> this suit your purposes?
>
> If so, there are a variety of tools out there you could try.  Some are
> platform-specific or cost money.  Some are widely/freely available.
>  Examples of these tools include Intel Trace Analyzer, Jumpshot, Vampir,
> TAU, etc.  I do know that Sun Studio (Performance Analyzer) is available via
> free download on x86 and SPARC and Linux and Solaris and works with OMPI.
>  Possibly the same with Jumpshot.  VampirTrace instrumentation is already in
> OMPI, but then you need to figure out the analysis-tool part.  (I think the
> Vampir GUI tool requires a license, but I'm not sure.  Maybe you can convert
> to TAU, which is probably available for free download.)
>
> Anyhow, I don't even know if that sort of thing fits your requirements.
>  Just an idea.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Problem with running openMPI program

2009-04-14 Thread Ankush Kaul
Finally, after specifying the hostfiles, the cluster is working fine. We
downloaded a few benchmarking packages, but I would like to know if there is
any GUI-based benchmarking software, so that it is easier to demonstrate the
working of our cluster while presenting it.
Regards
Ankush


Re: [OMPI users] Problem with running openMPI program

2009-04-11 Thread Ankush Kaul
I am able to run the program on the server node, but on the compute node the
program only runs in the directory on which /work is mounted (/work
on the server contains the Pi program).

Also, while running Pi, top shows the process running only on the server, not on
the compute node.
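
If top only ever shows the processes on the server, mpirun is most likely launching all ranks locally; a hedged sketch of spreading them out and checking, reusing the addresses from this thread (the hostfile path below is a placeholder, the default openmpi-default-hostfile under the installation's etc/ directory works as well):

  # Ask for more ranks than one node has slots for, and point mpirun at a hostfile
  mpirun -np 4 --hostfile /work/hostfile /work/Pi
  # In another terminal, confirm processes appear on the compute node too
  ssh 192.168.45.65 'top -b -n 1 | grep Pi'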

On Sat, Apr 11, 2009 at 1:34 PM, Ankush Kaul <ankush.rk...@gmail.com> wrote:

> Can you please suggest simple benchmarking software? Are there any GUI-based
> benchmarking tools available?
>
>
> On Tue, Apr 7, 2009 at 2:29 PM, Ankush Kaul <ankush.rk...@gmail.com> wrote:
>
>> Thank you sir, thanks a lot.
>>
>> The information you provided helped us a lot. I am currently going through
>> the Open MPI FAQ and will contact you in case of any doubts.
>>
>> Regards,
>> Ankush Kaul
>>
>
>


Re: [OMPI users] Problem with running openMPI program

2009-04-11 Thread Ankush Kaul
Can you please suggest simple benchmarking software? Are there any GUI-based
benchmarking tools available?

On Tue, Apr 7, 2009 at 2:29 PM, Ankush Kaul <ankush.rk...@gmail.com> wrote:

> Thank you sir, thanks a lot.
>
> The information you provided helped us a lot. I am currently going through
> the Open MPI FAQ and will contact you in case of any doubts.
>
> Regards,
> Ankush Kaul
>


Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Ankush Kaul
I am not able to check if the NFS export/mount of /tmp is working.
When I give the command *ssh 192.168.45.65 192.168.67.18* I get the error:
bash: 192.168.67.18: command not found
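
The second argument to ssh is executed as a command on the remote host rather than being treated as another address, which is why bash complains; the check Gus suggests below would look more like this:

  # Run a few small commands on the compute node to see its hostname and what is mounted
  ssh 192.168.45.65 'hostname; ls /tmp; ls /mnt/nfs'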

Let me explain what I understood using an example.

First, I make a '/work' directory on my master node.

Then I mount this directory on a folder, say '/work/mnt', on the
slave node.

Is this correct?
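
That is the general idea; a hedged sketch of the corresponding commands, using the master and compute addresses from this thread and following Gus's suggestion below to mount the export under the same /work name on both machines:

  # On the master (192.168.67.18): export /work to the compute node and reload the exports
  mkdir -p /work
  echo '/work 192.168.45.65(rw,sync,no_root_squash)' >> /etc/exports
  exportfs -ra
  # On the compute node (192.168.45.65): mount the export under the same name
  mkdir -p /work
  mount -t nfs 192.168.67.18:/work /work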

Also, how and where (is it on the master node?) do I give the list of hosts?
And by hosts you mean the compute nodes?
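
The list of hosts is given on the master node, either directly on mpirun's command line (-host) or in a hostfile passed with --hostfile, and the entries are indeed the compute nodes (the master can be listed too if it should also run processes). A hedged sketch using the addresses that appear elsewhere in this thread:

  # A plain-text hostfile: one machine per line, slots = how many processes it may run
  cat > /work/hostfile <<'EOF'
  192.168.67.18 slots=2
  192.168.45.65 slots=2
  EOF
  # Point mpirun at it when launching the program
  mpirun -np 4 --hostfile /work/hostfile /work/Pi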

Please bear with me, as this is the first time I am doing a project on Linux
clustering.

On Mon, Apr 6, 2009 at 9:27 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Hi Ankush
>
> If I remember right,
> mpirun will put you on your home directory, not on /tmp,
> when it starts your ssh session.
> To run on /tmp (or on /mnt/nfs)
> you may need to use "-path" option.
>
> Likewise, you may want to give mpirun a list of hosts (-host option)
> or a hostfile (-hostfile option), to specify where you want the
> program to run.
>
> Do
> "/full/path/to/openmpi/mpriun -help"
> for details.
>
> Make sure your NFS export/mount of /tmp is working,
> say, by doing:
>
> ssh slave_node 'hostname; ls /tmp; ls /mnt/nfs'
>
> or similar, and see if your  program "pi" is really there (and where).
>
> Actually, it may be confusing to export /tmp, as it is part
> of the basic Linux directory tree,
> which is the reason why you mounted it on /mnt/nfs.
> You may want to choose to export/mount
> a directory that is not so generic as /tmp,
> so that you can use a consistent name on both computers.
> For instance, you can create a /my_export or /work directory
> (or whatever name you prefer) on the master node,
> export it to the slave node, mount it on the slave node
> with the same name/mountpoint, and use it for your MPI work.
>
> I hope this helps.
> Gus Correa
> -
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> -
>
> Ankush Kaul wrote:
>
>> Thank you sir,
>> one more thing I am confused about: suppose I have to run a 'pi' program
>> using Open MPI, where do I place the program?
>>
>> Currently I have placed it in the /tmp folder on the master node. This /tmp
>> folder is mounted on /mnt/nfs of the compute node.
>>
>> I run the program from the /tmp folder on the master node; is this correct?
>>
>> I am a newbie and really need some help. Thanks in advance.
>>
>> On Mon, Apr 6, 2009 at 8:43 PM, John Hearns <hear...@googlemail.com> wrote:
>>
>>2009/4/6 Ankush Kaul <ankush.rk...@gmail.com>:
>> >> Also, how do I come to know that the program is using the resources
>> >> of both the nodes?
>>
>>Log into the second node before you start the program.
>>Run 'top'
>>Seriously - top is a very, very useful utility.
>>___
>>users mailing list
>>us...@open-mpi.org
>>http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> 
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Ankush Kaul
Thank you sir,
one more thing I am confused about: suppose I have to run a 'pi' program
using Open MPI, where do I place the program?

Currently I have placed it in the /tmp folder on the master node. This /tmp
folder is mounted on /mnt/nfs of the compute node.

I run the program from the /tmp folder on the master node; is this correct?

I am a newbie and really need some help. Thanks in advance.

On Mon, Apr 6, 2009 at 8:43 PM, John Hearns <hear...@googlemail.com> wrote:

> 2009/4/6 Ankush Kaul <ankush.rk...@gmail.com>:
> >> Also, how do I come to know that the program is using the resources of both
> >> the nodes?
>
> Log into the second node before you start the program.
> Run 'top'
> Seriously - top is a very, very useful utility.
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>