Also, how can I find out where my MPI libraries and include directories are located?
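For example, is the right way to ask the Open MPI wrapper compilers? As far as I understand they can print the flags they would pass to the underlying compiler, but I am not sure this is the recommended method (assuming the wrappers and ompi_info are on my PATH):

    # print the preprocessor/include flags (-I...) the wrapper would use
    mpicc --showme:compile

    # print the linker flags and library directories (-L... -l...)
    mpicc --showme:link

    # show the installation prefix Open MPI was configured with
    ompi_info | grep -i prefix

(I have also written down, at the very bottom of this mail below the quoted thread, the passwordless-ssh steps I plan to try on the new node, in case someone can confirm they are right.)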
On Sat, Apr 18, 2009 at 2:29 PM, Ankush Kaul <ankush.rk...@gmail.com> wrote:
> Let me explain in detail.
>
> When we had only 2 nodes, 1 master (192.168.67.18) + 1 compute node
> (192.168.45.65), my openmpi-default-hostfile looked like
>
>     192.168.67.18 slots=2
>     192.168.45.65 slots=2
>
> After this, on running the command *mpirun /work/Pi* on the master node
> we got
>
>     root@192.168.45.65's password:
>
> and after entering the password the program ran on both nodes.
>
> Now, after connecting a second compute node and editing the hostfile to
>
>     192.168.67.18 slots=2
>     192.168.45.65 slots=2
>     192.168.67.241 slots=2
>
> and then running the command *mpirun /work/Pi* on the master node, we got
>
>     root@192.168.45.65's password: root@192.168.67.241's password:
>
> which does not accept the password.
>
> We are trying to set up a passwordless cluster, so I would like to know
> why this problem is occurring.
>
>
> On Sat, Apr 18, 2009 at 3:40 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>
>> Ankush
>>
>> You need to set up passwordless connections with ssh to the node you just
>> added. You (or somebody else) probably did this already on the first
>> compute node, otherwise the MPI programs wouldn't run
>> across the network.
>>
>> See the very last sentence on this FAQ:
>>
>> http://www.open-mpi.org/faq/?category=running#run-prereqs
>>
>> And try this recipe (if you use RSA keys instead of DSA, replace all "dsa"
>> by "rsa"):
>>
>> http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3
>>
>> I hope this helps.
>>
>> Gus Correa
>> ---------------------------------------------------------------------
>> Gustavo Correa
>> Lamont-Doherty Earth Observatory - Columbia University
>> Palisades, NY, 10964-8000 - USA
>> ---------------------------------------------------------------------
>>
>>
>> Ankush Kaul wrote:
>>
>>> Thank you, I am reading up on the tools you suggested.
>>>
>>> I am facing another problem: my cluster works fine with 2 hosts (1 master
>>> + 1 compute node), but when I tried to add another node (1 master + 2
>>> compute nodes) it stopped working. It works fine when I give the command
>>>
>>>     mpirun -host <hostname> /work/Pi
>>>
>>> but when I try to run
>>>
>>>     mpirun /work/Pi
>>>
>>> it gives the following error:
>>>
>>> root@192.168.45.65's password: root@192.168.67.241's password:
>>>
>>> Permission denied, please try again. <The password I provide is correct>
>>>
>>> root@192.168.45.65's password:
>>>
>>> Permission denied, please try again.
>>>
>>> root@192.168.45.65's password:
>>>
>>> Permission denied (publickey,gssapi-with-mic,password).
>>>
>>> Permission denied, please try again.
>>>
>>> root@192.168.67.241's password:
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> base/pls_base_orted_cmds.c at line 275
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> pls_rsh_module.c at line 1166
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> errmgr_hnp.c at line 90
>>> [ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to
>>> start as expected.
>>> [ccomp1.cluster:03503] ERROR: There may be more information available from
>>> [ccomp1.cluster:03503] ERROR: the remote shell (see above).
>>> [ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status 255.
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> base/pls_base_orted_cmds.c at line 188
>>> [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>> pls_rsh_module.c at line 1198
>>>
>>> --------------------------------------------------------------------------
>>> mpirun was unable to cleanly terminate the daemons for this job. Returned
>>> value Timeout instead of ORTE_SUCCESS
>>>
>>> What is the problem here?
>>>
>>>
>>> On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh <eugene....@sun.com> wrote:
>>>
>>>     Ankush Kaul wrote:
>>>
>>>         Finally, after mentioning the hostfiles, the cluster is working
>>>         fine. We downloaded a few benchmarking programs, but I would
>>>         like to know if there is any GUI-based benchmarking software,
>>>         so that it is easier to demonstrate the working of our cluster
>>>         while we display it.
>>>
>>>     I'm confused about what you're looking for here, but thought I'd
>>>     venture a suggestion.
>>>
>>>     There are GUI-based performance analysis and tracing tools. E.g.,
>>>     run a program, [semi-]automatically collect performance data, run
>>>     a GUI-based analysis tool on the data, and visualize what happened
>>>     on your cluster. Would this suit your purposes?
>>>
>>>     If so, there are a variety of tools out there you could try. Some
>>>     are platform-specific or cost money. Some are widely/freely
>>>     available. Examples of these tools include Intel Trace Analyzer,
>>>     Jumpshot, Vampir, TAU, etc. I do know that Sun Studio (Performance
>>>     Analyzer) is available via free download on x86 and SPARC, Linux
>>>     and Solaris, and works with OMPI. Possibly the same with Jumpshot.
>>>     VampirTrace instrumentation is already in OMPI, but then you need
>>>     to figure out the analysis-tool part. (I think the Vampir GUI tool
>>>     requires a license, but I'm not sure. Maybe you can convert to TAU,
>>>     which is probably available for free download.)
>>>
>>>     Anyhow, I don't even know if that sort of thing fits your
>>>     requirements. Just an idea.
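PS: based on Gus's recipe quoted above, this is the rough sequence I plan to run on the master node to make logins to the compute nodes passwordless. It is only my sketch, assuming RSA keys (the HOWTO above uses DSA) and that ssh-copy-id is installed; please correct me if it is wrong:

    # generate a key pair for the user that launches mpirun -- only needed
    # once; press Enter at the passphrase prompts to leave it empty
    ssh-keygen -t rsa

    # copy the public key to every compute node, including the new one
    ssh-copy-id root@192.168.45.65
    ssh-copy-id root@192.168.67.241

    # test: this should print the remote hostname without asking for a
    # password
    ssh root@192.168.67.241 hostname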