[OMPI users] ssh MPI and program tests
I have compiled OpenMPI 1.3.1 on Debian amd64 lenny with icc/ifort (10.1.015) and libnuma. Tests passed:

ompi_info | grep libnuma
    MCA maffinity: libnuma (MCA v2.0, API v2.0)

ompi_info | grep maffinity
    MCA maffinity: first_use (MCA as above)
    MCA maffinity: libnuma (as above)

Then I compiled a molecular dynamics package, amber10, in parallel, without error signals, but I am having problems testing the parallel amber installation. amber10 was configured as

./configure_amber -openmpi -nobintray ifort

just as I used before with OpenMPI 1.2.6. Could you say whether the -openmpi flag should be changed?

cd tests
export DO_PARALLEL='mpirun -np 4'
make test.parallel.MM < /dev/null

cd cytosine && ./Run.cytosine
The authenticity of host deb64 (which is the hostname) (127.0.1.1) can't be established.
RSA fingerprint ...
connecting ?

I stopped the ssh daemon, whereupon the tests were interrupted because deb64 (i.e., the machine itself) could no longer be accessed. Further attempts under these conditions failed for the same reason. Now sshing to deb64 is no longer possible: port 22 is closed. In contrast, sshing from deb64 to other computers works passwordless. No such problems arose at the time of amd64 etch with the same ssh configuration, same compilers, and OpenMPI 1.2.6.

I am here because the warning from the amber site is that I should learn how to use my MPI installation. Therefore, if there is any clue...

thanks
francesco pietra
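The stall described above happens when the ssh launcher hits the interactive host-key prompt. One way to avoid it is to accept the host key once, non-interactively, before running the parallel tests. A sketch, assuming standard OpenSSH tooling (the host name "deb64" is taken from the thread):

```shell
# Record deb64's RSA host key once, so later ssh (and mpirun-driven ssh)
# sessions are not stopped by the "authenticity of host ... can't be
# established" prompt.
ssh-keyscan -t rsa deb64 >> ~/.ssh/known_hosts

# Quick check: this should print the date with no prompt and no password.
ssh deb64 date
```

After this, `mpirun` should be able to launch on deb64 without waiting on an interactive answer.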
Re: [OMPI users] ssh MPI and program tests
You might first try and see if you can run something other than amber with your new installation. Make sure you have PATH and LD_LIBRARY_PATH set correctly on the remote node, or add --prefix to your mpirun command line. Also, did you remember to install the OMPI 1.3 libraries on the remote nodes?

One thing I see below is that host deb64 was resolved to the loopback interface - was that correct? Seems unusual - even if you are on that host, it usually would resolve to some public IP address.

On Apr 6, 2009, at 8:51 AM, Francesco Pietra wrote:
> [original message quoted in full]

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
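The loopback question raised above can be checked directly: Debian installers typically map the machine's own hostname to 127.0.1.1 in /etc/hosts. The sketch below simulates parsing such an entry (the "deb64" line is a hypothetical hosts-file entry; on a real system you would inspect /etc/hosts itself, e.g. with `getent hosts deb64`):

```shell
# Hypothetical Debian-style /etc/hosts entry for the host "deb64".
hosts_entry="127.0.1.1   deb64"

# Extract the address that local name resolution would return for deb64.
addr=$(printf '%s\n' "$hosts_entry" | awk '$2 == "deb64" {print $1}')
echo "deb64 -> $addr"

# 127.0.1.1 is fine for a single-host run, but other nodes cannot reach
# this machine at that address; a LAN address would be needed instead.
```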
Re: [OMPI users] ssh MPI and program tests
Hi Francesco

Did you try to run examples/connectivity_c.c, or examples/hello_c.c before trying amber? They are in the directory where you untarred the OpenMPI tarball. It is easier to troubleshoot possible network and host problems with these simpler programs.

Also, to avoid confusion, you may use a full path name to mpirun, in case you have other MPI flavors on your system. Often the mpirun your path is pointing to is not what you may think it is.

I don't know if you want to run on amd64 alone (master node?) or on a cluster. In any case, you may use a list of hosts or a hostfile on the mpirun command line, to specify where you want to run. Do "/full/path/to/openmpi/bin/mpirun --help" for details.

I am not familiar with amber, but how does it find your openmpi libraries and compiler wrappers? Don't you need to give it the paths during configuration, say,

./configure_amber -openmpi=/full/path/to/openmpi

or similar?

I hope this helps.
Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-

Francesco Pietra wrote:
> [original message quoted in full]
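Gus's suggestion, in concrete commands (the install prefix /usr/local matches what is reported later in the thread; the tarball path is a placeholder):

```shell
# Build one of the OpenMPI example programs with the wrapper compiler
# from the installation being tested, then run it with that same mpirun.
cd /path/to/openmpi-1.3.1/examples
/usr/local/bin/mpicc -o hello_c hello_c.c
/usr/local/bin/mpirun -np 2 ./hello_c
```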
Re: [OMPI users] ssh MPI and program tests
On Mon, Apr 6, 2009 at 5:21 PM, Gus Correa wrote:
> Hi Francesco
>
> Did you try to run examples/connectivity_c.c,
> or examples/hello_c.c before trying amber?
> They are in the directory where you untarred the OpenMPI tarball.
> It is easier to troubleshoot possible network and host problems
> with these simpler programs.

I have found the "examples". Should they be compiled? How? This is my only question here. What's below is info. Although amber parallel would not have compiled with a faulty openmpi, I'll run the openmpi tests as soon as I understand how.

> Also, to avoid confusion, you may use a full path name to mpirun,
> in case you have other MPI flavors in your system.
> Often times the mpirun your path is pointing to is not what you
> may think it is.

which mpirun
/usr/local/bin/mpirun

There is no other accessible MPI. (One application, DOT2, has mpich, but it is a static compilation; DOT2 parallelization requires that the computer knows itself, i.e. "ssh hostname date" should give the date passwordless.) The reported issues in testing amber have destroyed this situation: now deb64 has port 22 closed, even to itself.

> I don't know if you want to run on amd64 alone (master node?)
> or on a cluster.
> In any case, you may use a list of hosts or a hostfile
> on the mpirun command line, to specify where you want to run.

With amber I use the parallel computer directly and the amber installation is chown to me. The ssh connection, in this case, only serves to get files from, or send files to, my desktop.

In my .bashrc:

(for amber)
MPI_HOME=/usr/local
export MPI_HOME

(for openmpi)
if [ "$LD_LIBRARY_PATH" ] ; then
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH'/usr/local/lib"
else
export LD_LIBRARY_PATH="/usr/local/lib"
fi

There is also

MPICH_HOME=/usr/local
export MPICH_HOME

This is for DOCK, which, with this env variable, accepts openmpi (at least it was so with v1.2.6). The intel compilers (ifort and icc) are sourced in both my .bashrc and the root .bashrc.
Thanks and apologies for my low level in these affairs. It is the first time I am faced with such problems; with amd64, the same intel compilers, and openmpi 1.2.6, everything was in order.

francesco

> [remainder of Gus's message and the original report quoted in full]
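Note that the LD_LIBRARY_PATH stanza quoted in the message above joins the old value and /usr/local/lib with a stray single quote and no ":" separator, which silently corrupts the variable. A corrected sketch:

```shell
# Append /usr/local/lib to LD_LIBRARY_PATH: note the matched double
# quotes and the ":" separator (the quoted original has a stray ' and
# no colon, so the resulting value contains a literal quote character).
if [ -n "$LD_LIBRARY_PATH" ]; then
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib"
else
    export LD_LIBRARY_PATH="/usr/local/lib"
fi
echo "$LD_LIBRARY_PATH"
```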
Re: [OMPI users] ssh MPI and program tests
Hi Francesco, list

Francesco Pietra wrote:
> I have found the "examples". Should they be compiled? how? This is my
> only question here.

cd examples/
/full/path/to/openmpi/bin/mpicc -o connectivity_c connectivity_c.c

Then run it with, say:

/full/path/to/openmpi/bin/mpirun -host {whatever_hosts_you_want} -n {as_many_processes_you_want} connectivity_c

Likewise for hello_c.c.

> which mpirun
> /usr/local/bin/mpirun

Did you install OpenMPI in /usr/local? When you do "mpirun -help", do you see "mpirun (Open MPI) 1.3"? How about the output of "orte_info"? Does it show your Intel compilers, etc?

I ask because many Linux distributions come with one or more flavors of MPI (OpenMPI, MPICH, LAM, etc), some compilers also do (PGI for instance), some tools (Intel MKL?) may also have their own MPI, and you end up with a bunch of MPI commands on your path that may produce a big mixup. This is a pretty common problem that affects new users on this list, on the MPICH list, on clustering lists, etc. The error messages often don't help find the source of the problem, and people spend a lot of time trying to troubleshoot the network, etc., when it is often just a path problem.

So this is why, when you begin, you may want to use full path names, to avoid confusion. After the basic MPI functionality is working, then you can go and fix your path chain, and rely on it.

> there is no other accessible MPI [...] The reported issues in testing
> amber have destroyed this situation: now deb64 has port 22 closed, even
> to itself.

Have you tried to reboot the master node, to see if it comes back to the original ssh setup? You need ssh to be functional to run OpenMPI code, including the tests above.

> With amber I use the parallel computer directly and the amber
> installation is chown to me. The ssh connection, in this case, only
> serves to get files from, or send files to, my desktop.

It is unclear to me what you mean by "the parallel computer directly". Can you explain better which computers are in this game? Your desktop and a cluster perhaps? Are they both Debian 64 Linux? Where do you compile the programs? Where do you want to run the programs?

> In my .bashrc:
> [MPI_HOME and LD_LIBRARY_PATH settings quoted in full]

Is this on your desktop or on the "parallel computer"?

> There is also MPICH_HOME=/usr/local [...] this is for DOCK, which, with
> this env variable, accepts openmpi (at least it was so with v 1.2.6)

Oh, well, it looks like there is MPICH already installed in /usr/local. So this may be part of the confusion, the path confusion I referred to. I would suggest installing OpenMPI in a different directory, using the --prefix option of the OpenMPI configure script. Do "configure --help" for details about all the configuration options.

To me it doesn't look like the problem is related to the new version of OpenMPI. Try the test programs with full path names first. It may not solve the problem, but it may clarify things a bit.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-
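Gus's --prefix suggestion, sketched. The prefix and the choice of Intel compiler variables are examples; any empty directory outside /usr/local would serve:

```shell
# Configure OpenMPI into its own directory so its files cannot be
# confused with the MPICH files already under /usr/local.
./configure --prefix=/opt/openmpi-1.3.1 CC=icc CXX=icpc F77=ifort FC=ifort
make all install

# From then on, call this installation by full path:
/opt/openmpi-1.3.1/bin/mpirun --help
```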
Re: [OMPI users] ssh MPI and program tests
Hi Gus:

Partial quick answers below. I have reestablished the ssh connection, so tomorrow I'll run the tests. Everything that relates to running amber is on the "parallel computer", where I have access to everything.

On Mon, Apr 6, 2009 at 7:53 PM, Gus Correa wrote:
> [compile/run instructions for the examples quoted in full]
>
> Did you install OpenMPI on /usr/local ?
> When you do "mpirun -help", do you see "mpirun (Open MPI) 1.3"?

mpirun -help
mpirun (Open MPI) 1.3.1

on the 1st line; then the options follow.

> How about the output of "orte_info" ?

orte_info was not installed. See below what has been installed.

> Does it show your Intel compilers, etc?

I guess so, otherwise amber would not have compiled, but I don't know the commands to prove it. The intel compilers are on the path:

/opt/intel/cce/10.1.015/bin:/opt/intel/fce/10.1.015/bin

and the mkl are sourced in .bashrc.

> [discussion of mixed MPI flavors and full path names quoted in full]
>
> Have you tried to reboot the master node, to see if it comes back
> to the original ssh setup?
> You need ssh to be functional to run OpenMPI code,
> including the tests above.
>
> It is unclear to me what you mean by "the parallel computer directly".
> Can you explain better which computers are in this game?
> Your desktop and a cluster perhaps?
> Are they both Debian 64 Linux?
> Where do you compile the programs?
> Where do you want to run the programs?
>
> Is this on your desktop or on the "parallel computer"?

On both "parallel computers". (From my desktop I ssh to two uma-type dual-opteron "parallel computers"; only one was active when the "test" problems arose.) While the (ten-year-old) desktop is i386, both other machines are amd64, i.e., all debian lenny. I prepare the input files on the i386 and use it also as storage for backups. The "parallel computer" has only the X server and a minimal window for the two-dimensional graphics of amber. The other parallel computer has a GeForce 6600 card with GLSL support, which I use to elaborate graphically the outputs of the numerical computations (using VM
Re: [OMPI users] ssh MPI and program tests
Hi Francesco

See answers inline.

Francesco Pietra wrote:
> mpirun -help
> mpirun (Open MPI) 1.3.1
> on the 1st line; then the options follow.

Ok, it looks like you installed OpenMPI 1.3.1 with the default "--prefix", which is /usr/local.

> orte_info was not installed. See below what has been installed.

Sorry, my fault. I meant ompi_info (not orte_info). Please try ompi_info or "ompi_info --config". It will tell you the compilers used to build OpenMPI, etc. I presume all of this is being done on the "parallel computer", i.e., on one of the AMD64 Debian systems, right?

> The intel compilers are on the path:
> /opt/intel/cce/10.1.015/bin:/opt/intel/fce/10.1.015/bin
> and the mkl are sourced in .bashrc.

Again, all on the AMD64 system, right?

> [earlier discussion of MPI flavors, ssh, and the .bashrc settings
> quoted in full]

> On both "parallel computers". (From my desktop I ssh to two uma-type
> dual-opteron "parallel computers"; only one was active when the "test"
> problems arose.) While the (ten-year-old) desktop is i386, both other
> machines are amd64, i.e., all debian lenny. I prepare the input files
> on the i386 and use it also as storage for backups.

So, you only use your i386 desktop to ssh to
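The checks Gus asks for, as commands. Full paths are used per his earlier advice; /usr/local is where this thread's OpenMPI reports itself installed:

```shell
# Confirm which OpenMPI answers, and what it was built with.
/usr/local/bin/mpirun --version                        # expect: mpirun (Open MPI) 1.3.1
/usr/local/bin/ompi_info --config                      # configure-time options and compilers
/usr/local/bin/ompi_info | grep -i compiler            # just the compiler lines
```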
Re: [OMPI users] ssh MPI and program tests
Hi Gus:

I should have made clear at the beginning that on the Zyxel router (connected to the Internet by a dynamic IP from the provider) there are three computers. Their host names:

deb32 (desktop, debian i386)
deb64 (multisocket, debian amd64 lenny)
tya64 (multisocket, debian amd64 lenny)

The three are ssh-interconnected passwordless for the same user (myself). I never established connections as root because I have direct access to all three computers. So, if I slogin as user, a passwordless connection is established. If I try to slogin as root, it says that the authenticity of the host to which I intended to connect can't be established, RSA key fingerprint ... Connect?

Moreover, I appended to the pub keys known to deb64 those that deb64 had sent to either deb32 or tya64, whereby I can command

ssh 192.168.#.## date (where the numbers stand for the hostname)

With certain programs (conceived for batch runs), the execution on deb64 is launched from deb32.

I copied /examples to my deb64 home, chowned to me, compiled as user, and ran "connectivity" as user. (I have not compiled in the openmpi directory, as that belongs to root, while ssh has been adjusted for me as user.) Running as user in my home:

/usr/local/bin/mpirun -deb64 -1 connectivity_c 2>&1 | tee n=1.connectivity.out

it asked to add the host (itself) to the list of known hosts (on repeating the command, that was no longer asked). The unabridged output:

===
[deb64:03575] procdir: /tmp/openmpi-sessions-francesco@deb64_0/38647/0/0
[deb64:03575] jobdir: /tmp/openmpi-sessions-francesco@deb64_0/38647/0
[deb64:03575] top: openmpi-sessions-francesco@deb64_0
[deb64:03575] tmp: /tmp
[deb64:03575] mpirun: reset PATH: /usr/local/bin:/usr/local/mcce/bin:/opt/intel/cce/10.1.015/bin:/opt/intel/fce/10.1.015/bin:/home/francesco/gmmx06:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/amber10/exe:/usr/local/dock6/bin
[deb64:03575] mpirun: reset LD_LIBRARY_PATH: /usr/local/lib:/opt/intel/mkl/10.0.1.014/lib/em64t:/opt/intel/cce/10.1.015/lib:/opt/intel/fce/10.1.015/lib:/usr/local/lib:/opt/acml4.1.0/gfortran64_mp_int64/lib
[deb64:03583] procdir: /tmp/openmpi-sessions-francesco@deb64_0/38647/0/1
[deb64:03583] jobdir: /tmp/openmpi-sessions-francesco@deb64_0/38647/0
[deb64:03583] top: openmpi-sessions-francesco@deb64_0
[deb64:03583] tmp: /tmp
[deb64:03575] [[38647,0],0] node[0].name deb64 daemon 0 arch ffc91200
[deb64:03575] [[38647,0],0] node[1].name deb64 daemon 1 arch ffc91200
[deb64:03583] [[38647,0],1] node[0].name deb64 daemon 0 arch ffc91200
[deb64:03583] [[38647,0],1] node[1].name deb64 daemon 1 arch ffc91200
--
mpirun was unable to launch the specified application as it could not find an executable:

Executable: -e
Node: deb64

while attempting to start process rank 0.
--
[deb64:03575] sess_dir_finalize: job session dir not empty - leaving
[deb64:03575] sess_dir_finalize: proc session dir not empty - leaving
orterun: exiting with status -123
[deb64:03583] sess_dir_finalize: job session dir not empty - leaving
=

I have changed the command, setting 4 for n and giving the full path to the executable "connectivity_c", to no avail. I do not understand the message "Executable: -e" in the out file and feel stupid enough in this circumstance. The ssh is working for slogin, and "ssh deb64 date" gives the date passwordless, both before and after the "connectivity" run, i.e., deb64 knew, and knows, itself.

The output of ompi_info between the xxx lines should clarify your other questions.

xxx
Package: Open MPI root@deb64 Distribution
Open MPI: 1.3.1
Open MPI SVN revision: r20826
Open MPI release date: Mar 18, 2009
Open RTE: 1.3.1
Open RTE SVN revision: r20826
Open RTE release date: Mar 18, 2009
OPAL: 1.3.1
OPAL SVN revision: r20826
OPAL release date: Mar 18, 2009
Ident string: 1.3.1
Prefix: /usr/local
Configured architecture: x86_64-unknown-linux-gnu
Configure host: deb64
Configured by: root
Configured on: Fri Apr 3 23:03:30 CEST 2009
Configure host: deb64
Built by: root
Built on: Fri Apr 3 23:12:28 CEST 2009
Built host: deb64
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
Fortran90 bindings size: small
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: /opt/intel/fce/10.1.015/bin/ifort
Fortran77 compiler abs:
Fortran90 compiler: /opt/intel/fce/10.1.015/bin/ifort
Fortran90 compiler abs:
C profiling: yes
C++ profiling: yes
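The key-appending step described in the message above is what ssh-copy-id automates; a sketch, with user and host names taken from the thread:

```shell
# On the desktop deb32, generate a key pair once (empty passphrase for
# passwordless logins), then install the public key on each compute host.
ssh-keygen -t rsa
ssh-copy-id francesco@deb64
ssh-copy-id francesco@tya64

# Verify: each command should print the remote date with no password prompt.
ssh deb64 date
ssh tya64 date
```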
Re: [OMPI users] ssh MPI and program tests
On Tue, 2009-04-07 at 11:39 +0200, Francesco Pietra wrote:
> [description of the three hosts and the failed connectivity_c run
> quoted in full]

The easiest setup is for the executable to be accessible on all nodes, either copied or on a shared filesystem. Is that the case here? (I haven't read the whole thread, so apologies if this has already been covered.)
Re: [OMPI users] ssh MPI and program tests
Hi

What are the options "-deb64" and "-1" you are passing to mpirun?

> /usr/local/bin/mpirun -deb64 -1 connectivity_c 2>&1 | tee n=1.connectivity.out

I don't think these are legal options for mpirun (at least they don't show up in `man mpirun`), and I think you should add "-n 4" (for 4 processes). Furthermore, if you want to specify a host, you have to add "-host hostname1"; if you want to specify several hosts, you have to add "-host hostname1,hostname2,hostname3" (no spaces around the commas).

Jody

On Tue, Apr 7, 2009 at 11:39 AM, Francesco Pietra wrote:
> [setup description trimmed; see the previous message]
>
> Running as user in my home
>
> /usr/local/bin/mpirun -deb64 -1 connectivity_c 2>&1 | tee n=1.connectivity.out
>
> it asked to add the host (itself) to the list of known hosts (on
> repeating the command, that was no longer asked). The unabridged output:
>
> ===
> [deb64:03575] procdir: /tmp/openmpi-sessions-francesco@deb64_0/38647/0/0
> [deb64:03575] jobdir: /tmp/openmpi-sessions-francesco@deb64_0/38647/0
> [deb64:03575] top: openmpi-sessions-francesco@deb64_0
> [deb64:03575] tmp: /tmp
> [deb64:03575] mpirun: reset PATH:
> /usr/local/bin:/usr/local/mcce/bin:/opt/intel/cce/10.1.015/bin:/opt/intel/fce/10.1.015/bin:/home/francesco/gmmx06:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/amber10/exe:/usr/local/dock6/bin
> [deb64:03575] mpirun: reset LD_LIBRARY_PATH:
> /usr/local/lib:/opt/intel/mkl/10.0.1.014/lib/em64t:/opt/intel/cce/10.1.015/lib:/opt/intel/fce/10.1.015/lib:/usr/local/lib:/opt/acml4.1.0/gfortran64_mp_int64/lib
> [deb64:03583] procdir: /tmp/openmpi-sessions-francesco@deb64_0/38647/0/1
> [deb64:03583] jobdir: /tmp/openmpi-sessions-francesco@deb64_0/38647/0
> [deb64:03583] top: openmpi-sessions-francesco@deb64_0
> [deb64:03583] tmp: /tmp
> [deb64:03575] [[38647,0],0] node[0].name deb64 daemon 0 arch ffc91200
> [deb64:03575] [[38647,0],0] node[1].name deb64 daemon 1 arch ffc91200
> [deb64:03583] [[38647,0],1] node[0].name deb64 daemon 0 arch ffc91200
> [deb64:03583] [[38647,0],1] node[1].name deb64 daemon 1 arch ffc91200
> --
> mpirun was unable to launch the specified application as it could not
> find an executable:
>
> Executable: -e
> Node: deb64
>
> while attempting to start process rank 0.
> --
> [deb64:03575] sess_dir_finalize: job session dir not empty - leaving
> [deb64:03575] sess_dir_finalize: proc session dir not empty - leaving
> orterun: exiting with status -123
> [deb64:03583] sess_dir_finalize: job session dir not empty - leaving
> ===
>
> I have changed the command, setting 4 for n and giving the full path
> to the executable "connectivity_c", to no avail. I do not understand
> the message "Executable: -e" in the out file and I feel stupid
> enough in this circumstance.
>
> The ssh is working: slogin and "ssh deb64 date" return the date
> passwordless, both before and after the "connectivity" run; i.e.,
> deb64 knew, and knows, itself.
>
> The output of ompi_info below should clarify your other questions.
>
>                 Package: Open MPI root@deb64 Distribution
>                Open MPI: 1.3.1
>   Open MPI SVN revision: r20826
>   Open MPI release date: Mar 18, 2009
>                Open RTE: 1.3.1
>   Open RTE SVN revision: r20826
>   Open RTE release date: Mar 18, 2009
>                    OPAL: 1.3.1
>       OPAL SVN revision: r20826
>       OPAL release date: Mar 18, 2009
>            Ident string: 1.3.1
>                  Prefix: /usr/local
> Configured architecture: x86_64-unknown-linux-gnu
>          Configure host: deb64
>           Configured by: root
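Putting the -host/-n points together, the corrected command line would look like this. A sketch only: mpirun is not executed here (it needs a working Open MPI install); the echo merely assembles the command so its shape is visible.

```shell
# Sketch of a corrected invocation; host names and paths are the
# examples from this thread, not assumptions about your cluster.
HOSTS=deb64        # or deb64,tya64 -- no spaces around the commas
NP=4               # number of processes to start
echo /usr/local/bin/mpirun -host "$HOSTS" -n "$NP" connectivity_c
```

Note that mpirun takes the executable name as its first non-option argument; a stray option like "-deb64" makes it misparse the rest of the line, which is consistent with the "Executable: -e" error above.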
Re: [OMPI users] ssh MPi and program tests
Hi Jody:

I should only blame myself: Gustavo's indications were clear, and still I misunderstood them. Since I am testing on one node (where everything is in place):

mpirun -host deb64 -n 4 connectivity_c

Connectivity test on 4 processes PASSED

thanks
francesco

On Tue, Apr 7, 2009 at 12:27 PM, jody wrote:
> Hi
>
> What are the options "-deb64" and "-1" you are passing to mpirun:
>> /usr/local/bin/mpirun -deb64 -1 connectivity_c 2>&1 | tee n=1.connectivity.out
>
> I don't think these are legal options for mpirun (at least they don't
> show up in `man mpirun`).
> And I think you should add a "-n 4" (for 4 processes).
> Furthermore, if you want to specify a host, you have to add "-host hostname1";
> if you want to specify several hosts, you have to add "-host
> hostname1,hostname2,hostname3" (no spaces around the commas)
>
> Jody
>
> [remainder of the quoted thread trimmed]
Re: [OMPI users] ssh MPi and program tests
* Francesco Pietra [2009 04 06, 16:51]:
> cd cytosine && ./Run.cytosine
> The authenticity of host deb64 (which is the hostname) (127.0.1.1)
> can't be established.
> RSA fingerprint .
> connecting ?

This is a warning from ssh, not from Open MPI; it is probably the first time the system tries to connect to itself, and ssh is asking you for confirmation before continuing.

Please note that 127.0.1.1 seems quite strange to me, since the 'standard' IP for localhost is 127.0.0.1. You may want to check your /etc/hosts.

> I stopped the ssh daemon, whereby tests were interrupted because deb64
> (i.e., itself) could no more be accessed.

I'm afraid that wasn't a great idea... the ssh daemon is required to receive connections to localhost, and since MPI wants to do just that, stopping sshd won't really fix the issue ;)
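A quick way to check the name mapping described above, assuming a Debian-style /etc/hosts: the Debian installer maps the machine's own hostname to 127.0.1.1 (which matches what mpirun reported for deb64), while localhost stays on 127.0.0.1.

```shell
# Show the loopback entries the resolver will use.
grep -E '^127\.' /etc/hosts

# Show how the machine's own hostname resolves (the address ssh/mpirun
# will see); tolerate the case where it does not resolve at all.
getent hosts "$(hostname)" || echo "note: $(hostname) does not resolve here"
```

If the hostname should instead resolve to a LAN address (e.g. 192.168.x.x), edit the 127.0.1.1 line in /etc/hosts accordingly.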
Re: [OMPI users] ssh MPi and program tests
With amd64 etch, Intel compilers 10, and openmpi 1.2.6 I had no problem compiling amber10. Having changed to amd64 lenny, amber10 no longer passed the installation tests, and I was unable to recompile it. I upgraded openmpi to 1.3.1, which passed all its tests, but again I was unable to recompile amber10. Now I have purged everything Intel (compilers and MKL) and am installing version 11; I'll recompile openmpi 1.3.1 against it. If amber10 still refuses to compile, I'll abandon Intel for the GNU compilers and math libraries.

I'll come to your questions. There was no misprint in what I wrote, but at the moment I am unable to do better. All issues about ssh were resolved. Actually, there was no issue: I created the issues, and I apologize for that.

thanks
francesco

On Wed, Apr 8, 2009 at 10:28 AM, Marco wrote:
> * Francesco Pietra [2009 04 06, 16:51]:
>> cd cytosine && ./Run.cytosine
>> The authenticity of host deb64 (which is the hostname) (127.0.1.1)
>> can't be established.
>> RSA fingerprint .
>> connecting ?
>
> This is a warning from ssh, not from OpenMPI; probably it is the first
> time the system tries to connect to itself, and is asking you a
> confirmation to continue.
>
> [remainder trimmed]
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users