Hi Francesco, list

Francesco Pietra wrote:
On Mon, Apr 6, 2009 at 5:21 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
Hi Francesco

Did you try to run examples/connectivity_c.c,
or examples/hello_c.c before trying amber?
They are in the directory where you untarred the OpenMPI tarball.
It is easier to troubleshoot
possible network and host problems
with these simpler programs.

I have found the "examples". Should they be compiled? how? This is my
only question here.

cd examples/
/full/path/to/openmpi/bin/mpicc -o connectivity_c connectivity_c.c

Then run it with, say:

/full/path/to/openmpi/bin/mpirun -host {whatever_hosts_you_want}
-n {as_many_processes_you_want} connectivity_c

Likewise for hello_c.c
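
For instance, to run two processes (the process count and the host
name "deb64" below are just an illustration):

/full/path/to/openmpi/bin/mpicc -o hello_c hello_c.c
/full/path/to/openmpi/bin/mpirun -n 2 ./connectivity_c
/full/path/to/openmpi/bin/mpirun -host deb64,deb64 -n 2 ./hello_c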

What's below is info. Although amber parallel
would not have compiled with a faulty openmpi, I'll run the openmpi
tests as soon as I understand how.

Also, to avoid confusion,
you may want to use a full path name to mpirun,
in case you have other MPI flavors on your system.
Oftentimes the mpirun your path points to is not what you
think it is.


which mpirun
/usr/local/bin/mpirun

Did you install OpenMPI on /usr/local?
When you do "mpirun -help", do you see "mpirun (Open MPI) 1.3"?
How about the output of "ompi_info"?
Does it show your Intel compilers, etc.?
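
For instance, something like this (assuming /usr/local is really where
the "which mpirun" above points to):

/usr/local/bin/mpirun --version
/usr/local/bin/ompi_info | grep -i compiler

The first should report "mpirun (Open MPI) 1.3.1", and the second
should list icc/ifort, if that installation is really your new build.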

I ask because many Linux distributions come with one or more flavors
of MPI (OpenMPI, MPICH, LAM, etc.), some compilers ship one too
(PGI, for instance), and some tools (Intel MKL?) may also bring
their own MPI, so you end up with a bunch of MPI commands
on your path that can produce a big mixup.
This is a pretty common problem that affects new users on this list,
on the MPICH list, on clustering lists, etc.
The error messages often don't help find the source of the problem,
and people spend a lot of time trying to troubleshoot the network,
etc., when it is often just a path problem.

So, this is why, when you begin, you may want to use full path
names, to avoid confusion.
After the basic MPI functionality is working,
you can then fix your path and rely on it.
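
A minimal sketch for your .bashrc, assuming a hypothetical install
prefix /full/path/to/openmpi:

export PATH=/full/path/to/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/full/path/to/openmpi/lib:$LD_LIBRARY_PATH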


There is no other accessible MPI (one application, DOT2, has mpich,
but it is a static compilation). DOT2 parallelization requires that
the computer knows itself, i.e., "ssh hostname date" should return
the date passwordless. The reported issues in testing amber have
destroyed this situation: now deb64 has port 22 closed, even to itself.


Have you tried to reboot the master node, to see if it comes back
to the original ssh setup?
You need ssh to be functional to run OpenMPI code,
including the tests above.
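
A quick check from the master node itself (this assumes Debian's
standard openssh-server init script; adjust to your setup):

ps -e | grep sshd        # is the ssh daemon running at all?
/etc/init.d/ssh start    # as root, if it is not
ssh deb64 date           # should print the date with no password prompt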


I don't know if you want to run on amd64 alone (master node?)
or on a cluster.
In any case, you may use a list of hosts
or a hostfile on the mpirun command line,
to specify where you want to run.
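
A minimal sketch (the file name "my_hosts" and the slot count are
just an illustration):

cat > my_hosts <<EOF
deb64 slots=4
EOF
/full/path/to/openmpi/bin/mpirun --hostfile my_hosts -n 4 ./connectivity_c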

With amber I use the parallel computer directly, and the amber
installation is chowned to me. The ssh connection, in this case, only
serves to get files from, or send files to, my desktop.


It is unclear to me what you mean by "the parallel computer directly".
Can you explain better which computers are in this game?
Your desktop and a cluster perhaps?
Are they both Debian 64 Linux?
Where do you compile the programs?
Where do you want to run the programs?

In my .bashrc:

(for amber)
MPI_HOME=/usr/local
export MPI_HOME

(for openmpi)
if [ "$LD_LIBRARY_PATH" ] ; then
  export LD_LIBRARY_PATH="$LD_LIBRARY_PATH'/usr/local/lib"
else
  export LD_LIBRARY_PATH="/usr/local/lib"
fi


Is this on your desktop or on the "parallel computer"?


There is also

MPICH_HOME=/usr/local
export MPICH_HOME

This is for DOCK, which, with this env variable, accepts openmpi (at
least it was so with v 1.2.6).


Oh, well, it looks like there is MPICH already installed on /usr/local.
So, this may be part of the confusion, the path confusion I referred to.

I would suggest installing OpenMPI in a different directory,
using the --prefix option of the OpenMPI configure script.
Do ./configure --help for details about all the configuration options.
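
For example, something along these lines (the --prefix directory is
just a suggestion, pick whatever you like):

./configure --prefix=/opt/openmpi-1.3.1-intel CC=icc CXX=icpc F77=ifort FC=ifort
make all install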


The intel compilers (ifort and icc) are sourced in both my .bashrc
and root's .bashrc.

Thanks, and apologies for my low level in these affairs. It is the
first time I am faced with such problems; with amd64, the same intel
compilers, and openmpi 1.2.6, everything was in order.


To me it doesn't look like the problem is related to the new version
of OpenMPI.

Try the test programs with full path names first.
It may not solve the problem, but it may clarify things a bit.

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

francesco



Do "/full/path/to/openmpi/bin/mpirun --help" for details.

I am not familiar with amber, but how does it find your openmpi
libraries and compiler wrappers?
Don't you need to give it the paths during configuration,
say,
./configure_amber -openmpi=/full/path/to/openmpi
or similar?

I hope this helps.
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


Francesco Pietra wrote:
I have compiled openmpi 1.3.1 on debian amd64 lenny with icc/ifort
(10.1.015) and libnuma. Tests passed:

ompi_info | grep libnuma
 MCA maffinity: libnuma (MCA v 2.0, API 2.0)

ompi_info | grep maffinity
 MCA maffinity: first_use (MCA as above)
 MCA maffinity: libnuma, as above.

Then I compiled a molecular dynamics package, amber10, in parallel,
without error messages, but I am having problems testing the amber
parallel installation.

amber10 configure was set as:

./configure_amber -openmpi -nobintray ifort

just as I used before with openmpi 1.2.6. Could you say whether the
-openmpi flag should be changed?

cd tests

export DO_PARALLEL='mpirun -np 4'

make test.parallel.MM  < /dev/null

cd cytosine && ./Run.cytosine
The authenticity of host deb64 (which is the hostname) (127.0.1.1)
can't be established.
RSA fingerprint .....
connecting ?

I stopped the ssh daemon, whereby the tests were interrupted because
deb64 (i.e., itself) could no longer be accessed. Further attempts
under these conditions failed for the same reason. Now, sshing to
deb64 is no longer possible: port 22 is closed. In contrast, sshing
from deb64 to other computers works passwordless. No such problems
arose at the time of amd64 etch with the same configuration of ssh,
the same compilers, and openmpi 1.2.6.

I am here because the warning from the amber site is that I should
learn how to use my installation of MPI. Therefore, if there is any
clue ...

thanks
francesco pietra


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
