Re: [OMPI users] Problems installing in Cygwin - Problem with GCC 3.4.4

2008-11-04 Thread Terry Frankcombe
> *** Fortran 90/95 compiler
> checking whether we are using the GNU Fortran compiler... yes
> checking whether g95 accepts -g... yes
> checking if Fortran compiler works... yes
> checking whether g95 and g95 compilers are compatible... no
> configure: WARNING: *** Fortran 77 and Fortran 90 compilers are not
> link compatible
> configure: WARNING: *** Disabling MPI Fortran 90/95 bindings


OK, for that one I think you need to dig into config.log and see exactly
what's failing and why.

I can't speak for the developers, but it seems slightly concerning that
configure thinks it's using "the GNU Fortran compiler".  I feel sure the
GNU people would object to g95 being called that.
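A minimal sketch of that digging, assuming a standard Open MPI 1.2.x build
tree and that g95 really is the compiler you want for both Fortran front
ends (the install prefix below is only an example):

grep -n -B 2 -A 20 "Fortran 77 and Fortran 90" config.log   # find the link-compatibility test
./configure F77=g95 FC=g95 --prefix=/usr/local/openmpi      # force the same compiler for both front ends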



[OMPI users] question regarding the configuration of multiple nics for openmpi

2008-11-04 Thread Olivier Marsden

Hello,
I am configuring a cluster with multiple NICs for use with Open MPI.
I have not found much information on the best way of setting up
my network for Open MPI. At the moment I have a pretty standard setup
with a single hostname and a single IP address for each node.
Could someone advise me on the following points?
- For each node, should I have the second IP on the same subnet as the
first, or not?

- Does Open MPI need separate hostnames for each IP?

If there is a webpage describing the best way to configure such a
network, that would be great.

Many thanks,

Olivier Marsden


[OMPI users] mca btl_openib_flags default value

2008-11-04 Thread Gilbert Grosdidier
Bonjour,

 Working with OpenMPI 1.2.5 on RHEL5.2, I noticed a weird default value for
this MCA parameter, as printed by ompi_info:

MCA btl: parameter "btl_openib_flags" (current value: "54")
BTL flags, added together: SEND=1, PUT=2, GET=4 (cannot be 0)

 Is this expected or not?
I could understand any value between 1 and 7, but what does 54 mean, please?
Does it behave like 6 once the unexpected bits are removed?

 Thanks, Gilbert

-- 
*-*
  Gilbert Grosdidier gilbert.grosdid...@in2p3.fr
  LAL / IN2P3 / CNRS Phone : +33 1 6446 8909
  Faculté des Sciences, Bat. 200 Fax   : +33 1 6446 8546
  B.P. 34, F-91898 Orsay Cedex (FRANCE)
 -


[OMPI users] OpenMPI-1.2.7 + SGE

2008-11-04 Thread Sangamesh B
Hi all,

 In a Rocks-5.0 cluster, OpenMPI-1.2.6 comes by default. I guess it
gets installed through an RPM.

# /opt/openmpi/bin/ompi_info | grep gridengine
 MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.6)
 MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.6)

 Now I have to install OpenMPI-1.2.7. "./configure --help | grep
gridengine" doesn't show anything.

 In such a scenario, how can OpenMPI-1.2.7 be integrated with SGE?

After achieving this integration:

 1. Is it possible to use the -machinefile option in the SGE script?

   E.g.:
 #$ -pe orte 4

/opt/openmpi/bin/mpirun -machinefile $TMPDIR/machines -np
4 

 2. If "qstat -f" is showing 2 slots on node1 and 2 slots on node2
for a 4 process openmpi job, then will these processes run exactly  on
those nodes?

# qconf -sp orte
pe_name            orte
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min


Thank you,
Sangamesh
Consultant - HPC


Re: [OMPI users] question regarding the configuration of multiple nics for openmpi

2008-11-04 Thread Gus Correa

Hi Olivier and list

I presume you are talking about Ethernet or GigE.
The basic information on  how to launch jobs is on the OpenMPI FAQ pages:

http://www.open-mpi.org/faq/?category=tcp
http://www.open-mpi.org/faq/?category=tcp#tcp-selection

Here is what I did on our toy/test cluster made of salvaged computers.

1) I use the ROCKS cluster distribution, which makes some of the steps
described below more automatic.
However, ROCKS is not needed for this.

2) I have actually three private networks, but you may use, say, two,
if your motherboards have dual Ethernet (or GigE) ports.
Each node has three NICs, which Linux recognized and activated as eth0, 
eth1, eth2.


Make sure you and Linux agree on which port is eth0, eth1, etc.
This may be a bit tricky: the kernel seems to have its own wisdom and
mood when it assigns the port names.
ping, lspci, ifconfig, ifup, ifdown, and ethtool are your friends here,
and can help you sort out the correct port-name map.
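For example, a rough sketch of that detective work (interface names and
the ping target are just placeholders):

lspci | grep -i ethernet     # list the Ethernet controllers the kernel sees
ifconfig eth0                # current IP/MAC assigned to eth0
ethtool -i eth0              # driver and bus info for eth0
ethtool -p eth0 10           # blink the LED on the eth0 port for 10 seconds
ping -c 3 192.168.1.2        # check that the port reaches the subnet you expect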

3) For a modest number of nodes (fewer than 8) you can buy inexpensive
SOHO-type GigE switches, one for each network, for about $50 apiece.
(This is what I did.) For more nodes you would need larger switches.
Use Cat5e or Cat6 Ethernet cables and connect the separate networks
using the correct ports on the nodes and switches.
Well, you may have done that already ...

4) On RHEL or Fedora the essential information is in
/etc/sysconfig/network-scripts/ifcfg-eth[0,1,2]
on each of your cluster nodes.
Other Linux distributions may have equivalent files.
You need to edit these files to insert the correct IP address, netmask,
and MAC address.


For instance, if you have fewer than 254 nodes, you can define private
networks like this:

net1) 192.168.1.0 netmask 255.255.255.0  (using the eth0 port)
net2) 192.168.2.0 netmask 255.255.255.0 (using the eth1 port)
net3) 192.168.3.0 netmask 255.255.255.0 (using the eth2 port)
etc.

Here is an example:

[node1] $ cat /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
HWADDR=(put your eth0 port MAC address here)
IPADDR=192.168.1.1   ( ... 192.168.1.2 on node2, etc)
NETMASK=255.255.255.0
BOOTPROTO=none
ONBOOT=yes

[node1] $ cat /etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth1
HWADDR=(put your eth1 port MAC address here)
IPADDR=192.168.2.1 ( ... 192.168.2.2 on node2, etc)
NETMASK=255.255.255.0
BOOTPROTO=none
ONBOOT=yes
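After editing the files, applying and testing the settings might look
like this (the addresses are the example values above):

ifdown eth1; ifup eth1       # restart the second port on node1
ifconfig eth1                # confirm 192.168.2.1 / 255.255.255.0 is active
ping -c 3 192.168.2.2        # from node1, check that node2 answers on the second network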


5) To launch the Open MPI program "my_prog"
over the 192.168.2.0 (i.e. "eth1") network with, say, 8 processes, do:

mpiexec --mca btl_tcp_if_include eth1 -n 8 my_prog

(Good if your 192.168.1.0 (eth0) network is already used for I/O, 
control, etc.)


To be more aggressive and use both networks,
192.168.1.0 ("eth0") and 192.168.2.0 ("eth1"), do:

mpiexec --mca btl_tcp_if_include eth0,eth1 -n 8 my_prog
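The same selection can also be done by exclusion, and ompi_info shows the
current TCP BTL settings; a small sketch (interface names as above):

mpiexec --mca btl_tcp_if_exclude lo,eth0 -n 8 my_prog
ompi_info --param btl tcp | grep tcp_if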

***

Works for me.
I hope it helps!

Gus Correa
PS - More answers below.

--
-
Gustavo J. Ponce Correa, PhD - Email: g...@ldeo.columbia.edu
Lamont-Doherty Earth Observatory - Columbia University
P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
-


Olivier Marsden wrote:


Hello,
I am configuring a cluster with multiple NICs for use with Open MPI.
I have not found much information on the best way of setting up
my network for Open MPI. At the moment I have a pretty standard setup
with a single hostname and a single IP address for each node.
Could someone advise me on the following points?
- For each node, should I have the second IP on the same subnet as the
first, or not?


No, use separate subnets.



- Does Open MPI need separate hostnames for each IP?


No, same hostname, but different subnets and different IPs for each port 
on a given host.




If there is a webpage describing the best way to configure such a
network, that would be great.


Yes, to some extent.
Look at the OpenMPI FAQ:
http://www.open-mpi.org/faq/?category=tcp
http://www.open-mpi.org/faq/?category=tcp#tcp-selection


Many thanks,

Olivier Marsden





Re: [OMPI users] OpenMPI-1.2.7 + SGE

2008-11-04 Thread Reuti

Hi,

On 04.11.2008, at 16:54, Sangamesh B wrote:


Hi all,

 In a Rocks-5.0 cluster, OpenMPI-1.2.6 comes by default. I guess it
gets installed through an RPM.

# /opt/openmpi/bin/ompi_info | grep gridengine
 MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.6)
 MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.6)


 Now I have to install OpenMPI-1.2.7. "./configure --help | grep
gridengine" doesn't show anything.

 In such a scenario, how can OpenMPI-1.2.7 be integrated with SGE?


Only for 1.3 must it be compiled with --with-sge; this is not needed in 1.2.x.


After achieving this integration:

 1. Is it possible to use the -machinefile option in the SGE script?

   E.g.:
 #$ -pe orte 4

/opt/openmpi/bin/mpirun -machinefile $TMPDIR/machines -np
4 


You don't need this. Open MPI will use the correct cores on its own.
Just specify: mpirun -np $NSLOTS mypgm
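A minimal job script along those lines might look like this (the PE name
and program name are the ones from your own example; everything else is
just a sketch):

#!/bin/bash
#$ -N ompi_test
#$ -cwd
#$ -pe orte 4
# SGE grants the slots; a tight Open MPI/SGE integration picks them up
# from the environment, so no -machinefile is needed:
/opt/openmpi/bin/mpirun -np $NSLOTS ./mypgm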




 2. If "qstat -f" is showing 2 slots on node1 and 2 slots on node2
for a 4 process openmpi job, then will these processes run exactly  on
those nodes?


qstat only shows what has been granted to the job. With a bad
configuration you could start all processes on the master node of the
parallel job and leave the slaves idling. Open MPI will do the right
thing on its own.


-- Reuti



# qconf -sp orte
pe_name            orte
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min


Thank you,
Sangamesh
Consultant - HPC




Re: [OMPI users] mca btl_openib_flags default value

2008-11-04 Thread Jeff Squyres
FWIW, we fixed this help message in the upcoming v1.3.  The new help  
message is:


mca:btl:openib:param:btl_openib_flags:help:BTL bit flags (general  
flags: SEND=1, PUT=2, GET=4, SEND_INPLACE=8; flags only used by the  
"dr" PML (ignored by others): ACK=16, CHECKSUM=32, RDMA_COMPLETION=128)


So 54 corresponds to PUT, GET, ACK, CHECKSUM (SEND is implied; IIRC
it's somewhat silly that we have SEND as a flag because we assume that
all BTLs can do it).
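As a sanity check, 54 decomposes exactly that way; here is a small bash
sketch using the flag values from the help string above:

flags=54
for f in 1:SEND 2:PUT 4:GET 8:SEND_INPLACE 16:ACK 32:CHECKSUM 128:RDMA_COMPLETION; do
  val=${f%%:*}; name=${f##*:}
  (( flags & val )) && echo "$name ($val)"
done
# lists PUT (2), GET (4), ACK (16), CHECKSUM (32); 2+4+16+32 = 54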


...although I see that the v1.3 message doesn't show the HETEROGENEOUS  
flag, which is 256.  /me goes to fix that...



On Nov 4, 2008, at 8:57 AM, Gilbert Grosdidier wrote:


Bonjour,

Working with OpenMPI 1.2.5 on RHEL5.2, I noticed a weird default value
for this MCA parameter, as printed by ompi_info:

MCA btl: parameter "btl_openib_flags" (current value: "54")
BTL flags, added together: SEND=1, PUT=2, GET=4 (cannot be 0)

Is this expected or not?
I could understand any value between 1 and 7, but what does 54 mean, please?
Does it behave like 6 once the unexpected bits are removed?

Thanks, Gilbert

--
*-*
 Gilbert Grosdidier gilbert.grosdid...@in2p3.fr
 LAL / IN2P3 / CNRS Phone : +33 1 6446 8909
 Faculté des Sciences, Bat. 200 Fax   : +33 1 6446 8546
 B.P. 34, F-91898 Orsay Cedex (FRANCE)
-



--
Jeff Squyres
Cisco Systems




[OMPI users] OK, got it installed, but... can't find libraries

2008-11-04 Thread PattiMichelle
I went through the compile process with Open MPI twice, using g95 and
gfortran (the default install on my openSuSE 11.0 setup).  It seems to
have trouble finding the libraries, in particular libopen-pal.so.0.
I've seen shared-library problems with some x86_64 packages that I
contribute to on SourceForge, and I'm wondering if this is a known
problem with Open MPI.  I'm using a TYAN 32-processor SMP machine with
openSuSE 11.0 installed.  (I tried copying the shared object file(s) to
/usr/lib and /usr/lib64.)

This is STDERR output, the first time with g95 and then with gfortran:

linux-pouh:/usr/local/openmpi-1.2.8 # ./configure
FC=/usr/local/g95-install64bi/bin/x86_64-suse-linux-gnu-g95
--prefix=/usr/local/bin
configure: WARNING:  -fno-strict-aliasing has been added to CFLAGS
configure: WARNING:  -finline-functions has been added to CXXFLAGS
configure: WARNING: *** Did not find corresponding C type
configure: WARNING: *** Fortran 77 and Fortran 90 compilers are not link
compatible
configure: WARNING: *** Disabling MPI Fortran 90/95 bindings
configure: WARNING: On Linux and --with-udapl was not specified
configure: WARNING: Not building the udapl BTL
configure: WARNING: Unknown architecture ... proceeding anyway
configure: WARNING: File locks may not work with NFS.  See the
Installation and
users manual for instructions on testing and if necessary fixing this
linux-pouh:/usr/local/openmpi-1.2.8 # mpif90
mpif90: error while loading shared libraries: libopen-pal.so.0: cannot
open shared object file: No such file or directory
linux-pouh:/usr/local/openmpi-1.2.8 #  

... now try gfortran ...

/usr/local/openmpi-1.2.8 # ./configure  --prefix=/usr/local/bin >
configure_STDIO.txt
configure: WARNING:  -fno-strict-aliasing has been added to CFLAGS
configure: WARNING:  -finline-functions has been added to CXXFLAGS
configure: WARNING: *** Did not find corresponding C type
configure: WARNING: *** Corresponding Fortran 77 type (INTEGER*16) not
supported
configure: WARNING: *** Skipping Fortran 90 type (INTEGER*16)
configure: WARNING: On Linux and --with-udapl was not specified
configure: WARNING: Not building the udapl BTL
configure: WARNING: Unknown architecture ... proceeding anyway
configure: WARNING: File locks may not work with NFS.  See the
Installation and
users manual for instructions on testing and if necessary fixing this
linux-pouh:/usr/local/openmpi-1.2.8 # make all install >
GFortMakeAllInstall_STDIO.txt
mpif90
libtool: install: warning: relinking `mca_maffinity_first_use.la'
libtool: install: warning: relinking `mca_maffinity_libnuma.la'
libtool: install: warning: relinking `mca_paffinity_linux.la'
libtool: install: warning: relinking `libopen-rte.la'



libtool: install: warning: relinking `mca_mpool_rdma.la'
libtool: install: warning: relinking `mca_mpool_sm.la'
libtool: install: warning: relinking `mca_pml_cm.la'
libtool: install: warning: relinking `mca_pml_ob1.la'
libtool: install: warning: relinking `mca_rcache_vma.la'
libtool: install: warning: relinking `mca_topo_unity.la'
linux-pouh:/usr/local/openmpi-1.2.8 # mpif90
mpif90: error while loading shared libraries: libopen-pal.so.0: cannot
open shared object file: No such file or directory
linux-pouh:/usr/local/openmpi-1.2.8 #  
linux-pouh:/usr/local/openmpi-1.2.8 # cd /usr/local/lib
linux-pouh:/usr/local/lib # ls
libmca_common_sm.la        libmpi_cxx.so        libmpi_f77.so.0      libmpi.so.0.0.0       libopen-rte.la
libmca_common_sm.so        libmpi_cxx.so.0      libmpi_f77.so.0.0.0  libopen-pal.la        libopen-rte.so
libmca_common_sm.so.0      libmpi_cxx.so.0.0.0  libmpi.la            libopen-pal.so        libopen-rte.so.0
libmca_common_sm.so.0.0.0  libmpi_f77.la        libmpi.so            libopen-pal.so.0      libopen-rte.so.0.0.0
libmpi_cxx.la              libmpi_f77.so        libmpi.so.0          libopen-pal.so.0.0.0  openmpi
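
A common fix for this kind of error is to make sure the run-time linker can
find the Open MPI install tree; here is a sketch, with the library path
taken from the listing above (adjust it to wherever your --prefix actually
put the lib directory):

export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
mpif90 --showme              # should now print the underlying compiler command line

# or, if your distribution uses /etc/ld.so.conf.d, register the directory
# with the dynamic linker system-wide:
echo /usr/local/lib >> /etc/ld.so.conf.d/openmpi.conf
ldconfig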