Re: [OMPI users] Working with a CellBlade cluster

2008-10-28 Thread Gilbert Grosdidier
 You can easily map an application on a blade so that it runs
on CPU 0 (resp. 1) while using the memory bank local to CPU 0 (resp. 1) with:

numactl --cpubind=0 --membind=0 app ...
(resp. numactl --cpubind=1 --membind=1 app ...)
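
 To launch one such instance per socket under mpirun, a small wrapper
script can pick the socket from the local rank. A minimal sketch, assuming
the launcher exports OMPI_COMM_WORLD_LOCAL_RANK (newer Open MPI releases
do; older ones may use a different variable, so check your version):

#!/bin/sh
# bind_socket.sh -- hypothetical wrapper: bind each local rank to one socket.
# OMPI_COMM_WORLD_LOCAL_RANK is assumed to be set by mpirun.
SOCKET=$((OMPI_COMM_WORLD_LOCAL_RANK % 2))
exec numactl --cpubind=$SOCKET --membind=$SOCKET "$@"

Usage would then be: mpirun -np 4 ./bind_socket.sh app ...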

 Hope this helps,  Gilbert.

On Mon, 27 Oct 2008, Lenny Verkhovsky wrote:

> Can you update me with the mapping, or the way to get it from the OS on the
> Cell?
> 
> thanks
> 
> On Thu, Oct 23, 2008 at 8:08 PM, Mi Yan  wrote:
> 
> > Lenny,
> >
> > Thanks.
> > I asked the Cell/BE Linux Kernel developer to get the CPU mapping :) The
> > mapping is fixed in current kernel.
> >
> > Mi
> > "Lenny Verkhovsky" <lenny.verkhov...@gmail.com>
> > Sent by: users-boun...@open-mpi.org
> > 10/23/2008 01:52 PM
> > Please respond to: Open MPI Users
> > To: "Open MPI Users"
> > Subject: Re: [OMPI users] Working with a CellBlade cluster
> >
> > According to https://svn.open-mpi.org/trac/ompi/milestone/Open%20MPI%201.3,
> > very soon; but you can download the trunk version from
> > http://www.open-mpi.org/svn/ and check if it works for you.
> >
> > How can you check the CPU mapping from the OS? My cat /proc/cpuinfo shows
> > very little info:
> > # cat /proc/cpuinfo
> > processor : 0
> > cpu : Cell Broadband Engine, altivec supported
> > clock : 3200.00MHz
> > revision : 48.0 (pvr 0070 3000)
> > processor : 1
> > cpu : Cell Broadband Engine, altivec supported
> > clock : 3200.00MHz
> > revision : 48.0 (pvr 0070 3000)
> > processor : 2
> > cpu : Cell Broadband Engine, altivec supported
> > clock : 3200.00MHz
> > revision : 48.0 (pvr 0070 3000)
> > processor : 3
> > cpu : Cell Broadband Engine, altivec supported
> > clock : 3200.00MHz
> > revision : 48.0 (pvr 0070 3000)
> > timebase : 2666
> > platform : Cell
> > machine : CHRP IBM,0793-1RZ
> >
> >
> >
> > On Thu, Oct 23, 2008 at 3:00 PM, Mi Yan <mi...@us.ibm.com> wrote:
> >
> >Hi, Lenny,
> >
> >So the rank file map will be supported in OpenMPI 1.3? I'm using
> >OpenMPI 1.2.6 and did not find the parameter "rmaps_rank_file_".
> >Do you have an idea when OpenMPI 1.3 will be available? OpenMPI 1.3 has
> >quite a few features I'm looking for.
> >
> >Thanks,
> >
> >Mi
> >"Lenny Verkhovsky" <lenny.verkhov...@gmail.com>
> >Sent by: users-boun...@open-mpi.org
> >10/23/2008 05:48 AM
> >Please respond to: Open MPI Users <us...@open-mpi.org>
> >To: "Open MPI Users" <us...@open-mpi.org>
> >Subject: Re: [OMPI users] Working with a CellBlade cluster
> >
> >
> >Hi,
> >
> >
> >If I understand you correctly, the most suitable way to do it is with the
> >processor affinity (paffinity) support that we have in Open MPI 1.3 and
> >the trunk. However, the OS usually distributes processes evenly between
> >sockets by itself.
> >
> >There is still no formal FAQ, for several reasons, but you can read
> >how to use it in the attached draft (there were a few renamings of the
> >params, so check with ompi_info).
> >
> >Shared memory (sm) is used between processes that share the same machine,
> >and openib is used between different machines (hostnames); no special mca
> >params are needed.
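> >
> >A minimal sketch of how the two pieces fit together (the rank file
> >syntax is from the 1.3 series; the file and host names here are made
> >up, and the parameter names may have changed, so check with ompi_info):
> >
> ># rankfile: pin one rank to each socket of two blades
> >rank 0=blade01 slot=0
> >rank 1=blade01 slot=1
> >rank 2=blade02 slot=0
> >rank 3=blade02 slot=1
> >
> >mpirun -np 4 -rf rankfile -mca btl sm,openib,self ./app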
> >
> >Best Regards
> >Lenny,
> >
> >
> > On Sun, Oct 19, 2008 at 10:32 AM, Gilbert Grosdidier
> ><gro...@mail.cern.ch> wrote:
> >  Working with a CellBlade cluster (QS22), the requirement is to have one
> >  instance of the executable running on each socket of the blade (there
> >  are 2 sockets). The application is of the 'domain decomposition' type,
> >  and each instance is required to often send/receive data with both the
> >  remote blades and the neighbor socket.
> >
> >  Question is: which specification must be used for the mca btl component
> >  to force 1) shmem type messages when communicating with this neighbor
> >  socket, while 2) using openib to communicate with the remote blades?
> >  Is '-mca btl sm,openib,self' suitable for this?
> >
> >  Also, which debug flags could be used to crosscheck that the messages
> >  are _actually_ going thru the right channel for a given channel, please?
> >
> >  We are currently using OpenMPI 1.2.5 shipped with RHEL5.2 (ppc64).
> >  Which version do you think is currently the most optimised for t

Re: [OMPI users] MPI_SUM and MPI_REAL16 with MPI_ALLREDUCE in fortran90

2008-10-28 Thread Julien Devriendt
Yes it is: REAL(kind=16) = REAL*16 = a 16-byte REAL in Fortran, or a long
double in C. That is why I thought MPI_REAL16 should work.


On Mon, 27 Oct 2008, Jeff Squyres wrote:

I dabble in Fortran but am not an expert -- is REAL(kind=16) the same as 
REAL*16?  MPI_REAL16 should be a 16 byte REAL; I'm not 100% sure that 
REAL(kind=16) is the same thing...?



On Oct 23, 2008, at 7:37 AM, Julien Devriendt wrote:



Hi,

I'm trying to do an MPI_ALLREDUCE with quadruple precision real and
MPI_SUM and open mpi does not give me the correct answer (vartemp
is equal to vartored instead of 2*vartored). Switching to double precision
real works fine.
My version of openmpi is 1.2.7, and it has been compiled with ifort v10.1
and icc/icpc at installation.

Here's the simple f90 code which fails:

program test_quad

   implicit none

   include "mpif.h"


   real(kind=16) :: vartored(8),vartemp(8)
   integer   :: nn,nslaves,my_index
   integer   :: mpierror


   call MPI_INIT(mpierror)
   call MPI_COMM_SIZE(MPI_COMM_WORLD,nslaves,mpierror)
   call MPI_COMM_RANK(MPI_COMM_WORLD,my_index,mpierror)

   nn   = 8
   vartored = 1.0_16
   vartemp  = 0.0_16
   print*,"P1 ",my_index,vartored
   call MPI_ALLREDUCE(vartored,vartemp,nn,MPI_REAL16,MPI_SUM,MPI_COMM_WORLD,mpierror)

   print*,"P2 ",my_index,vartemp

   stop

end program test_quad

Any idea why this happens?

Many thanks in advance!

J.



--
Jeff Squyres
Cisco Systems




Re: [OMPI users] MPI_SUM and MPI_REAL16 with MPI_ALLREDUCE in fortran90

2008-10-28 Thread Julien Devriendt

Thanks for your suggestions.
I tried them all (declaring my variables as REAL*16 or REAL(16)) to no 
avail. I still get the wrong answer with my call to MPI_ALLREDUCE.


> I think the KINDs are compiler dependent.  For Sun Studio Fortran, REAL*16
> and REAL(16) are the same thing.  For Intel, maybe it's different.  I don't
> know.  Try running this program:
>
> double precision xDP
> real(16) x16
> real*16 xSTAR16
> write(6,*) kind(xDP), kind(x16), kind(xSTAR16), kind(1.0_16)
> end
>
> and checking if the output matches your expectations.
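
A related check, as a minimal sketch (standard Fortran 90, nothing
compiler specific): ask for the quad-precision kind portably with
SELECTED_REAL_KIND instead of hard-coding kind=16, and compare:

! qp is whatever kind gives at least 30 decimal digits (quad, if available)
integer, parameter :: qp = selected_real_kind(p=30)
real(qp) x
write(6,*) qp, kind(x), kind(1.0_16)
end

If qp and kind(1.0_16) differ, then the literal kind=16 is not selecting
the quad-precision kind on that compiler.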

> Jeff Squyres wrote:
>
> [... earlier quoted messages trimmed ...]



Re: [OMPI users] MPI_SUM and MPI_REAL16 with MPI_ALLREDUCE in fortran90

2008-10-28 Thread Julien Devriendt


Sorry, forgot to mention that running your sample program with ifort 
produces the expected result:


8 16 16 16



[... earlier quoted messages trimmed ...]



Re: [OMPI users] MPI_SUM and MPI_REAL16 with MPI_ALLREDUCE in fortran90

2008-10-28 Thread Terry Frankcombe
I assume you've confirmed that point to point communication works
happily with quad prec on your machine?  How about one-way reductions?


On Tue, 2008-10-28 at 08:47 +, Julien Devriendt wrote:
> Thanks for your suggestions.
> I tried them all (declaring my variables as REAL*16 or REAL(16)) to no 
> avail. I still get the wrong answer with my call to MPI_ALLREDUCE.
>
> > [... earlier quoted messages trimmed ...]



Re: [OMPI users] MPI_SUM and MPI_REAL16 with MPI_ALLREDUCE in fortran90

2008-10-28 Thread Julien Devriendt


Yes, point to point communication is OK with quad prec, and one-way
reductions as well. I also tried my sample code on another platform
(which sports AMD Opterons instead of Intel CPUs) with the same compilers,
and get the same *wrong* results with the call to MPI_ALLREDUCE in quad
prec, so it does not seem to be a machine bug. Also, altering my sample
code a bit to replace MPI_SUM by MPI_MAX in the call to MPI_ALLREDUCE
works perfectly well in quad prec!!!
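
One possible workaround, as a minimal untested sketch (it assumes the
problem is confined to the predefined MPI_SUM implementation; quad_sum
is a made-up name): register a user-defined sum with MPI_OP_CREATE and
use it in place of MPI_SUM:

subroutine quad_sum(invec, inoutvec, len, datatype)
   ! user-defined reduction function: elementwise quad-precision sum
   real(kind=16) :: invec(*), inoutvec(*)
   integer       :: len, datatype, i
   do i = 1, len
      inoutvec(i) = inoutvec(i) + invec(i)
   end do
end subroutine quad_sum

and in the main program:

   external quad_sum
   integer :: quad_op
   call MPI_OP_CREATE(quad_sum, .true., quad_op, mpierror)
   call MPI_ALLREDUCE(vartored, vartemp, nn, MPI_REAL16, quad_op, &
                      MPI_COMM_WORLD, mpierror)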



On Tue, 28 Oct 2008, Terry Frankcombe wrote:


> I assume you've confirmed that point to point communication works
> happily with quad prec on your machine?  How about one-way reductions?


> [... earlier quoted messages trimmed ...]



Re: [OMPI users] Fwd: Problems installing in Cygwin

2008-10-28 Thread George Bosilca
It is complaining about a missing file. This is a file from the Open
MPI distribution; I wonder how it can be missing. Can you verify that
the file opal/mca/timer/windows/timer_windows_component.h is there?


  Thanks,
george.



On Oct 27, 2008, at 4:52 PM, Jeff Squyres wrote:

Sorry for the lack of reply; several of us were at the MPI Forum  
meeting last week, and although I can't speak for everyone else, I  
know that I always fall [way] behind on e-mail when I travel.  :-\


The windows port is very much a work-in-progress.  I'm not surprised  
that it doesn't work.  :-\


The good folks at U. Stuttgart/HLRS are actively working on a real  
Windows port, but it's off in a side-branch right now.  I don't know  
the exact status of this port -- George / Rainer / Shiqing, can you  
comment?



On Oct 22, 2008, at 9:54 AM, Gustavo Seabra wrote:


Hi All,

(Sorry if you already got this message before, but since I didn't get
any answer, I'm assuming it didn't get through to the list.)

I am trying to install OpenMPI in Cygwin. from a cygwin bash shell, I
configured OpenMPI with the command below:

$ echo $MPI_HOME
/home/seabra/local/openmpi-1.2.7
$ ./configure --prefix=$MPI_HOME \
  --with-mpi-param_check=always \
  --with-threads=posix \
  --enable-mpi-threads \
  --disable-io-romio \
  FC="g95" FFLAGS="-O0  -fno-second-underscore" \
  CXX="g++"

The configuration *seems* to be OK (it finishes with: "configure: exit
0"). However, when I try to install it, the installation finishes with
the error below. I wonder if anyone here could help me figure out what
is going wrong.

Thanks a lot!
Gustavo.

==
$ make clean
[...]
$ make install
[...]
Making install in mca/timer/windows
make[2]: Entering directory
`/home/seabra/local/openmpi-1.2.7/opal/mca/timer/windows'
depbase=`echo timer_windows_component.lo | sed 's|[^/]*$|.deps/&|;s| 
\.lo$||'`;\

  /bin/sh ../../../../libtool --tag=CC   --mode=compile gcc
-DHAVE_CONFIG_H -I. -I../../../../opal/include
-I../../../../orte/include -I../../../../ompi/include   -I../../../..
-D_REENTRANT  -O3 -DNDEBUG -finline-functions -fno-strict-aliasing
-MT timer_windows_component.lo -MD -MP -MF $depbase.Tpo -c -o
timer_windows_component.lo timer_windows_component.c &&\
  mv -f $depbase.Tpo $depbase.Plo
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I../../../../opal/include
-I../../../../orte/include -I../../../../ompi/include -I../../../..
-D_REENTRANT -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -MT
timer_windows_component.lo -MD -MP -MF
.deps/timer_windows_component.Tpo -c timer_windows_component.c
-DDLL_EXPORT -DPIC -o .libs/timer_windows_component.o
timer_windows_component.c:22:60:
opal/mca/timer/windows/timer_windows_component.h: No such file or
directory
timer_windows_component.c:25: error: parse error before
"opal_timer_windows_freq"
timer_windows_component.c:25: warning: data definition has no type or
storage class
timer_windows_component.c:26: error: parse error before
"opal_timer_windows_start"
timer_windows_component.c:26: warning: data definition has no type or
storage class
timer_windows_component.c: In function `opal_timer_windows_open':
timer_windows_component.c:60: error: `LARGE_INTEGER' undeclared  
(first

use in this function)
timer_windows_component.c:60: error: (Each undeclared identifier is
reported only once
timer_windows_component.c:60: error: for each function it appears  
in.)

timer_windows_component.c:60: error: parse error before "now"
timer_windows_component.c:62: error: `now' undeclared (first use in
this function)
make[2]: *** [timer_windows_component.lo] Error 1
make[2]: Leaving directory
`/home/seabra/local/openmpi-1.2.7/opal/mca/timer/windows'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory `/home/seabra/local/openmpi-1.2.7/opal'
make: *** [install-recursive] Error 1



--
Jeff Squyres
Cisco Systems







[OMPI users] C++ Exceptions

2008-10-28 Thread Gabriele Fatigati
Dear OpenMPi developers,

I'm developing a parallel C++ application under OpenMPI 1.2.5. At the
moment, I'm using MPI exception handlers, but some processes return
the error below:

"MPI 2 C++ exception throwing is disabled, MPI::mpi_errno has the error code"

Why is this, and why only on some nodes?

Thanks in advance,

-- 
Ing. Gabriele Fatigati

CINECA Systems & Tecnologies Department

Supercomputing  Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatig...@cineca.it


Re: [OMPI users] Fwd: Problems installing in Cygwin

2008-10-28 Thread Gustavo Seabra
On Tue, Oct 28, 2008 at 9:06 AM, George Bosilca wrote:
> It is complaining about a missing file. This is a file from the Open MPI
> distribution, I wonder how it can be missing. Can you verify that the file
> opal/mca/timer/windows/timer_windows_component.h is there ?

No, it's not. But I see an
opal/mca/timer/windows/timer_windows_component.c and a timer_windows.h
there. Should it have been generated at some point during the compilation?
The contents of that directory are only:

$ ls $HOME/local/openmpi-1.2.7/opal/mca/timer/windows/
Makefile  Makefile.am  Makefile.in  configure.m4  timer_windows.h
timer_windows_component.c

I don't know how relevant it is, but I downloaded the
"openmpi-1.2.7.tar.bz2" file (MD5SUM
b5ae3059fba71eba4a89a2923da8223f). Also, I'm trying to make a local
install, in $HOME/local/openmpi-1.2.7. Finally, as I may have
mentioned before, this installation is in Cygwin but, as far as I
understand, it shouldn't matter much since Cygwin these days can
reproduce a POSIX environment very well. Also, I don't quite
understand why it is looking for what seems (to me) to be a
Windows-based function, since it is in a POSIX environment.

Thank you very much. Please let me know if there is any other
information I can provide to help track this.

All the best,

-- 
Gustavo Seabra
Postdoctoral Associate
Quantum Theory Project - University of Florida
Gainesville - Florida - USA


Re: [OMPI users] C++ Exceptions

2008-10-28 Thread Jeff Squyres
Your question is quite timely -- we had a long discussion about C++  
exceptions just last week at the MPI Forum...  :-)


OMPI disables MPI throwing exceptions by default because it can cause  
a [slight] performance penalty in some compilers.  You can enable it  
by adding --enable-cxx-exceptions to the OMPI configure command line.   
The issue is that C++ exceptions have to pass through C (and possibly  
Fortran) code, so the compiler has to add some extra instrumentation  
in each function call to allow the exceptions to pass through (my
understanding only, which may not be entirely correct).  Here's what  
happens:


  application (in C, C++, or Fortran)
-> calls MPI_Foo()
-> an error occurs, OMPI calls its error handling routines
-> if MPI::ERRORS_THROW_EXCEPTIONS was set, this triggers a  
function pointer call into libmpi_cxx.*
-> the underlying C++ function then invokes "throw ..." to throw  
the MPI exception

-> the exception leaves the C++ code and goes into OMPI's C code
-> the exception has to travel through the C code back up to the  
application
    -> the exception keeps going upward until it is either caught
or the application aborts


Hence, you have to tell C and Fortran compilers to enable this "pass  
exceptions through" behavior.  With the GNU compilers, you have to  
specify -fexceptions when you compile C / Fortran codes.  There's a  
bug in the OMPI v1.2 series that we just discovered last week while  
doing 1.3 release testing (this is actually what triggered the long  
discussion and code fixes about C++ exceptions last week) such that  
you need to manually specify the exceptions flags for your compiler.   
Something like this:


  ./configure --enable-cxx-exceptions \
  CFLAGS=-fexceptions CXXFLAGS=-fexceptions FFLAGS=-fexceptions  
FCFLAGS=-fexceptions \

  --with-wrapper-cflags=-fexceptions \
  --with-wrapper-cxxflags=-fexceptions \
  --with-wrapper-fflags=-fexceptions \
  --with-wrapper-fcflags=-fexceptions \
  ...your other configure arguments...

In the v1.3 series, this is fixed such that you only need to specify:

  ./configure --enable-cxx-exceptions ...

...although in checking all the technical data for this e-mail, I  
found a mistake in our commits from last week on the SVN trunk; I just  
committed a fix in r19819 (sorry for the configure-changing commit in  
the middle of the day, folks!).  The v1.3 branch will be updated to  
get this fix shortly.


It is unlikely that we'll port this fix back to the 1.2 series, so  
you'll need to enable all the extra flags if you want exception support.
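
Once enabled, the usage pattern on the application side looks something
like this -- a minimal sketch using the standard MPI-2 C++ bindings (the
deliberately invalid destination rank is just a made-up way to trigger
an error):

#include <mpi.h>
#include <iostream>

int main(int argc, char* argv[])
{
    MPI::Init(argc, argv);
    // ask the C++ bindings to throw MPI::Exception instead of aborting
    MPI::COMM_WORLD.Set_errhandler(MPI::ERRORS_THROW_EXCEPTIONS);
    try {
        int x = 42;
        // invalid destination rank (== size) forces an error
        MPI::COMM_WORLD.Send(&x, 1, MPI::INT, MPI::COMM_WORLD.Get_size(), 0);
    } catch (MPI::Exception& e) {
        std::cerr << "caught: " << e.Get_error_string() << std::endl;
    }
    MPI::Finalize();
    return 0;
}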


Hopefully that all made sense... :-)



On Oct 28, 2008, at 9:26 AM, Gabriele Fatigati wrote:


[... full quote of Gabriele's message trimmed ...]



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] C++ Exceptions

2008-10-28 Thread Gabriele Fatigati
Very clear reply,
thanks Jeff :)

2008/10/28 Jeff Squyres :
> [... full quote of Jeff's message trimmed ...]



-- 
Ing. Gabriele Fatigati

CINECA Systems & Tecnologies Department

Supercomputing  Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatig...@cineca.it


Re: [OMPI users] C++ Exceptions

2008-10-28 Thread Gabriele Fatigati
Jeff,
another question: how can I check whether MPI exceptions are enabled?

2008/10/28 Jeff Squyres :
> [... full quote of Jeff's message trimmed ...]



-- 
Ing. Gabriele Fatigati

CINECA Systems & Tecnologies Department

Supercomputing  Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatig...@cineca.it


Re: [OMPI users] C++ Exceptions

2008-10-28 Thread Jeff Squyres

On Oct 28, 2008, at 11:19 AM, Gabriele Fatigati wrote:


another question: how can i check if MPI Exceptions are enabled?


ompi_info | grep exceptions

Should tell ya.

--
Jeff Squyres
Cisco Systems



Re: [OMPI users] MPI_SUM and MPI_REAL16 with MPI_ALLREDUCE in fortran90

2008-10-28 Thread Jeff Squyres
Something odd is definitely going on here.  I'm able to replicate your  
problem with the intel compiler suite, but I can't quite figure out  
why -- it all works properly if I convert the app to C (and still use  
the MPI_REAL16 datatype with long double data).


George and I are investigating; I've opened a ticket on this: 
https://svn.open-mpi.org/trac/ompi/ticket/1603
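
For reference, here's a minimal sketch of the kind of C test described
above (my reconstruction, not the exact program; it assumes long double
matches REAL*16 on the platform and that Open MPI was built with Fortran
support, since MPI_REAL16 is a Fortran datatype):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    long double in[8], out[8];
    int i, rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (i = 0; i < 8; i++) { in[i] = 1.0L; out[i] = 0.0L; }
    /* same reduction as the Fortran test_quad program */
    MPI_Allreduce(in, out, 8, MPI_REAL16, MPI_SUM, MPI_COMM_WORLD);
    printf("rank %d: out[0] = %Lf\n", rank, out[0]);
    MPI_Finalize();
    return 0;
}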


On Oct 28, 2008, at 6:35 AM, Julien Devriendt wrote:



[... earlier quoted messages trimmed ...]



--
Jeff Squyres
Cisco Systems