Re: [OMPI users] MPI_COMPLEX16

2012-04-26 Thread David Singleton


I should have checked earlier - the same error occurs for MPI_COMPLEX and MPI_COMPLEX8.

David

On 04/27/2012 08:43 AM, David Singleton wrote:


Apologies if this has already been covered somewhere. One of our users
has noticed that MPI_COMPLEX16 is flagged as an invalid datatype in 1.5.4
but not in 1.4.3, while MPI_DOUBLE_COMPLEX is accepted by both. This happens
with either gfortran or intel-fc. Superficially, the configure results look
the same for 1.4.3 and 1.5.4, e.g.:
% grep COMPLEX16 opal/include/opal_config.h
#define OMPI_HAVE_F90_COMPLEX16 1
#define OMPI_HAVE_FORTRAN_COMPLEX16 1

Their test code (appended below) produces:

% module load openmpi/1.4.3
% mpif90 mpi_complex_test.f90
% mpirun -np 2 ./a.out
SUM1 (3.00,-1.00)
SUM2 (3.00,-1.00)
% module swap openmpi/1.5.4
% mpif90 mpi_complex_test.f90
% mpirun -np 2 ./a.out
[vayu1:1935] *** An error occurred in MPI_Reduce
[vayu1:1935] *** on communicator MPI_COMM_WORLD
[vayu1:1935] *** MPI_ERR_TYPE: invalid datatype
[vayu1:1935] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
SUM1 (3.00,-1.00)

Thanks for any help,
David


program mpi_test

   implicit none
   include 'mpif.h'
   integer, parameter :: r8 = selected_real_kind(12)
   complex(kind=r8) :: local, global
   integer :: ierr, myid, nproc

   call MPI_INIT (ierr)
   call MPI_COMM_RANK (MPI_COMM_WORLD, myid, ierr)
   call MPI_COMM_SIZE (MPI_COMM_WORLD, nproc, ierr)

   local = cmplx(myid+1.0, myid-1.0, kind=r8)
   call MPI_REDUCE (local, global, 1, MPI_DOUBLE_COMPLEX, MPI_SUM, 0, &
                    MPI_COMM_WORLD, ierr)
   if ( myid == 0 ) then
      print*, 'SUM1', global
   end if

   call MPI_REDUCE (local, global, 1, MPI_COMPLEX16, MPI_SUM, 0, &
                    MPI_COMM_WORLD, ierr)
   if ( myid == 0 ) then
      print*, 'SUM2', global
   end if

   call MPI_FINALIZE (ierr)

end program mpi_test



[OMPI users] MPI_COMPLEX16

2012-04-26 Thread David Singleton


Apologies if this has already been covered somewhere. One of our users
has noticed that MPI_COMPLEX16 is flagged as an invalid datatype in 1.5.4
but not in 1.4.3, while MPI_DOUBLE_COMPLEX is accepted by both. This happens
with either gfortran or intel-fc. Superficially, the configure results look
the same for 1.4.3 and 1.5.4, e.g.:
% grep COMPLEX16  opal/include/opal_config.h
#define OMPI_HAVE_F90_COMPLEX16 1
#define OMPI_HAVE_FORTRAN_COMPLEX16 1

Their test code (appended below) produces:

% module load openmpi/1.4.3
% mpif90 mpi_complex_test.f90
% mpirun -np 2 ./a.out
 SUM1 (3.00,-1.00)
 SUM2 (3.00,-1.00)
% module swap openmpi/1.5.4
% mpif90 mpi_complex_test.f90
% mpirun -np 2 ./a.out
[vayu1:1935] *** An error occurred in MPI_Reduce
[vayu1:1935] *** on communicator MPI_COMM_WORLD
[vayu1:1935] *** MPI_ERR_TYPE: invalid datatype
[vayu1:1935] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
 SUM1 (3.00,-1.00)

Thanks for any help,
David


program mpi_test

   implicit none
   include 'mpif.h'
   integer, parameter :: r8 = selected_real_kind(12)
   complex(kind=r8) :: local, global
   integer :: ierr, myid, nproc

   call MPI_INIT (ierr)
   call MPI_COMM_RANK (MPI_COMM_WORLD, myid, ierr)
   call MPI_COMM_SIZE (MPI_COMM_WORLD, nproc, ierr)

   local = cmplx(myid+1.0, myid-1.0, kind=r8)
   call MPI_REDUCE (local, global, 1, MPI_DOUBLE_COMPLEX, MPI_SUM, 0, &
                    MPI_COMM_WORLD, ierr)
   if ( myid == 0 ) then
      print*, 'SUM1', global
   end if

   call MPI_REDUCE (local, global, 1, MPI_COMPLEX16, MPI_SUM, 0, &
                    MPI_COMM_WORLD, ierr)
   if ( myid == 0 ) then
      print*, 'SUM2', global
   end if

   call MPI_FINALIZE (ierr)

end program mpi_test
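
Until this is sorted out, one practical workaround is to keep the reduction
datatype in a variable and default to MPI_DOUBLE_COMPLEX, which both 1.4.3 and
1.5.4 accept for complex(kind=r8). A minimal sketch along those lines follows
(the program name is illustrative; it also assumes that an unsupported optional
datatype would be set to MPI_DATATYPE_NULL, which does not catch the 1.5.4
behaviour above, where the handle is defined but still rejected by MPI_REDUCE,
so on those builds simply forcing MPI_DOUBLE_COMPLEX is the safe choice):

program mpi_complex_fallback

   implicit none
   include 'mpif.h'
   integer, parameter :: r8 = selected_real_kind(12)
   complex(kind=r8) :: local, global
   integer :: ierr, myid, ctype

   call MPI_INIT (ierr)
   call MPI_COMM_RANK (MPI_COMM_WORLD, myid, ierr)

   ! Prefer the optional MPI_COMPLEX16 when it is provided as a usable
   ! handle; otherwise fall back to MPI_DOUBLE_COMPLEX, which matches
   ! complex(kind=r8) here.  (Assumption: an unsupported optional datatype
   ! is MPI_DATATYPE_NULL.  On the affected 1.5.4 builds the handle is
   ! defined yet rejected by MPI_REDUCE, so setting
   ! ctype = MPI_DOUBLE_COMPLEX unconditionally is the safe choice.)
   ctype = MPI_COMPLEX16
   if (ctype == MPI_DATATYPE_NULL) ctype = MPI_DOUBLE_COMPLEX

   local = cmplx(myid+1.0, myid-1.0, kind=r8)
   call MPI_REDUCE (local, global, 1, ctype, MPI_SUM, 0, &
                    MPI_COMM_WORLD, ierr)
   if ( myid == 0 ) print*, 'SUM', global

   call MPI_FINALIZE (ierr)

end program mpi_complex_fallback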




Re: [OMPI users] regarding the problem that occurred while running an MPI program

2012-04-26 Thread Prentice Bisbal
Actually, he should leave the ":$LD_LIBRARY_PATH" on the end. That way,
if LD_LIBRARY_PATH is already defined, the Open MPI directory is simply
prepended to the existing value. Omitting ":$LD_LIBRARY_PATH" from the
command replaces the variable entirely, so other needed elements of
LD_LIBRARY_PATH would be lost, leading to other runtime errors.
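
For example, using the install prefix that appears further down in this thread
(a sketch only; adjust the path to the actual installation):

  # prepend the Open MPI lib directory while keeping whatever is already set
  export LD_LIBRARY_PATH=/usr/local/openmpi-1.4.5/lib:$LD_LIBRARY_PATH
  # quick check that nothing was dropped
  echo $LD_LIBRARY_PATH

If the error persists on remote nodes, the same setting usually needs to go
into the shell startup file that non-interactive logins read; that is what the
FAQ item cited below covers.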

--
Prentice



On 04/25/2012 11:48 AM, tyler.bal...@huskers.unl.edu wrote:
> export LD_LIBRARY_PATH= [location of library] leave out
> the :$LD_LIBRARY_PATH 
> 
> *From:* users-boun...@open-mpi.org [users-boun...@open-mpi.org] on
> behalf of seshendra seshu [seshu...@gmail.com]
> *Sent:* Wednesday, April 25, 2012 10:43 AM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] regarding the problem that occurred while
> running an MPI program
>
> Hi
> I have exported the library path as shown below
>
> [master@ip-10-80-106-70 ~]$ export
> LD_LIBRARY_PATH=/usr/local/openmpi-1.4.5/lib:$LD_LIBRARY_PATH 
>   
> [master@ip-10-80-106-70 ~]$ mpirun --prefix /usr/local/openmpi-1.4.5
> -n 1 --hostfile hostfile out
> out: error while loading shared libraries: libmpi_cxx.so.0: cannot
> open shared object file: No such file or directory
> [master@ip-10-80-106-70 ~]$ mpirun --prefix /usr/local/lib/ -n 1
> --hostfile hostfile
> out   
> 
> out: error while loading shared libraries: libmpi_cxx.so.0: cannot
> open shared object file: No such file or directory
>
> But I am still getting the same error.
>
>
>
>
>
> On Wed, Apr 25, 2012 at 5:36 PM, Jeff Squyres (jsquyres)
> <jsquy...@cisco.com> wrote:
>
> See the FAQ item I cited. 
>
> Sent from my phone. No type good. 
>
> On Apr 25, 2012, at 11:24 AM, "seshendra seshu"
>> <seshu...@gmail.com> wrote:
>
>> Hi
>> Now I have created a user and tried to run the program, but I got
>> the following error:
>>
>> [master@ip-10-80-106-70 ~]$ mpirun -n 1 --hostfile hostfile
>> out  
>>   
>> out: error while loading shared libraries: libmpi_cxx.so.0:
>> cannot open shared object file: No such file or directory
>>
>>
>> thanking you
>>
>>
>>
>> On Wed, Apr 25, 2012 at 5:12 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>> On Apr 25, 2012, at 11:06 AM, seshendra seshu wrote:
>>
>> > So do I need to create a user and run the MPI program,
>> > or how else can I run it on the cluster?
>>
>> It is a "best practice" to not run real applications as root
>> (e.g., MPI applications).  Create a non-privileged user to
>> run your applications.
>>
>> Then be sure to set your LD_LIBRARY_PATH if you installed
>> Open MPI into a non-system-default location.  See this FAQ item:
>>
>>  
>>  http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com 
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>>
>>
>>
>>
>> -- 
>>  WITH REGARDS
>> M.L.N.Seshendra
>
>
>
>
>
> -- 
>  WITH REGARDS
> M.L.N.Seshendra
>
>


Re: [OMPI users] MPI doesn't recognize multiple cores available on multicore machines

2012-04-26 Thread TERRY DONTJE



On 4/25/2012 1:00 PM, Jeff Squyres wrote:

On Apr 25, 2012, at 12:51 PM, Ralph Castain wrote:


Sounds rather bizarre. Do you have lstopo on your machine? Might be useful to 
see the output of that so we can understand what it thinks the topology is like 
as this underpins the binding code.

The -nooversubscribe option is a red herring here - it has nothing to do with 
the problem, nor will it help.

FWIW: if you aren't adding --bind-to-core, then OMPI isn't launching your process on any 
specific core at all - we are simply launching it on the node. It sounds to me like your 
code is incorrectly identifying "sharing" when a process isn't bound to a 
specific core.

+1

Put differently: if you're not binding your processes to processor cores, then 
it's quite likely/possible that multiple processes *are* running on the same 
processor cores, at least intermittently, because the OS is allowed to migrate 
processes to whatever processor cores it wants to.
However, Kyle mentioned previously that he was using the -bind-to-core 
option.  I would suggest adding -report-bindings to the mpirun command 
line to see what mpirun really thinks it is binding to, if it is binding at all.
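
For example (a sketch only; the process count and executable name are
placeholders):

% mpirun -np 4 -bind-to-core -report-bindings ./a.out

mpirun then reports the binding it applied to each launched process, which
can be compared with what the application itself believes.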


There is one piece of information that seems to be missing and is confusing me.
Kyle, how is your code determining that it is the only process bound to a core,
or conversely that another process is bound to the same core?
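
Independent of the application's own logic, a quick external cross-check (a
Linux-specific sketch) is to launch a plain shell under the same options and
print the CPU list each process inherits:

% mpirun -np 2 -bind-to-core sh -c 'grep Cpus_allowed_list /proc/self/status'

Each launched shell (and hence the grep it runs) inherits the binding mpirun
applied, so two identical single-core lines would mean the processes really
are confined to the same core.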


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com