Re: [OMPI users] Missing pointer in MPI_Request / MPI_Ibarrier in documentation for 1.10.0

2015-09-28 Thread Harald Servat

Hello Gilles,

  the web pages I pointed to in the original mail, which are on the 
official open-mpi.org site, miss the * in the declaration of MPI_Ibarrier, 
don't they?


See:

C Syntax

#include <mpi.h>
int MPI_Barrier(MPI_Comm comm)
int MPI_Ibarrier(MPI_Comm comm, MPI_Request request)
   ^-> shouldn't there be a * here?
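
For reference, here is a minimal sketch of how the nonblocking barrier is 
meant to be called (my own example, not taken from the man page); the request 
is an output argument, which is why it has to be passed by address:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Request req;

    MPI_Init(&argc, &argv);

    /* MPI fills in the request object, hence MPI_Request * in the prototype. */
    MPI_Ibarrier(MPI_COMM_WORLD, &req);

    /* ... overlap useful work with the barrier here ... */

    MPI_Wait(&req, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}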

Best,

On 09/28/2015 02:21 AM, Gilles Gouaillardet wrote:

Harald,

could you be more specific ?
btw, do you check the www.open-mpi.org main site or a mirror ?

the man pages look good to me, and the issue you described was fixed
one month ago.

Cheers,

Gilles

On 9/25/2015 8:07 PM, Harald Servat wrote:

Dear all,

  I'd like to point out that the manual page for the C syntax of
MPI_Ibarrier in Open MPI v1.10.0 misses the pointer in the MPI_Request argument.

See:

  https://www.open-mpi.org/doc/v1.10/man3/MPI_Ibarrier.3.php
  https://www.open-mpi.org/doc/v1.10/man3/MPI_Barrier.3.php

Best,

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/09/27677.php



___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/09/27689.php




Re: [OMPI users] Missing pointer in MPI_Request / MPI_Ibarrier in documentation for 1.10.0

2015-09-28 Thread Gilles Gouaillardet

Harald,

thanks for the clarification, I clearly missed that!

I will fix it now.

Cheers,

Gilles

On 9/28/2015 4:49 PM, Harald Servat wrote:

Hello Gilles,

  the web pages I pointed to in the original mail, which are on the 
official open-mpi.org site, miss the * in the declaration of MPI_Ibarrier, 
don't they?


See:

C Syntax

#include <mpi.h>
int MPI_Barrier(MPI_Comm comm)
int MPI_Ibarrier(MPI_Comm comm, MPI_Request request)
   ^-> shouldn't there be a * here?

Best,

On 09/28/2015 02:21 AM, Gilles Gouaillardet wrote:

Harald,

could you be more specific ?
btw, do you check the www.open-mpi.org main site or a mirror ?

the man pages look good to me, and the issue you described was fixed
one month ago.

Cheers,

Gilles

On 9/25/2015 8:07 PM, Harald Servat wrote:

Dear all,

  I'd like to point out that the manual page for the C syntax of
MPI_Ibarrier in Open MPI v1.10.0 misses the pointer in the MPI_Request argument.

See:

  https://www.open-mpi.org/doc/v1.10/man3/MPI_Ibarrier.3.php
  https://www.open-mpi.org/doc/v1.10/man3/MPI_Barrier.3.php

Best,

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/09/27677.php



___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/09/27689.php


___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/09/27692.php






Re: [OMPI users] Invalid read of size 4 (Valgrind error) with OpenMPI 1.8.7

2015-09-28 Thread Schlottke-Lakemper, Michael
Sorry for the long delay.

Unfortunately, I am no longer able to reproduce the Valgrind errors I reported 
earlier with either the debug version or the normally compiled version of OMPI 
1.8.7. I don't know what happened - probably some change to our cluster 
infrastructure that I am not aware of and am not able to track down. 
Sorry for having wasted your collective time on this; if this error should 
arise again, I will try to get a proper Valgrind report with --enable-debug and 
report it here.

Michael

> On 30 Jul 2015, at 22:10 , Nathan Hjelm  wrote:
> 
> 
> I agree with Ralph. Please run again with --enable-debug. That will give
> more information (line number) on where the error is occurring.
> 
> Looking at the function in question the only place I see that could be
> causing this warning is the call to strlen. Some implementations of
> strlen use operate on larger chunks (4 or 8 bytes). This will make
> valgrind unhappy but does not make the implementation invalid as no read
> will cross a page boundary (so no SEGV). One example of such a strlen
> implementation is the one used by icc which uses vector operations on
> 8-byte chunks of the string.
> 
> -Nathan
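
As an illustration of the word-at-a-time strlen Nathan describes, a rough
sketch (an assumption for illustration only, not the actual icc or glibc
code) is shown below; the final 8-byte load may touch bytes past the
terminating NUL, and thus past the end of the allocation, which is exactly
the kind of read Valgrind reports even though it never crosses a page
boundary:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Word-at-a-time strlen sketch: once aligned, it reads 8 bytes per step. */
static size_t strlen_word(const char *s)
{
    const char *p = s;

    /* Advance byte by byte until p is 8-byte aligned. */
    while (((uintptr_t)p & 7) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    for (;;) {
        uint64_t w;
        memcpy(&w, p, sizeof w);        /* aligned 8-byte load */
        /* Classic bit trick: nonzero iff some byte in w is zero. */
        if ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) {
            while (*p != '\0')
                p++;
            return (size_t)(p - s);
        }
        p += 8;
    }
}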
> 
> On Wed, Jul 29, 2015 at 07:58:09AM -0700, Ralph Castain wrote:
>>   If you have the time, it would be helpful. You might also configure
>>   -enable-debug.
>>   Meantime, I can take another gander to see how it could happen - looking
>>   at the code, it sure seems impossible, but maybe there is some strange
>>   path that would break it.
>> 
>> On Jul 29, 2015, at 6:29 AM, Schlottke-Lakemper, Michael wrote:
>> If it is helpful, I can try to compile OpenMPI with debug information
>> and get more details on the reported error. However, it would be good if
>> someone could tell me the necessary compile flags (on top of -O0 -g) and
>> it would take me probably 1-2 weeks to do it.
>> Michael
>> 
>>  Original message 
>> From: Gilles Gouaillardet 
>> Date: 29/07/2015 14:17 (GMT+01:00)
>> To: Open MPI Users 
>> Subject: Re: [OMPI users] Invalid read of size 4 (Valgrind error) with
>> OpenMPI 1.8.7
>> 
>> Thomas,
>> can you please elaborate ?
>> I checked the code of opal_os_dirpath_create and could not find where
>> such a thing can happen
>> Thanks,
>> Gilles
>> On Wednesday, July 29, 2015, Thomas Jahns  wrote:
>> 
>>   Hello,
>> 
>>   On 07/28/15 17:34, Schlottke-Lakemper, Michael wrote:
>> 
>> That's what I suspected. Thank you for your confirmation.
>> 
>>   you are mistaken, the allocation is 51 bytes long, i.e. valid bytes
>>   are at offsets 0 to 50. But since the read of 4 bytes starts at offset
>>   48, the bytes at offsets 48, 49, 50 and 51 get read, the last of which
>>   is illegal. It probably does no harm at the moment in practice,
>>   because virtually all allocators always add some padding to the next
>>   multiple of some power of 2. But still this means the program is
>>   incorrect in terms of any programming language definition involved
>>   (might be C, C++ or Fortran).
>> 
>>   Regards, Thomas
>> 
>>   On 25 Jul 2015, at 16:10, Ralph Castain wrote:
>> 
>>   Looks to me like a false positive - we do malloc some space, and
>>   do access
>>   different parts of it. However, it looks like we are inside the
>>   space at all
>>   times.
>> 
>>   I'd suppress it
>> 
>> On Jul 23, 2015, at 12:47 AM, Schlottke-Lakemper, Michael wrote:
>> 
>> Hi folks,
>> 
>> recently we've been getting a Valgrind error in PMPI_Init for
>> our suite of
>> regression tests:
>> 
>> ==5922== Invalid read of size 4
>> ==5922==at 0x61CC5C0: opal_os_dirpath_create (in
>> /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
>> ==5922==by 0x5F207E5: orte_session_dir (in
>> /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>> ==5922==by 0x5F34F04: orte_ess_base_app_setup (in
>> /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>> ==5922==by 0x7E96679: rte_init (in
>> /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
>> ==5922==by 0x5F12A77: orte_init (in
>> /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>> ==5922==by 0x509883C: ompi_mpi_init (in
>> /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
>> ==5922==by 0x50B843A: PMPI_Init (in
>> /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
>> ==5922==by 0xEBA79C: ZFS::run() (in
>> 
>> /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
>> ==5922==  

[OMPI users] Need some help to track problem using openmpi

2015-09-28 Thread Sven Schumacher
Hello,

I've set up our new cluster with InfiniBand, using a combination of
Debian, Torque/Maui, and BeeGFS (formerly FhGFS).

Every node has two InfiniBand ports, each of them having an IP address.
One port shall be used for BeeGFS (which is working well) and the
other one for MPI communication.

I'm using Open MPI version 1.8.5, compiled with gcc/gfortran 4.9.2 and
ibverbs support.
The configure command was the following:

Output of "ompi_info --parsable -a -c" is attached as a text file (all
nodes are configured the same).


The following infiniband-related kernel-modules are loaded:
> mlx4_core 206165  1 mlx4_ib
> rdma_ucm   22055  0
> ib_uverbs  44693  1 rdma_ucm
> rdma_cm39518  2 ib_iser,rdma_ucm
> iw_cm  31011  1 rdma_cm
> ib_umad17311  0
> mlx4_ib   136293  0
> ib_cm  39055  3 rdma_cm,ib_srp,ib_ipoib
> ib_sa  26986  6
> rdma_cm,ib_cm,mlx4_ib,ib_srp,rdma_ucm,ib_ipoib
> ib_mad 39969  4 ib_cm,ib_sa,mlx4_ib,ib_umad
> ib_core68904  12
> rdma_cm,ib_cm,ib_sa,iw_cm,mlx4_ib,ib_mad,ib_srp,ib_iser,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
> ib_addr17148  3 rdma_cm,ib_core,rdma_ucm
> ib_iser44204  0
> iscsi_tcp  17580  0
> libiscsi_tcp   21554  1 iscsi_tcp
> libiscsi   48004  3 libiscsi_tcp,iscsi_tcp,ib_iser
> scsi_transport_iscsi77478  4 iscsi_tcp,ib_iser,libiscsi
> ib_ipoib   85167  0
> ib_srp 39710  0
> scsi_transport_srp 18194  1 ib_srp
> scsi_tgt   17698  1 scsi_transport_srp

When using mpiexec to execute a job on a single node using 8
cores, everything works fine, but when mpiexec has to start a second
process on another node, it doesn't start that process.
What I already did:

Testing SSH logins: works (without a password, using SSH keys).
Testing name-resolution: works

Used a "hello world" MPI program:
> #include <mpi.h>
> #include <stdio.h>
>
> int main(int argc, char** argv) {
> // Initialize the MPI environment
> MPI_Init(NULL, NULL);
>
> // Get the number of processes
> int world_size;
> MPI_Comm_size(MPI_COMM_WORLD, &world_size);
>
> // Get the rank of the process
> int world_rank;
> MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
>
> // Get the name of the processor
> char processor_name[MPI_MAX_PROCESSOR_NAME];
> int name_len;
> MPI_Get_processor_name(processor_name, &name_len);
>
> // Print off a hello world message
> printf("Hello world from processor %s, rank %d"
>" out of %d processors\n",
>processor_name, world_rank, world_size);
>
> // Finalize the MPI environment.
> MPI_Finalize();
> }


This throws an error: on a single node it produces the following error
messages; when run on two nodes it doesn't produce any output at all:
> [hydra001:20324] 1 more process has sent help message
> help-mpi-btl-openib-cpc-base.txt / no cpcs for port
> [hydra001:20324] Set MCA parameter "orte_base_help_aggregate" to 0 to
> see all help / error messages

> --
> No OpenFabrics connection schemes reported that they were able to be
> used on a specific port.  As such, the openib BTL (OpenFabrics
> support) will be disabled for this port.
>
>   Local host:   hydra001
>   Local device: mlx4_0
>   Local port:   1
>   CPCs attempted:   udcm
> --
> Hello world from processor hydra001, rank 0 out of 1 processors

So, where can I find a documented list of all these MCA parameters? There
doesn't seem to be such a list on open-mpi.org, or I didn't find it...
thanks in advance for pointing me to the right place.

Sven Schumacher






-- 
Sven Schumacher - Systemadministrator Tel: (0511)762-2753
Leibniz Universitaet Hannover
Institut für Turbomaschinen und Fluid-Dynamik   - TFD
Appelstraße 9 - 30167 Hannover
Institut für Kraftwerkstechnik und Wärmeübertragung - IKW
Callinstraße 36 - 30167 Hannover

package:Open MPI root@ikarus Distribution
ompi:version:full:1.8.5
ompi:version:repo:v1.8.4-333-g039fb11
ompi:version:release_date:May 05, 2015
orte:version:full:1.8.5
orte:version:repo:v1.8.4-333-g039fb11
orte:version:release_date:May 05, 2015
opal:version:full:1.8.5
opal:version:repo:v1.8.4-333-g039fb11
opal:version:release_date:May 05, 2015
mpi-api:version:full:3.0
ident:1.8.5
path:prefix:/sw/mpi/openmpi/1.8.5-gnu_4.9.2_ohneIB
path:exec_prefix:/sw/mpi/openmpi/1.8.5-gnu_4.9.2_ohneIB
path:bindir:/sw/mpi/openmpi/1.8.5-gnu_4.9.2_ohneIB/bin
path:sbindir:/sw/mpi/openmpi/1.8.5-gnu_4.9.2_ohneIB/sbin
path:libdir:/sw/mpi/openmpi/1.8.5-gnu_4.9.2_ohneIB/lib
path:incdir:/sw/mpi/openmpi/1.8.5-gnu_4.9.2_ohneIB/include
path:mandir:/sw/mpi/openmpi/1.8.5-gnu_4.9.2_ohneIB/share/man
path:pkglibdir:/sw/mpi/openmpi/1.8.5-gnu_4.9.2_ohneIB/lib/openmpi

Re: [OMPI users] possible GATS bug in osc/sm

2015-09-28 Thread Steffen Christgau
Hi Nathan,

On 23.09.2015 00:24, Nathan Hjelm wrote:
> I think I have the problem fixed. I went with a bitmap approach but I
> don't think that will scale well as node sizes increase since it
> requires n^2 bits to implement the post table. When I have time I will
> implement the approach used in osc/rdma in osc/sm.

Thanks for the fix. Looks good, but I've not tested it yet. I'm going to
do that as soon as possible.

Your scaling concerns are valid. However, n^2 _bits_ amounts to 8 kB for a 256
core shared memory system (256^2 bits = 65,536 bits = 8 kB), which I find still acceptable.
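
To make the n^2 figure concrete, a hypothetical bitmap post table (for
illustration only, not the actual osc/sm code) could look like the sketch
below: bit (i, j) records that rank j has posted to rank i, so n ranks need
n*n bits in total.

#include <limits.h>
#include <stdlib.h>

/* Hypothetical n x n bitmap "post table"; for n = 256 this is
 * 256 * 256 = 65,536 bits = 8 kB of storage. */
typedef struct {
    int n;
    unsigned char *bits;            /* ceil(n*n / CHAR_BIT) bytes */
} post_table_t;

static post_table_t *post_table_create(int n)
{
    size_t nbytes = ((size_t)n * n + CHAR_BIT - 1) / CHAR_BIT;
    post_table_t *t = malloc(sizeof *t);

    if (t == NULL)
        return NULL;
    t->n = n;
    t->bits = calloc(nbytes, 1);
    if (t->bits == NULL) {
        free(t);
        return NULL;
    }
    return t;
}

/* Mark that rank j has posted to rank i. */
static void post_table_set(post_table_t *t, int i, int j)
{
    size_t idx = (size_t)i * t->n + j;
    t->bits[idx / CHAR_BIT] |= (unsigned char)(1u << (idx % CHAR_BIT));
}

/* Has rank j posted to rank i? */
static int post_table_test(const post_table_t *t, int i, int j)
{
    size_t idx = (size_t)i * t->n + j;
    return (t->bits[idx / CHAR_BIT] >> (idx % CHAR_BIT)) & 1;
}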

Regards, Steffen


[OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-28 Thread Grigory Shamov
Hi All,


We have built OpenMPI (1.8.8, 1.10.0) against Mellanox OFED 2.4 and
corresponding MXM. When it runs now, it gives the following warning, per
process:

[1443457390.911053] [myhist:5891 :0] mxm.c:185  MXM  WARN  The
'ulimit -s' on the system is set to 'unlimited'. This may have negative
performance implications. Please set the heap size to the default value
(10240)

We have the ulimits for heap (as well as most of the other limits) set to
unlimited because of applications that might possibly need a lot of RAM.

The question is whether we should do as MXM wants, or ignore it. Does anyone
have experience running recent OpenMPI with MXM enabled, and what kind of
ulimits do you use? Any suggestions/comments appreciated, thanks!


-- 
Grigory Shamov

Westgrid/ComputeCanada Site Lead
University of Manitoba
E2-588 EITC Building,
(204) 474-9625





Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-28 Thread Mike Dubman
Hello Grigory,

We observed ~10% performance degradation with heap size set to unlimited
for CFD applications.

You can measure your application performance with default and unlimited
"limits" and select the best setting.

Kind Regards.
M

On Mon, Sep 28, 2015 at 7:36 PM, Grigory Shamov  wrote:

> Hi All,
>
>
> We have built OpenMPI (1.8.8., 1.10.0) against Mellanox OFED 2.4 and
> corresponding MXM. When it runs now, it gives the following warning, per
> process:
>
> [1443457390.911053] [myhist:5891 :0] mxm.c:185  MXM  WARN  The
> 'ulimit -s' on the system is set to 'unlimited'. This may have negative
> performance implications. Please set the heap size to the default value
> (10240)
>
> We have ulimits for heap (as well as most of the other limits) set
> unlimited because of applications that might possibly need a lot of RAM.
>
> The question is if we should do as MXM wants, or ignore it? Has anyone an
> experience running recent OpenMPI with MXM enabled, and what kind of
> ulimits do you have? Any suggestions/comments appreciated, thanks!
>
>
> --
> Grigory Shamov
>
> Westgrid/ComputeCanada Site Lead
> University of Manitoba
> E2-588 EITC Building,
> (204) 474-9625
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/09/27697.php
>



-- 

Kind Regards,

M.


Re: [OMPI users] Need some help to track problem using openmpi

2015-09-28 Thread Sven Schumacher
Sorry, I attached the wrong output of ompi_info... this one is the right
one.
I also forgot to add the configure-line:

> configure --prefix=/sw/mpi/openmpi/1.8.5-gnu_sschu/
> --enable-orterun-prefix-by-default --enable-mpi-thread-multiple
> --with-verbs --with-tm=/sw/tools/torque/5.1.0/
> CC=/sw/tools/gnu/gcc/4.9.2/bin/gcc-4.9.2
> FC=/sw/tools/gnu/gcc/4.9.2/bin/gfortran-4.9.2

package:Open MPI root@ikarus Distribution
ompi:version:full:1.8.5
ompi:version:repo:v1.8.4-333-g039fb11
ompi:version:release_date:May 05, 2015
orte:version:full:1.8.5
orte:version:repo:v1.8.4-333-g039fb11
orte:version:release_date:May 05, 2015
opal:version:full:1.8.5
opal:version:repo:v1.8.4-333-g039fb11
opal:version:release_date:May 05, 2015
mpi-api:version:full:3.0
ident:1.8.5
path:prefix:/sw/mpi/openmpi/1.8.5-gnu_sschu
path:exec_prefix:/sw/mpi/openmpi/1.8.5-gnu_sschu
path:bindir:/sw/mpi/openmpi/1.8.5-gnu_sschu/bin
path:sbindir:/sw/mpi/openmpi/1.8.5-gnu_sschu/sbin
path:libdir:/sw/mpi/openmpi/1.8.5-gnu_sschu/lib
path:incdir:/sw/mpi/openmpi/1.8.5-gnu_sschu/include
path:mandir:/sw/mpi/openmpi/1.8.5-gnu_sschu/share/man
path:pkglibdir:/sw/mpi/openmpi/1.8.5-gnu_sschu/lib/openmpi
path:libexecdir:/sw/mpi/openmpi/1.8.5-gnu_sschu/libexec
path:datarootdir:/sw/mpi/openmpi/1.8.5-gnu_sschu/share
path:datadir:/sw/mpi/openmpi/1.8.5-gnu_sschu/share
path:sysconfdir:/sw/mpi/openmpi/1.8.5-gnu_sschu/etc
path:sharedstatedir:/sw/mpi/openmpi/1.8.5-gnu_sschu/com
path:localstatedir:/sw/mpi/openmpi/1.8.5-gnu_sschu/var
path:infodir:/sw/mpi/openmpi/1.8.5-gnu_sschu/share/info
path:pkgdatadir:/sw/mpi/openmpi/1.8.5-gnu_sschu/share/openmpi
path:pkglibdir:/sw/mpi/openmpi/1.8.5-gnu_sschu/lib/openmpi
path:pkgincludedir:/sw/mpi/openmpi/1.8.5-gnu_sschu/include/openmpi
config:arch:x86_64-unknown-linux-gnu
config:host:ikarus
config:user:root
config:timestamp:Fri Sep 25 11:06:11 CEST 2015
config:host:ikarus
build:user:root
build:timestamp:Fr 25. Sep 11:26:33 CEST 2015
build:host:ikarus
bindings:c:yes
bindings:cxx:yes
bindings:mpif.h:yes (all)
bindings:use_mpi:yes (full: ignore TKR)
bindings:use_mpi:size:deprecated-ompi-info-value
bindings:use_mpi_f08:yes
bindings:use_mpi_f08:compliance:The mpi_f08 module is available, but due to 
limitations in the /sw/tools/gnu/gcc/4.9.2/bin/gfortran-4.9.2 compiler, does 
not support the following: array subsections, direct passthru (where possible) 
to underlying Open MPI's C functionality
bindings:use_mpi_f08:subarrays-supported:no
bindings:java:no
compiler:all:rpath:runpath
compiler:c:command:/sw/tools/gnu/gcc/4.9.2/bin/gcc-4.9.2
compiler:c:absolute://sw/tools/gnu/gcc/4.9.2/bin/gcc-4.9.2
compiler:c:familyname:GNU
compiler:c:version:4.9.2
compiler:c:sizeof:char:1
compiler:c:sizeof:bool:1
compiler:c:sizeof:short:2
compiler:c:sizeof:int:4
compiler:c:sizeof:long:8
compiler:c:sizeof:float:4
compiler:c:sizeof:double:8
compiler:c:sizeof:pointer:8
compiler:c:align:char:1
compiler:c:align:bool:1
compiler:c:align:int:4
compiler:c:align:float:4
compiler:c:align:double:8
compiler:cxx:command:g++
compiler:cxx:absolute:/usr/bin/g++
compiler:fortran:command:/sw/tools/gnu/gcc/4.9.2/bin/gfortran-4.9.2
compiler:fortran:absolute://sw/tools/gnu/gcc/4.9.2/bin/gfortran-4.9.2
compiler:fortran:ignore_tkr:yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
compiler:fortran:f08_assumed_rank:yes
compiler:fortran:optional_arguments:yes
compiler:fortran:interface:yes
compiler:fortran:iso_fortran_env:yes
compiler:fortran:storage_size:yes
compiler:fortran:bind_c:yes
compiler:fortran:iso_c_binding:yes
compiler:fortran:subroutine_bind_c:yes
compiler:fortran:type_bind_c:yes
compiler:fortran:type_name_bind_c:yes
compiler:fortran:private:yes
compiler:fortran:protected:yes
compiler:fortran:abstract:yes
compiler:fortran:asynchronous:yes
compiler:fortran:procedure:yes
compiler:fortran:c_funloc:yes
compiler:fortran:08_wrappers:yes
compiler:fortran:mpi_sizeof:yes
compiler:fortran:sizeof:integer:4
compiler:fortran:sizeof:logical:4
compiler:fortran:value:true:1
compiler:fortran:have:integer1:yes
compiler:fortran:have:integer2:yes
compiler:fortran:have:integer4:yes
compiler:fortran:have:integer8:yes
compiler:fortran:have:integer16:no
compiler:fortran:have:real4:yes
compiler:fortran:have:real8:yes
compiler:fortran:have:real16:yes
compiler:fortran:have:complex8:yes
compiler:fortran:have:complex16:yes
compiler:fortran:have:complex32:yes
compiler:fortran:sizeof:integer1:1
compiler:fortran:sizeof:integer2:2
compiler:fortran:sizeof:integer4:4
compiler:fortran:sizeof:integer8:8
compiler:fortran:sizeof:integer16:-1
compiler:fortran:sizeof:real:4
compiler:fortran:sizeof:real4:4
compiler:fortran:sizeof:real8:8
compiler:fortran:sizeof:real17:16
compiler:fortran:sizeof:double_precision:8
compiler:fortran:sizeof:complex:8
compiler:fortran:sizeof:double_complex:16
compiler:fortran:sizeof:complex8:8
compiler:fortran:sizeof:complex16:16
compiler:fortran:sizeof:complex32:32
compiler:fortran:align:integer:4
compiler:fortran:align:integer1:1
compiler:fortran:align:integer2:2
compiler:fortran:align:integer4:4

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-28 Thread Nathan Hjelm

I would like to add that you may want to play with the value and see
what works for your applications. Most applications should be using
malloc or similar functions to allocate large memory regions in the heap
and not on the stack.
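
As a minimal illustration of that distinction (hypothetical code, not taken
from any of the applications discussed): a large automatic array lives on the
stack and is what pushes a site toward 'ulimit -s unlimited', while the same
buffer obtained from malloc lives on the heap and is unaffected by the stack
limit.

#include <stdlib.h>

void compute(size_t n)
{
    /*
     * double big[8 * 1024 * 1024];
     *
     * An automatic array like the one above is placed on the stack; at tens
     * of megabytes it is exactly what makes jobs need a huge stack ulimit.
     */

    double *big = malloc(n * sizeof *big);   /* heap allocation instead */

    if (big == NULL)
        return;
    /* ... fill and use big[0 .. n-1] ... */
    free(big);
}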

-Nathan

On Mon, Sep 28, 2015 at 08:01:09PM +0300, Mike Dubman wrote:
>Hello Grigory,
>We observed ~10% performance degradation with heap size set to unlimited
>for CFD applications.
>You can measure your application performance with default and unlimited
>"limits" and select the best setting.
>Kind Regards.
>M
>On Mon, Sep 28, 2015 at 7:36 PM, Grigory Shamov
> wrote:
> 
>  Hi All,
> 
>  We have built OpenMPI (1.8.8., 1.10.0) against Mellanox OFED 2.4 and
>  corresponding MXM. When it runs now, it gives the following warning, per
>  process:
> 
>  [1443457390.911053] [myhist:5891 :0] mxm.c:185  MXM  WARN  The
>  'ulimit -s' on the system is set to 'unlimited'. This may have negative
>  performance implications. Please set the heap size to the default value
>  (10240)
> 
>  We have ulimits for heap (as well as most of the other limits) set
>  unlimited because of applications that might possibly need a lot of RAM.
> 
>  The question is if we should do as MXM wants, or ignore it? Has anyone
>  an
>  experience running recent OpenMPI with MXM enabled, and what kind of
>  ulimits do you have? Any suggestions/comments appreciated, thanks!
> 
>  --
>  Grigory Shamov
> 
>  Westgrid/ComputeCanada Site Lead
>  University of Manitoba
>  E2-588 EITC Building,
>  (204) 474-9625
> 
>  ___
>  users mailing list
>  us...@open-mpi.org
>  Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>  Link to this post:
>  http://www.open-mpi.org/community/lists/users/2015/09/27697.php
> 
>--
>Kind Regards,
>M.

> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/09/27698.php





Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-28 Thread Grigory Shamov
Hi Nathan,
Hi Mike,

Thanks for the quick replies!

My problem is that I don't know what my applications are. I mean, I know them,
but we are a general-purpose cluster, running in production for quite a
while, and we have everybody on it, from quantum chemists to machine learners
to bioinformaticians. So a system-wide change might harm some of them, and
doing per-application benchmarking/tuning looks a bit daunting.

The default behaviour our users are used to is to have unlimited values
for all memory limits. We set it that way a few years ago, in response
to user complaints that applications wouldn't start (we set the ulimits
in Torque).

Is it known (I know every application is different) how much it costs,
performance-wise, to run MXM with the recommended ulimits vs. unlimited
ulimits, vs. not using MXM at all?

-- 
Grigory Shamov

Westgrid/ComputeCanada Site Lead
University of Manitoba
E2-588 EITC Building,
(204) 474-9625






On 15-09-28 12:58 PM, "users on behalf of Nathan Hjelm"
 wrote:

>
>I would like to add that you may want to play with the value and see
>what works for your applications. Most applications should be using
>malloc or similar functions to allocate large memory regions in the heap
>and not on the stack.
>
>-Nathan
>
>On Mon, Sep 28, 2015 at 08:01:09PM +0300, Mike Dubman wrote:
>>Hello Grigory,
>>We observed ~10% performance degradation with heap size set to
>>unlimited
>>for CFD applications.
>>You can measure your application performance with default and
>>unlimited
>>"limits" and select the best setting.
>>Kind Regards.
>>M
>>On Mon, Sep 28, 2015 at 7:36 PM, Grigory Shamov
>> wrote:
>> 
>>  Hi All,
>> 
>>  We have built OpenMPI (1.8.8., 1.10.0) against Mellanox OFED 2.4
>>and
>>  corresponding MXM. When it runs now, it gives the following
>>warning, per
>>  process:
>> 
>>  [1443457390.911053] [myhist:5891 :0] mxm.c:185  MXM  WARN
>>The
>>  'ulimit -s' on the system is set to 'unlimited'. This may have
>>negative
>>  performance implications. Please set the heap size to the default
>>value
>>  (10240)
>> 
>>  We have ulimits for heap (as well as most of the other limits) set
>>  unlimited because of applications that might possibly need a lot
>>of RAM.
>> 
>>  The question is if we should do as MXM wants, or ignore it? Has
>>anyone
>>  an
>>  experience running recent OpenMPI with MXM enabled, and what kind
>>of
>>  ulimits do you have? Any suggestions/comments appreciated, thanks!
>> 
>>  --
>>  Grigory Shamov
>> 
>>  Westgrid/ComputeCanada Site Lead
>>  University of Manitoba
>>  E2-588 EITC Building,
>>  (204) 474-9625
>> 
>>  ___
>>  users mailing list
>>  us...@open-mpi.org
>>  Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>  Link to this post:
>>  http://www.open-mpi.org/community/lists/users/2015/09/27697.php
>> 
>>--
>>Kind Regards,
>>M.
>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>>http://www.open-mpi.org/community/lists/users/2015/09/27698.php
>



Re: [OMPI users] Using POSIX shared memory as send buffer

2015-09-28 Thread Dave Goodell (dgoodell)
On Sep 27, 2015, at 1:38 PM, marcin.krotkiewski wrote:
> 
> Hello, everyone
> 
> I am struggling a bit with IB performance when sending data from a POSIX 
> shared memory region (/dev/shm). The memory is shared among many MPI 
> processes within the same compute node. Essentially, I see somewhat erratic 
> performance, but it seems that my code is roughly twice as slow as when 
> using a usual, malloced send buffer.

It may have to do with NUMA effects and the way you're allocating/touching your 
shared memory vs. your private (malloced) memory.  If you have a 
multi-NUMA-domain system (i.e., any 2+ socket server, and even some 
single-socket servers) then you are likely to run into this sort of issue.  The 
PCI bus on which your IB HCA communicates is almost certainly closer to one 
NUMA domain than the others, and performance will usually be worse if you are 
sending/receiving from/to a "remote" NUMA domain.

"lstopo" and other tools can sometimes help you get a handle on the situation, 
though I don't know if it knows how to show memory affinity.  I think you can 
find memory affinity for a process via "/proc/<pid>/numa_maps".  There's lots 
of info about NUMA affinity here: https://queue.acm.org/detail.cfm?id=2513149
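
If it helps, here is a small sketch (assuming libnuma is installed; compile with 
-lnuma) that reports which NUMA node backs the first page of a buffer once that 
page has been touched. Running it on the /dev/shm region and on a malloced 
buffer from the same rank should show whether the two end up on different NUMA 
domains.

#include <numaif.h>     /* get_mempolicy(); link with -lnuma */
#include <stdio.h>
#include <stdlib.h>

/* Print the NUMA node that backs the first page of buf.
 * Only meaningful after the page has been written at least once. */
static void report_node(const char *label, void *buf)
{
    int node = -1;

    if (get_mempolicy(&node, NULL, 0, buf, MPOL_F_NODE | MPOL_F_ADDR) == 0)
        printf("%s: first page is on NUMA node %d\n", label, node);
    else
        perror("get_mempolicy");
}

int main(void)
{
    size_t len = 1 << 20;
    char *buf = malloc(len);

    if (buf == NULL)
        return 1;
    buf[0] = 1;                  /* first touch binds the page to a node */
    report_node("malloced buffer", buf);
    free(buf);
    return 0;
}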

-Dave



[OMPI users] send_request error with allocate

2015-09-28 Thread Diego Avesani
Dear all,

I have to use a send_request in an MPI_WAITALL.
Here is the strange thing:

If I use, at the beginning of the SUBROUTINE:

INTEGER :: send_request(3), recv_request(3)

I have no problem, but if I use

USE COMONVARS,ONLY : nMsg
with nMsg=3

and after that I declare

INTEGER :: send_request(nMsg), recv_request(nMsg), I get the following
error:

[Lap] *** An error occurred in MPI_Waitall
[Lap] *** reported by process [139726485585921,0]
[Lap] *** on communicator MPI_COMM_WORLD
[Lap] *** MPI_ERR_REQUEST: invalid request
[Lap] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now
abort,
[Lap] ***and potentially your MPI job)
forrtl: error (78): process killed (SIGTERM)

Could someone please explain to me where I am wrong?

Thanks

Diego


Re: [OMPI users] send_request error with allocate

2015-09-28 Thread Jeff Squyres (jsquyres)
Can you send a small reproducer program?

> On Sep 28, 2015, at 4:45 PM, Diego Avesani  wrote:
> 
> Dear all, 
> 
> I have to use a send_request in an MPI_WAITALL.
> Here is the strange thing:
> 
> If I use, at the beginning of the SUBROUTINE:
> 
> INTEGER :: send_request(3), recv_request(3) 
> 
> I have no problem, but if I use
> 
> USE COMONVARS,ONLY : nMsg
> with nMsg=3
> 
> and after that I declare
> 
> INTEGER :: send_request(nMsg), recv_request(nMsg), I get the following error:
> 
> [Lap] *** An error occurred in MPI_Waitall 
> [Lap] *** reported by process [139726485585921,0] 
> [Lap] *** on communicator MPI_COMM_WORLD 
> [Lap] *** MPI_ERR_REQUEST: invalid request 
> [Lap] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now 
> abort, 
> [Lap] ***and potentially your MPI job) 
> forrtl: error (78): process killed (SIGTERM)
> 
> Could someone please explain to me where I am wrong?
> 
> Thanks
> 
> Diego
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/09/27703.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/