Dear Mr. Squyres,

We will try to install your bug-fixed nightly tarball of 2014-10-24 on Cluster5 
to see whether it fixes the problem.
The installation, however, will take some time. I will get back to you when I know more.

Let me add that on Laki each node has 16 GB of shared memory (there it worked), 
the login node of Cluster5 has 64 GB (there it worked too), whereas the 
compute nodes of Cluster5 have 128 GB (there it did not work).
So the bug might have something to do with the size of the physical shared 
memory available on the node.

Greetings
Michael Rachner

-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres 
(jsquyres)
Sent: Friday, October 24, 2014 22:45
To: Open MPI User's List
Subject: Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limitation in shared 
memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

Nathan tells me that this may well be related to a fix that was literally just 
pulled into the v1.8 branch today:

    https://github.com/open-mpi/ompi-release/pull/56

Would you mind testing any nightly tarball after tonight?  (i.e., the v1.8 
tarballs generated tonight will be the first ones to contain this fix)

    http://www.open-mpi.org/nightly/master/
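
For reference, a typical way to build and try such a tarball under your home 
directory could look like the following sketch; the tarball name and the 
$HOME/ompi-nightly prefix are placeholders only:

    tar xf <downloaded-nightly-tarball>.tar.bz2
    cd <unpacked-tarball-directory>
    ./configure --prefix=$HOME/ompi-nightly
    make -j 8 && make install
    export PATH=$HOME/ompi-nightly/bin:$PATH
    export LD_LIBRARY_PATH=$HOME/ompi-nightly/lib:$LD_LIBRARY_PATH
    mpif90 sharedmemtest.f90 && mpiexec -np 2 ./a.out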



On Oct 24, 2014, at 11:46 AM, <michael.rach...@dlr.de> wrote:

> Dear developers of OPENMPI,
>  
> I am running a small, downsized Fortran test program for shared memory 
> allocation (using MPI_WIN_ALLOCATE_SHARED and MPI_WIN_SHARED_QUERY)
> on a single node of 2 different Linux clusters with OPENMPI-1.8.3 and 
> Intel-14.0.4 / Intel-13.0.1, respectively.
>  
> The program simply allocates a sequence of shared data windows, each 
> consisting of one integer*4 array.
> None of the windows is freed, so the amount of data allocated in shared 
> windows grows over the course of the execution.
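> 
> For illustration, here is a minimal sketch of such a test program (it is not 
> the actual sharedmemtest.f90; the mpi_f08 binding, the loop count nwin, and 
> the variable names are assumptions only):
> 
>   program sharedmem_sketch
>     use mpi_f08
>     use, intrinsic :: iso_c_binding, only: c_ptr, c_f_pointer
>     implicit none
>     integer, parameter :: idim_1 = 50000     ! integer*4 elements per window
>     integer, parameter :: nwin   = 1000      ! number of windows (assumed)
>     type(MPI_Comm) :: nodecomm
>     type(MPI_Win)  :: win
>     type(c_ptr)    :: baseptr
>     integer(kind=MPI_ADDRESS_KIND) :: winsize, qsize
>     integer :: myrank, disp_unit, iwin, ierr
>     integer(4), pointer :: iarr(:)
> 
>     call MPI_Init(ierr)
>     ! all processes run on one node, so they share one memory domain
>     call MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, &
>                              MPI_INFO_NULL, nodecomm, ierr)
>     call MPI_Comm_rank(nodecomm, myrank, ierr)
> 
>     do iwin = 1, nwin
>       ! the nodemaster (rank 0) contributes the whole segment, the others size 0
>       if (myrank == 0) then
>         winsize = int(idim_1, MPI_ADDRESS_KIND) * 4_MPI_ADDRESS_KIND
>       else
>         winsize = 0_MPI_ADDRESS_KIND
>       end if
>       call MPI_Win_allocate_shared(winsize, 4, MPI_INFO_NULL, nodecomm, &
>                                    baseptr, win, ierr)
>       ! every process queries rank 0's base address of the shared segment
>       call MPI_Win_shared_query(win, 0, qsize, disp_unit, baseptr, ierr)
>       call c_f_pointer(baseptr, iarr, [idim_1])
>       ! first write into the window; this is where the bus error hits
>       ! once an allocation has silently gone bad
>       if (myrank == 0) iarr(1:idim_1) = iwin
>       call MPI_Barrier(nodecomm, ierr)
>       ! the window is deliberately never freed (no MPI_Win_free)
>     end do
> 
>     call MPI_Finalize(ierr)
>   end program sharedmem_sketch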
>  
> That worked well on the 1st cluster (Laki, with 8 processes per node), 
> even when allocating 1000 shared windows of 50000 integer*4 array elements 
> each, i.e. a total of 200 MBytes.
> On the 2nd cluster (Cluster5, with 24 processes per node) it also worked on the 
> login node, but it did NOT work on a compute node.
> In the failing case, there appears to be an internal storage limit of 
> ~140 MB for the total storage allocated across all shared windows.
> When that limit is reached, all subsequent shared memory allocations fail 
> (but silently).
> The first attempt to use such a bad shared data window then results in a bus 
> error due to the invalid storage address encountered.
>  
> That strange behavior can be observed both with the small test program and 
> with my large Fortran CFD code.
> When the error occurs, it occurs with both codes, each time at a storage 
> limit of ~140 MB.
> I found that this storage limit depends only weakly on the number of 
> processes (for np = 2, 4, 8, 16, 24 it is 144.4, 144.0, 141.0, 137.0, 
> 132.2 MB).
>  
> Note that the shared memory storage available on both clusters was very large 
> (many GB of free memory).
>  
> Here is the error message when running with np=2 and an array 
> dimension of idim_1=50000 for the integer*4 array allocated per shared 
> window on the compute node of Cluster5.
> In that case, the error occurred at the 723rd shared window, which is the 
> first badly allocated window:
> (722 successfully allocated shared windows * 50000 array elements * 4 
> Bytes/el. = 144.4 MB)
>  
>  
> [1,0]<stdout>: ========on nodemaster: iwin=         722 :
> [1,0]<stdout>:  total storage [MByte] alloc. in shared windows so far:   144.400000000000
> [1,0]<stdout>: =========== allocation of shared window no. iwin=         723
> [1,0]<stdout>:  starting now with idim_1=       50000
> [1,0]<stdout>: ========on nodemaster for iwin=         723 : before writing on shared mem
> [1,0]<stderr>:[r5i5n13:12597] *** Process received signal *** 
> [1,0]<stderr>:[r5i5n13:12597] Signal: Bus error (7) 
> [1,0]<stderr>:[r5i5n13:12597] Signal code: Non-existant physical address (2)
> [1,0]<stderr>:[r5i5n13:12597] Failing at address: 0x7fffe08da000
> [1,0]<stderr>:[r5i5n13:12597] [ 0] [1,0]<stderr>:/lib64/libpthread.so.0(+0xf800)[0x7ffff6d67800]
> [1,0]<stderr>:[r5i5n13:12597] [ 1] ./a.out[0x408a8b]
> [1,0]<stderr>:[r5i5n13:12597] [ 2] ./a.out[0x40800c]
> [1,0]<stderr>:[r5i5n13:12597] [ 3] [1,0]<stderr>:/lib64/libc.so.6(__libc_start_main+0xe6)[0x7ffff69fec36]
> [1,0]<stderr>:[r5i5n13:12597] [ 4] [1,0]<stderr>:./a.out[0x407f09]
> [1,0]<stderr>:[r5i5n13:12597] *** End of error message ***
> [1,1]<stderr>:forrtl: error (78): process killed (SIGTERM)
> [1,1]<stderr>:Image              PC                Routine            Line        Source
> [1,1]<stderr>:libopen-pal.so.6   00007FFFF4B74580  Unknown               Unknown  Unknown
> [1,1]<stderr>:libmpi.so.1        00007FFFF7267F3E  Unknown               Unknown  Unknown
> [1,1]<stderr>:libmpi.so.1        00007FFFF733B555  Unknown               Unknown  Unknown
> [1,1]<stderr>:libmpi.so.1        00007FFFF727DFFD  Unknown               Unknown  Unknown
> [1,1]<stderr>:libmpi_mpifh.so.2  00007FFFF779BA03  Unknown               Unknown  Unknown
> [1,1]<stderr>:a.out              0000000000408D15  Unknown               Unknown  Unknown
> [1,1]<stderr>:a.out              000000000040800C  Unknown               Unknown  Unknown
> [1,1]<stderr>:libc.so.6          00007FFFF69FEC36  Unknown               Unknown  Unknown
> [1,1]<stderr>:a.out              0000000000407F09  Unknown               Unknown  Unknown
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 0 with PID 12597 on node r5i5n13
> exited on signal 7 (Bus error).
> --------------------------------------------------------------------------
>  
>  
> The small Fortran test program was built and run with:
>   mpif90 sharedmemtest.f90
>   mpiexec -np 2 -bind-to core -tag-output ./a.out
>  
> Why does it work on Laki (both on the login node and on a compute 
> node) as well as on the login node of Cluster5, but fail on a compute node 
> of Cluster5?
>  
> Greetings
>    Michael Rachner
>  
>  
>  
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/10/25572.php


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2014/10/25580.php
