Re: [OMPI users] mmaped memory and openib btl.

2014-12-02 Thread Jeff Squyres (jsquyres)
On Dec 2, 2014, at 3:15 PM, Emmanuel Thomé wrote:
> Thanks for pointing me to ummunotify, this sounds much more robust
> than the fragile hook-based approach. I'll try this out.
It is -- see: https://github.com/open-mpi/ompi/blob/master/README#L665-L682
ummunotify is distributed as part of

Re: [OMPI users] mmaped memory and openib btl.

2014-12-02 Thread Emmanuel Thomé
Hi, Thanks much for your answer. I agree, this can wait for 1.8.5, since as you say there are workarounds. I'm gonna #if-protect the offending block in my code based on OMPI version range, and have a fallback which does not use mmap/munmap. I'll follow the thread on the github tracker. Thanks f
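The "#if-protect" idea mentioned above could look roughly like the sketch below. OPEN_MPI, OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION and OMPI_RELEASE_VERSION are macros that Open MPI's mpi.h defines; everything else (VERSION_CODE, AFFECTED_FROM, FIXED_IN, MMAP_PATH_UNSAFE, USE_MALLOC_FALLBACK) is a hypothetical name made up for illustration. The version bounds are an assumption: the thread only says the fix is planned for 1.8.5 and does not state the first affected release.

```c
#include <assert.h>

/* Pack a three-part version into one integer so it can be compared in #if. */
#define VERSION_CODE(maj, min, rel) ((maj) * 10000 + (min) * 100 + (rel))

/* ASSUMPTION: illustrative bounds only. The thread says the fix is planned
 * for 1.8.5; the lower bound is a guess and should be adjusted. */
#define AFFECTED_FROM VERSION_CODE(1, 8, 0)
#define FIXED_IN      VERSION_CODE(1, 8, 5)

#define MMAP_PATH_UNSAFE(maj, min, rel)                 \
    (VERSION_CODE(maj, min, rel) >= AFFECTED_FROM &&    \
     VERSION_CODE(maj, min, rel) <  FIXED_IN)

/* When Open MPI's <mpi.h> has been included before this point, the OMPI_*
 * macros are defined. Otherwise they evaluate to 0 in #if and the fallback
 * stays disabled. */
#if defined(OPEN_MPI) && \
    MMAP_PATH_UNSAFE(OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION, OMPI_RELEASE_VERSION)
#define USE_MALLOC_FALLBACK 1
#else
#define USE_MALLOC_FALLBACK 0
#endif

/* Run-time twin of the macro, handy for checking the bounds. */
static int mmap_path_unsafe(int maj, int min, int rel)
{
    return MMAP_PATH_UNSAFE(maj, min, rel);
}

/* Small sanity check of the assumed range. */
static void check_version_guard(void)
{
    assert(mmap_path_unsafe(1, 8, 3));   /* inside the assumed range */
    assert(!mmap_path_unsafe(1, 8, 5));  /* planned fix release */
}
```

The application would then pick the mmap/munmap path or the fallback that avoids them with `#if USE_MALLOC_FALLBACK`.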

Re: [OMPI users] mmaped memory and openib btl.

2014-12-02 Thread Jeff Squyres (jsquyres)
You got caught in the SC/US Thanksgiving holiday delay. Sorry about that! We talked about this on the weekly call today, and decided the following:
1. We're going to push this to 1.8.5, for two main reasons:
1a. There's workarounds available.
1b. We need some time to figure this out, but would

Re: [OMPI users] mmaped memory and openib btl.

2014-11-29 Thread Emmanuel Thomé
Hi, I am still affected by the bug which I reported in the thread below (munmapped area lingers in registered memory cache). I'd just like to know if this is recognized as a defect, and whether a fix could be considered, or if instead I should consider the failure I observe as being "normal behavi

Re: [OMPI users] mmaped memory and openib btl.

2014-11-13 Thread Emmanuel Thomé
Hi, It turns out that the DT_NEEDED libs for my a.out are:
Dynamic Section:
  NEEDED libmpi.so.1
  NEEDED libpthread.so.0
  NEEDED libc.so.6
which is absolutely consistent with the link command line:
catrel-44 ~ $ mpicc -W -Wall -std=c99 -O0 -g prog6.c -sh

Re: [OMPI users] mmaped memory and openib btl.

2014-11-12 Thread Emmanuel Thomé
Yes, I confirm. Thanks for saying that this is the supposed behaviour. In the binary, the code goes to munmap@plt, which goes to the libc, not to libopen-pal.so. libc is 2.13-38+deb7u1. I'm a total noob at GOT/PLT relocations. What is the mechanism which should make the opal relocation win over the

Re: [OMPI users] mmaped memory and openib btl.

2014-11-12 Thread Jeff Squyres (jsquyres)
FWIW, munmap is *supposed* to be intercepted. Can you confirm that when your application calls munmap, it doesn't make a call to libopen-pal.so? It should be calling this (1-line) function:
/* intercept munmap, as the user can give back memory that way as well. */
OPAL_DECLSPEC int munmap

Re: [OMPI users] mmaped memory and openib btl.

2014-11-12 Thread Nathan Hjelm
You could just disable leave pinned:
  -mca mpi_leave_pinned 0 -mca mpi_leave_pinned_pipeline 0
This will fix the issue but may reduce performance. Not sure why the munmap wrapper is failing to execute but this will get you running.
-Nathan Hjelm HPC-5, LANL
On Wed, Nov 12, 2014 at 05:08:06PM +0

Re: [OMPI users] mmaped memory and openib btl.

2014-11-12 Thread Emmanuel Thomé
As far as I have been able to understand while looking at the code, it very much seems that Joshua pointed out the exact cause for the issue. munmap'ing a virtual address space region does not evict it from mpool_grdma->pool->lru_list. If a later mmap happens to return the same address (a priori
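The failure mode described in this message can be illustrated with a toy model. This is a minimal sketch and not Open MPI's code: struct reg_cache, cache_insert, cache_lookup, cache_evict_range and the mapping_id field are all hypothetical names, and the fixed-size array stands in for the real registration cache. It only shows why a cache keyed by address hands back a stale entry when munmap does not evict and a later mmap reuses the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 8

/* One cached "registration": an address range plus the id of the mapping
 * that was live when it was registered (hypothetical bookkeeping). */
struct reg_entry {
    uintptr_t base;
    size_t    len;
    int       mapping_id;
    int       valid;
};

struct reg_cache {
    struct reg_entry slot[CACHE_SLOTS];
};

static void cache_init(struct reg_cache *c)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        c->slot[i].valid = 0;
}

/* Store a registration in the first free slot; -1 if the cache is full. */
static int cache_insert(struct reg_cache *c, uintptr_t base, size_t len,
                        int mapping_id)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (!c->slot[i].valid) {
            c->slot[i] = (struct reg_entry){ base, len, mapping_id, 1 };
            return 0;
        }
    }
    return -1;
}

/* Return a valid entry that fully covers [base, base+len), or NULL. */
static const struct reg_entry *cache_lookup(const struct reg_cache *c,
                                            uintptr_t base, size_t len)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        const struct reg_entry *e = &c->slot[i];
        if (e->valid && e->base <= base && base + len <= e->base + e->len)
            return e;
    }
    return NULL;
}

/* What a working munmap hook would do: drop every entry that overlaps
 * the range being unmapped. */
static void cache_evict_range(struct reg_cache *c, uintptr_t base, size_t len)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &c->slot[i];
        if (e->valid && e->base < base + len && base < e->base + e->len)
            e->valid = 0;
    }
}

/* Scenario: mapping 1 is registered, then unmapped without the hook
 * running; a new mapping 2 is assumed to land at the same address. */
static int stale_hit_demo(void)
{
    struct reg_cache c;
    cache_init(&c);
    cache_insert(&c, 0x1000, 4096, 1);

    /* The lookup for the new mapping still finds mapping 1's entry. */
    const struct reg_entry *e = cache_lookup(&c, 0x1000, 4096);
    assert(e != NULL && e->mapping_id == 1);

    /* With eviction on munmap, the same lookup misses, as it should. */
    cache_evict_range(&c, 0x1000, 4096);
    assert(cache_lookup(&c, 0x1000, 4096) == NULL);
    return 0;
}
```

In the toy model the stale hit is harmless; in the situation discussed in the thread, the cached registration would refer to memory that is no longer the memory behind that address.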