Jeff,
imho, this is a grey area ...
99.999% of the time, posix_memalign is a "pure" function.
"pure" means it has no side effects.
unfortunatly, this part of the code is the 0.001% case in which we
explicitly rely on a side effect
(e.g. posix_memalign calls an Open MPI wrapper that updates a global
variable)
i am not surprised (nor shocked) the compiler assumes posix_memalign is
side effect free when (aggressive) optimisation is on, since it is a
valid asumption most of the time.
Of course, the is a chance that might be a bug
(e.g. GCC folks forgot a wrapper can be added to posix_memalign, so it
is not 100% safe to assume posix_memalign is side effect free)
but as far as i am concerned, using "volatile" is correct.
Cheers,
Gilles
On 6/5/2015 10:26 AM, Jeff Squyres (jsquyres) wrote:
On Jun 4, 2015, at 5:48 AM, René Oertel <rene.oer...@cs.tu-chemnitz.de> wrote:
Problem description:
===================
The critical code in question is in
opal/mca/memory/linux/memory_linux_ptmalloc2.c:
#####
92 #if HAVE_POSIX_MEMALIGN
93 /* Double check for posix_memalign, too */
94 if (mca_memory_linux_component.memalign_invoked) {
95 mca_memory_linux_component.memalign_invoked = false;
96 if (0 != posix_memalign(&p, sizeof(void*), 1024 * 1024)) {
97 return OPAL_ERR_IN_ERRNO;
98 }
99 free(p);
100 }
101 #endif
102
103 if (mca_memory_linux_component.malloc_invoked &&
104 mca_memory_linux_component.realloc_invoked &&
105 mca_memory_linux_component.memalign_invoked &&
106 mca_memory_linux_component.free_invoked) {
107 /* Happiness; our functions were invoked */
108 val |= OPAL_MEMORY_FREE_SUPPORT | OPAL_MEMORY_CHUNK_SUPPORT;
109 }
[...]
121 /* All done */
122 if (val > 0) {
123 opal_mem_hooks_set_support(val);
124 return OPAL_SUCCESS;
125 }
#####
The code at lines 103-109 is legally optimized away with >=GCC-4.9 and
optimizations turned on,
I'm not sure what you mean by "legally optimized away" -- shouldn't gcc know that the
call to posix_memalign() can change global variables (such as
mca_memory_linux_component.<foo>)?
because line 105 could never become true with
the local knowledge of the compiler/optimizer.
My compiler knowledge may be a bit rusty, but if this optimization is being
made solely with local knowledge, this sounds like a buggy optimization...?
If mca_memory_linux_component.memalign_invoked == true at line 92, it
would be set to false at line 95.
...but potentially set to true again in the body of posix_memalign(). ...which
you describe below.
If mca_memory_linux_component.memalign_invoked == false at line 92, it
would be false at line 103, too.
In both cases, the if at line 103-106 could never be evaluated to true
and opal_mem_hooks_set_support is never called with
OPAL_MEMORY_FREE_SUPPORT | OPAL_MEMORY_CHUNK_SUPPORT resulting in
(silently) disabled mpi_leaved_pinned.
In the global view this local assumption is not true, because
posix_memalign() in line 96 will call the hook public_mEMALIGn() in
opal/mca/memory/linux/malloc.c which in turn sets
mca_memory_linux_component.memalign_invoked = true.
As a result, the OPAL_MEMORY_*_SUPPORT would get configured correctly in
line 123 and so the opal_mem_hooks_support_level() used by
ompi/mca/btl/openib/btl_openib_component.c and indirectly by the
ompi/mca/mpool/grdma/mpool_grdma* module enables the usage of pinned memory.
How can a compiler not take the global view? Taking a local-only view is
unsafe -- for exactly these kinds of reasons.
The optimization could be disabled by adding -fno-tree-partial-pre to
the CFLAGS in opal/mca/memory/linux/Makefile, but this is only a
temporary workaround.
Patch:
=====
The proposed patch is as follows, which alters the point-of-view of the
compiler's optimizer on the *_invoked variables used by different code
paths (memory_linux_ptmalloc2.c vs. malloc.c):
#####
diff -rupN openmpi-1.8.5.org/opal/mca/memory/linux/memory_linux.h
openmpi-1.8.5/opal/mca/memory/linux/memory_linux.h
--- openmpi-1.8.5.org/opal/mca/memory/linux/memory_linux.h
2014-10-03 22:32:23.000000000 +0200
+++ openmpi-1.8.5/opal/mca/memory/linux/memory_linux.h 2015-06-04
10:01:44.941544282 +0200
@@ -33,11 +33,11 @@ typedef struct opal_memory_linux_compone
#if MEMORY_LINUX_PTMALLOC2
/* Ptmalloc2-specific data */
- bool free_invoked;
- bool malloc_invoked;
- bool realloc_invoked;
- bool memalign_invoked;
- bool munmap_invoked;
+ volatile bool free_invoked;
+ volatile bool malloc_invoked;
+ volatile bool realloc_invoked;
+ volatile bool memalign_invoked;
+ volatile bool munmap_invoked;
#endif
} opal_memory_linux_component_t;
I don't really want to apply this patch (thanks for filing the PR, BTW) without
understanding why the compiler is not automatically taking the global view. It
seems unsafe.
#####
Additionally, a further patch should be applied generating a warning in
the GPUDirect module if leave_pinned is not available for some reason.
In this case, GPUDirect support should be disabled, because it runs
faster without (see Case 2 below).
I'll let Rolf / NVIDIA comment on the GPU implications.
[snip]
Best regards,
René "oere" Oertel
Many thanks for the detailed report!