Bug#902101: openmpi: OpenMPI 3 on ppc64el seems unstable

2018-06-23 Thread Alastair McKinstry

Hi,

There is a patch in 3.1.0-7 for powerpc / ppc*64 systems:


   Index: openmpi-3.0.1~rc1/opal/include/opal/sys/powerpc/atomic.h
   ===================================================================
   --- openmpi-3.0.1~rc1.orig/opal/include/opal/sys/powerpc/atomic.h
   +++ openmpi-3.0.1~rc1/opal/include/opal/sys/powerpc/atomic.h
   @@ -27,6 +27,13 @@
      * On powerpc ...
      */

   +/* Hack on Debian. See: https://github.com/open-mpi/ompi/issues/2055
   + *   -- amck, 2016-09-05
   + */
   +#undef OPAL_GCC_INLINE_ASSEMBLY
   +#define OPAL_GCC_INLINE_ASSEMBLY 1
   +
   +
    #define MB()  __asm__ __volatile__ ("sync" : : : "memory")
    #define RMB() __asm__ __volatile__ ("lwsync" : : : "memory")
    #define WMB() __asm__ __volatile__ ("lwsync" : : : "memory")

I've just checked, and this workaround is no longer needed. It may be
interfering with the locking primitives in use, so I've removed it
(in git).
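
For context on why the definitions in this header matter to locking: the
context lines of the patch define RMB() and WMB() as lwsync, which on POWER
orders load-load and store-store accesses. Below is a minimal sketch of the
kind of flag-and-payload handoff that relies on such ordering. It is not
OpenMPI code: it uses portable C11 atomics rather than the OPAL macros, it
assumes a C11 compiler with POSIX threads (build with -pthread), and the
names mailbox, handoff_args and run_handoff are made up for illustration.

```c
/* Illustrative sketch only; not taken from OpenMPI. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

struct mailbox {              /* hypothetical name */
    int payload;              /* plain data, published through `ready` */
    atomic_int ready;         /* 0 = empty, 1 = payload is valid */
};

struct handoff_args {         /* hypothetical name */
    struct mailbox *box;
    int rounds;
};

static void *producer(void *arg)
{
    struct handoff_args *a = arg;

    for (int i = 1; i <= a->rounds; i++) {
        /* Wait until the consumer has emptied the mailbox. */
        while (atomic_load_explicit(&a->box->ready, memory_order_acquire) != 0)
            sched_yield();

        a->box->payload = i;

        /* Release store: the payload write above must be visible before
         * the flag is. This is the role a write barrier plays. */
        atomic_store_explicit(&a->box->ready, 1, memory_order_release);
    }
    return NULL;
}

/* Hands `rounds` values from a producer thread to the calling thread.
 * Returns the number of rounds in which the consumer read an unexpected
 * payload, or -1 if the producer thread could not be started. */
int run_handoff(int rounds)
{
    struct mailbox box;
    struct handoff_args args = { &box, rounds };
    pthread_t tid;
    int mismatches = 0;

    assert(rounds >= 0);
    box.payload = 0;
    atomic_init(&box.ready, 0);

    if (pthread_create(&tid, NULL, producer, &args) != 0)
        return -1;

    for (int i = 1; i <= rounds; i++) {
        /* Acquire load: once the flag reads 1, the payload read below
         * must see the producer's write. This is the role a read
         * barrier plays. */
        while (atomic_load_explicit(&box.ready, memory_order_acquire) != 1)
            sched_yield();

        if (box.payload != i)
            mismatches++;

        /* Mark the mailbox empty so the producer can continue. */
        atomic_store_explicit(&box.ready, 0, memory_order_release);
    }

    pthread_join(tid, NULL);
    return mismatches;
}
```

With correct ordering, a call such as run_handoff(1000) from main() should
return 0. If the ordering around the flag is lost, the consumer can read a
stale payload or wait forever, which is the same class of symptom as the
hangs and unexpected results described in the report.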


Similarly, I've removed a patch for arm64 (note: not armel, so possibly
not relevant for #902041).


I'm uploading a release with these removed.


Regards

Alastair




On 22/06/2018 09:18, Ansgar Burchardt wrote:

Source: openmpi
Version: 3.1.0-7
Severity: important

OpenMPI 3 on ppc64el seems unstable: the test suite for dune-grid only
passes ~30% of the time on ppc64el. In the other cases it hangs in MPI
communication or gets unexpected results.

I haven't been able to find a simpler test case yet (unlike armel,
#902041).

(Maybe related: on powerpc all MPI tests in dune-grid just run into a
timeout, but that one is not a release arch so I care less.)

Ansgar


--
Alastair McKinstry
https://diaspora.sceal.ie/u/amckinstry
Commander Vimes didn’t like the phrase “The innocent have nothing to fear,”
believing the innocent had everything to fear, mostly from the guilty but in
the longer term even more from those who say things like “The innocent have
nothing to fear.”
 - T. Pratchett, Snuff


