Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Michael Rezny
Hi Gilles, thanks for the detailed explanation. Have a nice weekend. Mike. On 12/02/2016, at 11:23 PM, Gilles Gouaillardet wrote: > Michael, > Per the specifications, MPI_Pack_external and MPI_Unpack_external must pack/unpack to/from big endian, regardless of the endianness of the host. > On a …

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Gilles Gouaillardet
Michael, per the specifications, MPI_Pack_external and MPI_Unpack_external must pack/unpack to/from big endian, regardless of the endianness of the host. On a little-endian system, byte swapping must occur because this is what you are explicitly requesting. These functions are really meant to be used …
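For reference, a minimal sketch (not from the original thread) of what this means in practice: packing an MPI_INT with the "external32" representation must yield big-endian bytes even on a little-endian host. The value and buffer size below are chosen arbitrarily for illustration.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int value = 0x04d2;                 /* 1234 */
    char buf[16];
    MPI_Aint pos = 0;

    /* external32 is the fixed, big-endian representation required by the
     * MPI standard, so a swap is mandated on little-endian hosts. */
    MPI_Pack_external("external32", &value, 1, MPI_INT,
                      buf, sizeof(buf), &pos);

    /* A correct implementation prints 00 00 04 d2 on any architecture. */
    for (MPI_Aint i = 0; i < pos; i++)
        printf("%02x ", (unsigned char)buf[i]);
    printf("\n");

    MPI_Finalize();
    return 0;
}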

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Michael Rezny
Hi Gilles, I am misunderstanding something here. What you are now saying seems, to me, to be at odds with what you said previously. Assume the situation where both sender and receiver are little-endian, and consider only MPI_Pack_external and MPI_Unpack_external. Consider case 1: --enable-hete…

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Gilles Gouaillardet
Michael, byte swapping only occurs if you invoke MPI_Pack_external and MPI_Unpack_external on little-endian systems. MPI_Pack and MPI_Unpack use the same engine as MPI_Send and MPI_Recv, which does not involve any byte swapping if both ends have the same endianness. Cheers, Gilles. On …
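To make the contrast concrete, here is a small sketch (assumed, not taken from the thread) that packs the same int once with MPI_Pack (native representation, the same engine as MPI_Send/MPI_Recv) and once with MPI_Pack_external (always external32, i.e. big-endian), then dumps both buffers:

#include <mpi.h>
#include <stdio.h>

static void dump(const char *label, const unsigned char *p, int n)
{
    printf("%s:", label);
    for (int i = 0; i < n; i++) printf(" %02x", p[i]);
    printf("\n");
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int value = 0x162e;                 /* 5678 */
    unsigned char native[16], ext[16];
    int npos = 0;
    MPI_Aint epos = 0;

    /* Native packing: byte order matches the host, so two little-endian
     * ranks exchange these bytes without any swapping. */
    MPI_Pack(&value, 1, MPI_INT, native, sizeof(native), &npos, MPI_COMM_WORLD);

    /* External packing: always big-endian, so a swap happens on
     * little-endian hosts. */
    MPI_Pack_external("external32", &value, 1, MPI_INT,
                      ext, sizeof(ext), &epos);

    dump("MPI_Pack         ", native, npos);       /* 2e 16 00 00 on x86_64 */
    dump("MPI_Pack_external", ext, (int)epos);     /* 00 00 16 2e everywhere */

    MPI_Finalize();
    return 0;
}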

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Michael Rezny
Hi, oh, that is good news! The process is meant to be implementing "receiver makes right", which is good for efficiency. But, in the second case, without --enable-heterogeneous, are you saying that on little-endian machines byte swapping is meant to always occur? That seems most odd. I would …

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Gilles Gouaillardet
Michael, I'd like to correct what I wrote earlier: in heterogeneous clusters, data is sent "as is" (i.e. no byte swapping) and it is byte swapped on receipt, and only if needed. With --enable-heterogeneous, MPI_Unpack_external is working, but MPI_Pack_external is broken (i.e. no byte swappi…

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-11 Thread Ralph Castain
I can’t speak to the packing question, but I can say that we have indeed confirmed the lack of maintenance on OMPI for Debian/Ubuntu and are working to resolve the problem. > On Feb 11, 2016, at 1:16 AM, Gilles Gouaillardet wrote: > Michael, > MPI_Pack_external must convert data to big …

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-11 Thread Gilles Gouaillardet
Michael, MPI_Pack_external must convert data to big endian so it can be dumped into a file and be read correctly on big- and little-endian architectures, and with any MPI flavor. If you use only one MPI library on one architecture, or if data is never read/written from/to a file, then it is more efficient to …
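A short sketch of the file-exchange use case described above (the file name and values are made up for illustration): because external32 is a fixed, big-endian representation, a buffer packed this way can be written to disk and unpacked later on any architecture or MPI implementation.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int data[2] = { 1234, 5678 };
    char buf[64];
    MPI_Aint size, pos = 0;

    /* Ask how many bytes the external32 encoding of 2 ints needs. */
    MPI_Pack_external_size("external32", 2, MPI_INT, &size);
    printf("external32 size for 2 ints: %ld bytes\n", (long)size);

    MPI_Pack_external("external32", data, 2, MPI_INT, buf, sizeof(buf), &pos);

    /* Dump the architecture-independent bytes to a file (name is hypothetical). */
    FILE *f = fopen("packed.ext32", "wb");
    if (f) {
        fwrite(buf, 1, (size_t)pos, f);
        fclose(f);
    }

    MPI_Finalize();
    return 0;
}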

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-11 Thread Michael Rezny
Hi Gilles, I enhanced my simple test program to dump the contents of the buffer. If I am not mistaken, it appears that the unpack is not doing the endian conversion. Kindest regards, Mike. Good: send data 04d2 162e; MPI_Pack_external: 0; buffer size: 8; Buffer contents: d2, 04, 00, 00, 2e, 16, …
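A minimal reconstruction of this kind of test (assumed; the original program is not shown in the archive): pack 0x04d2 and 0x162e with MPI_Pack_external, dump the buffer, then unpack and print the result. The little-endian bytes quoted above (d2, 04, 00, 00, ...) are what a broken pack produces; a correct external32 buffer would start 00, 00, 04, d2.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int send[2] = { 0x04d2, 0x162e };
    int recv[2] = { 0, 0 };
    char buf[16];
    MPI_Aint pos = 0;

    printf("send data %04x %04x\n", send[0], send[1]);

    int rc = MPI_Pack_external("external32", send, 2, MPI_INT,
                               buf, sizeof(buf), &pos);
    printf("MPI_Pack_external: %d buffer size: %ld\n", rc, (long)pos);

    printf("Buffer contents:");
    for (MPI_Aint i = 0; i < pos; i++)
        printf(" %02x,", (unsigned char)buf[i]);
    printf("\n");   /* correct external32 output starts 00, 00, 04, d2 */

    pos = 0;
    MPI_Unpack_external("external32", buf, sizeof(buf), &pos,
                        recv, 2, MPI_INT);
    printf("recv data %04x %04x\n", recv[0], recv[1]);

    MPI_Finalize();
    return 0;
}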

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-11 Thread Michael Rezny
Hi Gilles, thanks for thinking about this in more detail. I understand what you are saying, but your comments raise some questions in my mind: if one is in a homogeneous cluster, is it important, in the little-endian case, that the data be converted to external32 format (big-endian) only …

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-11 Thread Gilles Gouaillardet
Michael, I think it is worse than that ... without --enable-heterogeneous, it seems the data is not correctly packed (i.e. it is not converted to big endian), at least on an x86_64 arch. Unpack looks broken too, but pack followed by unpack does work. That means if you are reading data correctly wr…
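Because pack and unpack are broken symmetrically, a pack-then-unpack round trip on the same build appears to succeed and masks the bug. A small sketch of a check that does catch it (values assumed for illustration) compares the raw packed bytes against the expected external32 encoding instead of only checking the round-tripped values:

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int value = 0x04d2;
    unsigned char buf[8];
    MPI_Aint pos = 0;

    MPI_Pack_external("external32", &value, 1, MPI_INT,
                      buf, sizeof(buf), &pos);

    /* Expected external32 encoding of MPI_INT 0x04d2: big-endian bytes. */
    const unsigned char expect[4] = { 0x00, 0x00, 0x04, 0xd2 };

    if (pos == 4 && memcmp(buf, expect, 4) == 0)
        printf("pack produced correct external32 bytes\n");
    else
        printf("pack is NOT converting to big endian\n");

    MPI_Finalize();
    return 0;
}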

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-11 Thread Michael Rezny
Hi Ralph, you are indeed correct. However, many of our users have workstations like mine, with OpenMPI installed from a package, so we don't know what has been configured. Then we have failures, since, for instance, Ubuntu 14.04 by default appears to have been built with heterogeneous sup…

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-10 Thread Ralph Castain
Out of curiosity: if both systems are Intel, then why are you enabling hetero? You don’t need it in that scenario. Admittedly, we do need to fix the bug - just trying to understand why you are configuring that way. > On Feb 10, 2016, at 8:46 PM, Michael Rezny wrote: > Hi Gilles, > I can co…

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-10 Thread Michael Rezny
Hi Gilles, I can confirm that with a fresh download and build from source of OpenMPI 1.10.2 with --enable-heterogeneous, the unpacked ints have the wrong endianness. However, without --enable-heterogeneous, the unpacked ints are correct. So this problem still exists in heterogeneous builds with OpenM…

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-10 Thread Michael Rezny
Hi Gilles, thanks for the prompt response and assistance. Both systems use Intel CPUs. The problem originally comes from a coupler, yac, used in climate science. There are several reported instances where the coupling tests fail. The problem occurs often enough that a workaround was incorporated, which …

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-10 Thread Gilles Gouaillardet
Michael, do your two systems have the same endianness? Do you know how OpenMPI was configured on both systems? (Is --enable-heterogeneous enabled or disabled on both systems?) FWIW, OpenMPI 1.6.5 is old now and no longer maintained. I strongly encourage you to use OpenMPI 1.10.2. Cheers, Gilles