Here is a sketch of a ROMIO patch for Open MPI. I just wrote it and haven't had time to test it. If you can test it, please let me know whether it solves the problem.

  Thanks,
    george.

Index: iscontig.c
===================================================================
--- iscontig.c  (revision 17399)
+++ iscontig.c  (working copy)
@@ -58,6 +58,20 @@
     *flag = MPI_SGI_type_is_contig(datatype) && (displacement == 0);
 }

+#elif defined(OMPI_MPI_H)
+
+#include "ompi/datatype/datatype.h"
+
+void ADIOI_Datatype_iscontig(MPI_Datatype datatype, int *flag)
+{
+    /*
+     * The Open MPI contiguity check returns true for datatypes with
+     * gaps at the beginning and at the end. We have to provide
+     * a count of 2 in order to get these gaps taken into account.
+     */
+    *flag = ompi_ddt_is_contiguous_memory_layout( datatype, 2);
+}
+
 #else
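
To illustrate why the patch passes a count of 2, here is a minimal sketch using a toy model. The struct toy_type and the function toy_is_contiguous are hypothetical names invented for this example; they are not Open MPI's datatype engine and only mimic the behaviour described in the patch comment. The 6-byte block in an 8-byte extent is an illustrative layout, not the real layout of MPI_SHORT_INT.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (hypothetical): one block of `len` data bytes that starts
 * `lead` bytes into an element whose extent is `extent` bytes. */
struct toy_type {
    size_t lead;    /* gap before the data */
    size_t len;     /* bytes of useful data */
    size_t extent;  /* distance between consecutive elements */
};

/* Returns 1 if `count` consecutive elements hold their data in a single
 * gap-free byte range, ignoring gaps before the first and after the last
 * data byte (the behaviour the patch comment describes). */
int toy_is_contiguous(const struct toy_type *t, size_t count)
{
    assert(t != NULL && t->lead + t->len <= t->extent);
    if (count <= 1)
        return 1;                 /* a single block is always "contiguous" */
    /* With two or more elements the leading and trailing gaps end up
     * between the data blocks, so there must be no gap at all. */
    return t->len == t->extent;
}
```

With a toy type of { 0, 6, 8 } the check succeeds for a count of 1 and fails for a count of 2, which is the effect the patch relies on.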


On Feb 8, 2008, at 12:26 PM, Rainer Keller wrote:

Hi George,
Good that you come to the same conclusion with regard to ROMIO using
MPI_Type_size internally...


So, taking the comment in iscontig.c ,-]
   /* This function needs more work. It should check for contiguity
      in other cases as well.*/
we either mail the ROMIO list or have a specialized version of
ADIOI_Datatype_iscontig for Open MPI ,-]

Either way, the mpi_test_suite in that regard is sane.


Thanks,
Rainer


On Friday 08 February 2008 18:22, George Bosilca wrote:
MPI_Type_size is supposed to return only the size of the useful data,
which it apparently does (MPI_SHORT_INT is 6 bytes). What I think
happens is that MPI_SHORT_INT is a predefined type, but a really
strange one: it's one of the few that are not contiguous. The problem
seems to come from the fact that MPI_File_write does a contiguous
write for predefined data types, on the assumption that they are all
contiguous.

I tracked the problem down to the romio/adio/common/iscontig.c file.
For Open MPI the last #else branch is used. The first case in the
switch checks for MPI_COMBINER_NAMED (which is what an MPI
implementation is supposed to return for predefined data types) and
sets the flag to 1 (which means contiguous). This is obviously wrong
for MPI_SHORT_INT. It really looks like a ROMIO problem, so I guess
this email should be redirected to their mailing list.
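
The shortcut described above can be modelled in a few lines. This is only a sketch: enum model_combiner, struct model_type and both functions are hypothetical stand-ins for the values an MPI library would report through MPI_Type_get_envelope, MPI_Type_size and MPI_Type_get_extent, and none of it is ROMIO code. It contrasts "named means contiguous" with a check that at least compares the size against the extent, which is a necessary condition for contiguity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the combiner of a datatype. */
enum model_combiner { MODEL_COMBINER_NAMED, MODEL_COMBINER_OTHER };

/* Hypothetical description of a datatype as seen by the I/O layer. */
struct model_type {
    enum model_combiner combiner;
    size_t size;    /* bytes of useful data */
    size_t extent;  /* bytes spanned in memory */
};

/* The shortcut: every predefined (named) type is declared contiguous. */
int named_means_contig(const struct model_type *t)
{
    assert(t != NULL);
    return t->combiner == MODEL_COMBINER_NAMED;
}

/* A more careful test: a type whose size is smaller than its extent
 * has a gap somewhere, so it cannot be contiguous. */
int size_matches_extent(const struct model_type *t)
{
    assert(t != NULL);
    return t->size == t->extent;
}
```

Using the figures from this thread (6 bytes of data in 8 bytes of memory), the shortcut reports a named type as contiguous while the size/extent comparison does not.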

  Thanks,
    george.

On Feb 8, 2008, at 12:50 PM, Christoph Niethammer wrote:
Hello!

I have been testing Open MPI at HLRS for some time without finding new
problems in the implementation, but I have now run into a serious one
with MPI_File_write which can lead to data loss:

When creating a struct for a mixed datatype like

struct {
short a;
int b;
}

the C compiler introduces a gap of 2 bytes in the data representation
of this type, due to the 4-byte alignment of the integer on 32-bit
systems.
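
The gap can be measured directly in C. The following is a small sketch; the struct tag short_int and the three helper functions are names chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Same members as the struct above; the tag is added for this example. */
struct short_int {
    short a;
    int b;
};

/* Bytes of useful data: the two members without any padding. */
size_t useful_bytes(void)
{
    return sizeof(short) + sizeof(int);
}

/* Bytes one element occupies in memory, padding included. */
size_t memory_bytes(void)
{
    return sizeof(struct short_int);
}

/* Padding the compiler inserted between a and b. */
size_t gap_before_b(void)
{
    size_t off = offsetof(struct short_int, b);
    assert(off >= sizeof(short));
    return off - sizeof(short);
}
```

On a platform with a 2-byte short and a 4-byte int aligned to 4 bytes, these return 6, 8 and 2 respectively; the exact values are implementation-defined.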

If I now use MPI_File_write to write such data to a file, with
MPI_SHORT_INT as the MPI datatype, data is lost.

I traced the problem to the combined use of "write" and MPI_Type_size
in MPI_File_write. MPI_Type_size(MPI_SHORT_INT) returns 6 bytes, whereas
the struct occupies 8 bytes in memory because of the 2-byte gap. The
write function in ad_write.c then loses data, because the gaps are not
included in the calculation of the total data size to be written to
the file.
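
The effect can be reproduced without MPI. The sketch below is an assumption-laden simulation, not the code in ad_write.c: copy_as_contiguous, copy_packed, packed_b and PACKED_SIZE are hypothetical names, and memcpy into a byte buffer stands in for the write call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct short_int {
    short a;
    int b;
};

/* Bytes of useful data per element (what MPI_Type_size would count). */
#define PACKED_SIZE (sizeof(short) + sizeof(int))

/* Faulty path: treat the buffer as contiguous and copy count * size raw
 * bytes. Padding is copied as if it were data and the tail is cut off. */
size_t copy_as_contiguous(const struct short_int *src, size_t count,
                          unsigned char *dst)
{
    size_t nbytes = count * PACKED_SIZE;
    assert(src != NULL && dst != NULL);
    memcpy(dst, src, nbytes);
    return nbytes;
}

/* Careful path: copy member by member, skipping the padding. */
size_t copy_packed(const struct short_int *src, size_t count,
                   unsigned char *dst)
{
    size_t off = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++) {
        memcpy(dst + off, &src[i].a, sizeof(short));
        off += sizeof(short);
        memcpy(dst + off, &src[i].b, sizeof(int));
        off += sizeof(int);
    }
    return off;
}

/* Read member b of element i back from a packed buffer. */
int packed_b(const unsigned char *buf, size_t i)
{
    int b;
    assert(buf != NULL);
    memcpy(&b, buf + i * PACKED_SIZE + sizeof(short), sizeof(int));
    return b;
}
```

Both functions produce the same number of bytes, but only copy_packed guarantees that every b value can be read back. Where the struct has padding, copy_as_contiguous stops short of the last member of the last element.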

This problem also occurs in the other I/O functions.
As far as I could find out, it does not seem to be present with
derived data types.

The question now is how to fix it:
i) Either the MPI standard is unclear on this point, and the data types
MPI_SHORT_INT, MPI_DOUBLE_INT, ... should not be allowed to be used
with structs of these types,
ii) or the implementation of the MPI_Type_size function has to be
modified to return the value of e.g. true_ub, which contains the
correct value,
iii) or the MPI_File_write function must not use the write function in
the "contiguous" way on the data, and should take care of the gaps.

Regards

Christoph Niethammer
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
----------------------------------------------------------------
Dipl.-Inf. Rainer Keller   http://www.hlrs.de/people/keller
HLRS                          Tel: ++49 (0)711-685 6 5858
Nobelstrasse 19                  Fax: ++49 (0)711-685 6 5832
70550 Stuttgart                    email: kel...@hlrs.de
Germany                             AIM/Skype:rusraink
