Hi George,

good that you come to the same conclusion regarding ROMIO using
MPI_Type_size internally...
So, taking the comment in iscontig.c ,-]
  /* This function needs more work. It should check for contiguity
     in other cases as well. */
we can either mail the ROMIO list or have a specialized version of
ADIOI_Datatype_iscontig for ompi ,-]

Either way, the mpi_test_suite is sane in that regard.

Thanks,
Rainer

On Friday 08 February 2008 18:22, George Bosilca wrote:
> MPI_Type_size is supposed to return only the size of useful data,
> which apparently it does (MPI_SHORT_INT is 6 bytes). What I think
> happens is that the MPI_SHORT_INT type is a predefined one, but it's
> a really strange predefined type. It's one of the few that are not
> contiguous. The problem seems to come from the fact that
> MPI_File_write does a contiguous write for the predefined data types,
> making the assumption that they are all contiguous.
>
> I tracked the problem down to the romio/adio/common/is_contig.c file.
> For Open MPI the last #else branch is used. The first case in the
> switch checks for MPI_COMBINER_NAMED (which is what an MPI is
> supposed to return for predefined data types) and sets the flag to 1
> (which means contiguous). This is obviously wrong for MPI_SHORT_INT.
> It really looks like a ROMIO problem, so I guess this email should be
> redirected to their mailing list.
>
> Thanks,
> george.
>
> On Feb 8, 2008, at 12:50 PM, Christoph Niethammer wrote:
> > Hello!
> >
> > I tested openMPI at HLRS for some time without detecting new
> > problems in the implementation, but now I noticed some awful ones
> > with MPI_Write which can lead to data loss:
> >
> > When creating a struct for a mixed datatype like
> >
> > struct {
> >     short a;
> >     int b;
> > }
> >
> > the C compiler introduces a gap of 2 bytes in the data
> > representation for this type due to the 4-byte alignment of the
> > integer on 32-bit systems.
> >
> > If I now try to use MPI_File_write to write these data to a file
> > and use MPI_SHORT_INT as mpi_datatype, this leads to data loss.
> >
> > I located the problem in the combined use of "write" and
> > MPI_Type_size in MPI_File_write.
> > MPI_Type_size(MPI_SHORT_INT) returns 6 bytes, whereas the struct
> > uses 8 bytes in memory because of the 2-byte gap. The write
> > function in ad_write.c then loses data because the gaps are not
> > included in the calculation of the complete data size to be written
> > to the file.
> >
> > This problem also occurs in the other io functions.
> > As far as I could find out, the problem does not seem to be present
> > with derived data types.
> >
> > The question now is how to "fix" it:
> > i) Either the MPI standard is not clear on this point and the data
> > types MPI_SHORT_INT, MPI_DOUBLE_INT, ... should be forbidden to be
> > used with structs of these types,
> > ii) or the implementation of the MPI_Type_size function has to be
> > modified to return the value of e.g. true_ub, which contains the
> > correct value,
> > iii) or the MPI_File_write function must not use the write function
> > in the "contiguous" way on the data and should take care of the
> > gaps.
> >
> > Regards
> >
> > Christoph Niethammer
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
----------------------------------------------------------------
Dipl.-Inf. Rainer Keller   http://www.hlrs.de/people/keller
 HLRS          Tel: ++49 (0)711-685 6 5858
 Nobelstrasse 19                 Fax: ++49 (0)711-685 6 5832
 70550 Stuttgart                    email: kel...@hlrs.de
 Germany                             AIM/Skype:rusraink