Pascal,

Patch accepted. In the trunk r22864. I'll create CMR for the 1.4 and 1.5 and 
push them there asap.

  Thanks,
    george.

On Mar 22, 2010, at 06:28 , Pascal Deveze wrote:

> George,
> 
> You are right.
> - I agree with you: The Open MPI ompi_datatype_is_contigous_memory runs 
> correctly.
> - The problem comes with ROMIO: They need a function that returns true if the 
> content is contiguous AND the content start at the pointer position 
> (displacement zero).
> - MPI Datatypes are a fanny world  ;-)
> 
> If you take a look at source ompi/mca/io/romio/romio/adio/common/iscontig.c 
> you will see:
> 
> #if (defined(MPICH) || defined(MPICH2))
> /* MPICH2 also provides this routine */
> void MPIR_Datatype_iscontig(MPI_Datatype datatype, int *flag);
> 
> void ADIOI_Datatype_iscontig(MPI_Datatype datatype, int *flag)
> {
>  MPIR_Datatype_iscontig(datatype, flag);
> 
>  /* if it is MPICH2 and the datatype is reported as contigous,
>     check if the true_lb is non-zero, and if so, mark the
>     datatype as noncontiguous */
> #ifdef MPICH2
>  if (*flag) {
>      MPI_Aint true_extent, true_lb;
> 
>      MPI_Type_get_true_extent(datatype, &true_lb, &true_extent);
> 
>      if (true_lb > 0)
>          *flag = 0;
>  }
> #endif
> }
> #elif ....
> 
> My proposition is just to take these 12 last lines and put them into 
> ompi/mca/io/romio/src/io_romio_module.c to
> conform to what ROMIO wants.
> If my proposition is accepted, just take my patch:
> 
> diff -u ompi/mca/io/romio/src/io_romio_module.c 
> ompi/mca/io/romio/src/io_romio_module.c.OLD
> --- ompi/mca/io/romio/src/io_romio_module.c     2010-03-19 11:19:57.000000000 
> +0100
> +++ ompi/mca/io/romio/src/io_romio_module.c.OLD 2010-03-22 11:05:57.000000000 
> +0100
> @@ -133,12 +133,4 @@
>    * a count of 2 in order to get these gaps taken into acount.
>    */
>   *flag = ompi_datatype_is_contiguous_memory_layout(datatype, 2);
> -    if (*flag) {
> -        MPI_Aint true_extent, true_lb;
> -
> -        ompi_datatype_get_true_extent(datatype, &true_lb, &true_extent);
> -
> -        if (true_lb > 0)
> -            *flag = 0;
> -    }
> }
> 
> Pascal
> 
> 
> 
> On Mar 19, 2010, at 11:52, George Bosilca wrote:
>> 
>> Pascal,
>> 
>> I went inside the code, and I have to say it's a long tricky story. Let me 
>> try to sort it out:
>> 
>> - you create two types:
>>  - the indexed one containing just one element. This type is contiguous as 
>> there are no holes around the data, i.e. the size and the extent of this 
>> datatype are equal.
>>  - the resized one. This type resize the previous one by adding a hole in 
>> the beginning, thus it is not a contiguous type, even if the memory layout 
>> is in a single piece.
>> 
>> Now, let's go one step up in the ROMIO code attached to your previous email. 
>> You get the content of the main type, in this example RESIZED, and them the 
>> content of the internal type which is TYPE indexed. When you look if the 
>> internal type is contiguous, Open MPI answer yes as the indexed type has its 
>> extent equal to its size. While this is true, the fact that this type is 
>> resized make it non-contiguous, as by resizing it you explicitly alter the 
>> lower bound.
>> 
>> The fix you proposed in your last email (i.e. modify the ADIO is contig 
>> function) is a workaround. Let me think a little bit more about this. I'll 
>> be in right here, please read below...
>> 
>> If I read the MPI 2-2 standard in the Chapter about the Datatypes (page 87), 
>> at the section about the MPI_Type_indexed. I have the original typemap, i.e. 
>> the one for the MPI_CHAR type (char,0). When I create the indexed type I get 
>> the typemap (char, 1). Based on the definition of lower and upper bounds on 
>> the page 100, lb is equal to 1 and ub is equal to 2, which make the extent 
>> of the indexed type equal to 1. So far so good. Now let's look what the MPI 
>> standard says about having multiple of such datatype in an array, aka 
>> MPI_Type_contiguous based on your MPI_Type_indexed. As a reminder you 
>> indexed type has the typemap (char, 1) and the extent 1. Based on the 
>> definition of MPI_Type_contiguous on page 84, the typemap of the 
>> MPI_Type_contiguous( 4, your_indexed_type) is: (char,1), (char, 2), (char, 
>> 3), (char, 4) which as far as I can say it is __contiguous__. So the Open 
>> MPI ompi_datatype_is_contigous_memory correctly returns the fact that the 
>> resulting datatype even with a co!
>> unt greater than 1 is contiguous. Welcome to the fancy world of MPI 
>> datatypes.
>> 
>> Therefore, I think the Open MPI functions __really__ do the correct thing, 
>> and the problem is in the COMBINER_RESIZED code. As the datatype is 
>> explicitly resized by the user, you should not look if the previous type 
>> (types[0]) is contiguous or not, it doesn't matter as it was clearly 
>> resized. I wonder what the ROMIO developers had in mind for the 
>> ADIOI_Datatype_iscontig function, but it doesn't look like they just want to 
>> know if the content is contiguous. I guess this function return true if the 
>> content is contiguous AND the content start at the pointer position 
>> (displacement zero).
>> 
>>  george.
>> 
>> 
>> On Mar 19, 2010, at 06:14 , Pascal Deveze wrote:
>> 
>> 
>>> Hi George,
>>> 
>>> I went further on my investigations, and I found a solution.
>>> 
>>> ADIOI_Datatype_iscontig is defined in the file 
>>> ompi/mca/io/romio/src/io_romio_module.c as:
>>> 
>>> void ADIOI_Datatype_iscontig(MPI_Datatype datatype, int *flag)
>>> {
>>>  /*
>>>   * Open MPI contiguous check return true for datatype with
>>>   * gaps in the beginning and at the end. We have to provide
>>>   * a count of 2 in order to get these gaps taken into acount.
>>>   */
>>>  *flag = ompi_datatype_is_contiguous_memory_layout(datatype, 2);
>>> }
>>> 
>>> It is clearly written here that the gaps should be taken into account with 
>>> a count of 2. But that's not everytime the case.
>>> 
>>> Your proposition is to modify ROMIO code.
>>> So, I propose to fix ADIOI_Datatype_iscontig and add the following code 
>>> after the call
>>> to ompi_datatype_is_contiguous_memory_layout():
>>> 
>>>  if (*flag) {
>>>      MPI_Aint true_extent, true_lb;
>>> 
>>>      ompi_datatype_get_true_extent(datatype, &true_lb, &true_extent);
>>> 
>>>      if (true_lb > 0)
>>>          *flag = 0;
>>>  }
>>> 
>>> Regards,
>>> 
>>> Pascal
>>> 
>>> On Mar 18, 2010, at 13:24, George Bosilca wrote:
>>>  
>>>> We will disagree on that, but your datatype is contiguous. It doesn't 
>>>> matter that there are gaps in the beginning and at the end, as long as you 
>>>> only send one such datatype the real data that has to go over the network 
>>>> _is_ contiguous. And this is what the Open MPI datatype engine is 
>>>> reporting back.
>>>> 
>>>> Apparently, ROMIO expect a contiguous datatype to start from the position 
>>>> 0 relative to the beginning of the user buffer. I don't see why they have 
>>>> such a restrictive view, but I guess the original MPICH datatype engine 
>>>> was not able to distinguish between gaps in the middle and gaps at the 
>>>> beginning and the end of the datatype.
>>>> 
>>>> I don't see how to fix that in ROMIO code. But in case you plan to fix it, 
>>>> the correct solution is to retrieve the true lower bound of the datatype 
>>>> in the contiguous case and add it to st_offset.
>>>> 
>>>> george.
>>>> 
>>>> On Mar 18, 2010, at 12:27 , Pascal Deveze wrote:
>>>> 
>>>>    
>>>>> Hi all,
>>>>> 
>>>>> Sorry, I missed my porting from MPICH2 to OpenMPI concerning the file     
>>>>>     
>>>> romio/adio/comm/flatten.c
>>>>    
>>>>> (flatten.c in OpenMPI does not support MPI_COMBINER_RESIZED).
>>>>> 
>>>>> Here is the diff:
>>>>> 
>>>>> diff -u flatten.c flatten.c.old
>>>>> --- flatten.c 2010-03-18 17:07:43.000000000 +0100
>>>>> +++ flatten.c.old 2010-03-18 17:14:04.000000000 +0100
>>>>> @@ -525,44 +525,6 @@
>>>>> }
>>>>> break;
>>>>> - case MPI_COMBINER_RESIZED:
>>>>> - /* This is done similar to a type_struct with an lb, datatype, ub */
>>>>> -
>>>>> - /* handle the Lb */
>>>>> - j = *curr_index;
>>>>> - flat->indices[j] = st_offset + adds[0];
>>>>> - flat->blocklens[j] = 0;
>>>>> -
>>>>> - (*curr_index)++;
>>>>> -
>>>>> - /* handle the datatype */
>>>>> -
>>>>> - MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
>>>>> - &old_ntypes, &old_combiner);
>>>>> - ADIOI_Datatype_iscontig(types[0], &old_is_contig);
>>>>> -
>>>>> - if ((old_combiner != MPI_COMBINER_NAMED) && (!old_is_contig)) {
>>>>> - ADIOI_Flatten(types[0], flat, st_offset+adds[0], curr_index);
>>>>> - }
>>>>> - else {
>>>>> - /* current type is basic or contiguous */
>>>>> - j = *curr_index;
>>>>> - flat->indices[j] = st_offset;
>>>>> - MPI_Type_size(types[0], (int*)&old_size);
>>>>> - flat->blocklens[j] = old_size;
>>>>> -
>>>>> - (*curr_index)++;
>>>>> - }
>>>>> -
>>>>> - /* take care of the extent as a UB */
>>>>> - j = *curr_index;
>>>>> - flat->indices[j] = st_offset + adds[0] + adds[1];
>>>>> - flat->blocklens[j] = 0;
>>>>> -
>>>>> - (*curr_index)++;
>>>>> -
>>>>> - break;
>>>>> -
>>>>> default:
>>>>> /* TODO: FIXME (requires changing prototypes to return errors...) */
>>>>> FPRINTF(stderr, "Error: Unsupported datatype passed to ADIOI_Flatten\n");
>>>>> @@ -827,29 +789,6 @@
>>>>> }
>>>>> }
>>>>> break;
>>>>> -
>>>>> - case MPI_COMBINER_RESIZED:
>>>>> - /* treat it as a struct with lb, type, ub */
>>>>> -
>>>>> - /* add 2 for lb and ub */
>>>>> - (*curr_index) += 2;
>>>>> - count += 2;
>>>>> -
>>>>> - /* add for datatype */
>>>>> - MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
>>>>> - &old_ntypes, &old_combiner);
>>>>> - ADIOI_Datatype_iscontig(types[0], &old_is_contig);
>>>>> -
>>>>> - if ((old_combiner != MPI_COMBINER_NAMED) && (!old_is_contig)) {
>>>>> - count += ADIOI_Count_contiguous_blocks(types[0], curr_index);
>>>>> - }
>>>>> - else {
>>>>> - /* basic or contiguous type */
>>>>> - count++;
>>>>> - (*curr_index)++;
>>>>> - }
>>>>> - break;
>>>>> -
>>>>> default:
>>>>> /* TODO: FIXME */
>>>>> FPRINTF(stderr, "Error: Unsupported datatype passed to         
>>>> ADIOI_Count_contiguous_blocks, combiner = %d\n", combiner);
>>>>    
>>>>> Regards,
>>>>> 
>>>>> Pascal
>>>>> 
>>>>> Pascal Deveze a ?crit :
>>>>>      
>>>>>> Hi all,
>>>>>> 
>>>>>> I use a very simple datatype defined as follow:
>>>>>> lng[0]= 1;
>>>>>> dsp[0]= 1;
>>>>>> err=MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
>>>>>> err=MPI_Type_create_resized(offtype, 0, 2, &filetype);
>>>>>> MPI_Type_commit(&filetype);
>>>>>> 
>>>>>> This datatype consists of a hole (of length 1 char) followed by a char.
>>>>>> 
>>>>>> The datatype with hole at the beginning is not correctly handled by      
>>>>>>      
>>>> ROMIO integrated in OpenMPI (I tried with MPICH2 and it worked fine).
>>>>    
>>>>>> You will see bellow a program to reproduce the problem.
>>>>>> 
>>>>>> After investigations, I see that the difference between OpenMPI and      
>>>>>>      
>>>> MPICH appears at line 542 in the file romio/adio/comm/flatten.c:
>>>>    
>>>>>> case MPI_COMBINER_RESIZED:
>>>>>> /* This is done similar to a type_struct with an lb, datatype, ub */
>>>>>> 
>>>>>> /* handle the Lb */
>>>>>> j = *curr_index;
>>>>>> flat->indices[j] = st_offset + adds[0];
>>>>>> flat->blocklens[j] = 0;
>>>>>> 
>>>>>> (*curr_index)++;
>>>>>> 
>>>>>> /* handle the datatype */
>>>>>> 
>>>>>> MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
>>>>>> &old_ntypes, &old_combiner);
>>>>>> ADIOI_Datatype_iscontig(types[0], &old_is_contig); <========== ligne 542
>>>>>> 
>>>>>> For MPICH2, the datatype is not contiguous, but it is for OpenMPI.       
>>>>>>     
>>>> The routine ADIOI_Datatype_iscontig is
>>>>    
>>>>>> quite different in OpenMPI because the datatypes are handled very        
>>>>>>    
>>>> differently. If I reset old_is_contig just after
>>>>    
>>>>>> line 542, the problem disappears (Of course, this is not a solution).
>>>>>> 
>>>>>> I am not able to propose a right solution. Can somebody help ?
>>>>>> 
>>>>>> Pascal
>>>>>> 
>>>>>> ============ Program to reproduce the problem ========
>>>>>> #include <stdio.h>
>>>>>> #include "mpi.h"
>>>>>> 
>>>>>> char filename[256]="VIEW_TEST";
>>>>>> char buffer[100];
>>>>>> int err, i, myid, dsp[3], lng[3];
>>>>>> MPI_Status status;
>>>>>> MPI_File fh;
>>>>>> MPI_Datatype filetype, offtype;
>>>>>> MPI_Aint lb, extent;
>>>>>> 
>>>>>> int main(int argc, char **argv) {
>>>>>> 
>>>>>> MPI_Init(&argc, &argv);
>>>>>> MPI_Comm_rank(MPI_COMM_WORLD, &myid);
>>>>>> for (i=0; i<sizeof(buffer); i++) buffer[i] = i;
>>>>>> 
>>>>>> if (!myid) {
>>>>>> MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_CREATE |           
>>>> MPI_MODE_RDWR , MPI_INFO_NULL, &fh);
>>>>    
>>>>>> MPI_File_write(fh, buffer, sizeof(buffer), MPI_CHAR, &status);
>>>>>> MPI_File_close(&fh);
>>>>>> 
>>>>>> lng[0]= 1;
>>>>>> dsp[0]= 1;
>>>>>> MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
>>>>>> MPI_Type_create_resized(offtype, 0, 2, &filetype);
>>>>>> MPI_Type_commit(&filetype);
>>>>>> 
>>>>>> MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_RDONLY ,           
>>>> MPI_INFO_NULL, &fh);
>>>>    
>>>>>> MPI_File_set_view(fh, 0, MPI_CHAR, filetype,"native", MPI_INFO_NULL);
>>>>>> MPI_File_read(fh, buffer, 5, MPI_CHAR, &status);
>>>>>> 
>>>>>> printf("Data: ");
>>>>>> for (i=0 ; i<5 ; i++) printf(" %x ", buffer[i]);
>>>>>> if (buffer[1] != 3) printf("\n =======> test KO : buffer[1]=%d           
>>>> instead of %d \n", buffer[1], 4);
>>>>    
>>>>>> else printf("\n =======> test OK\n");
>>>>>> MPI_Type_free(&filetype);
>>>>>> MPI_File_close(&fh);
>>>>>> }
>>>>>> MPI_Barrier(MPI_COMM_WORLD);
>>>>>> MPI_Finalize();
>>>>>> }
>>>>>> ============ The result of the program with MPICH2 ========
>>>>>> Data: 1 3 5 7 9
>>>>>> =======> test OK
>>>>>> 
>>>>>> ============ The result of the program with OpenMPI ========
>>>>>> Data: 0 2 4 6 8
>>>>>> =======> test KO : buffer[1]=2 instead of 4
>>>>>> 
>>>>>> Comment: Only the first hole is ommited.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>          
>>>>      
>>> __
> 
> 
> 
> 


Reply via email to