On Mon, Aug 11, 2014 at 10:41 AM, Rob Latham <r...@mcs.anl.gov> wrote:

>
>
> On 08/11/2014 08:54 AM, George Bosilca wrote:
>
>> The patch related to ticket #4597 is zapping only the datatypes where
>> the user explicitly provided a zero count.
>>
>> We can argue about LB and UB, but I have a hard time understanding the
>> rationale for allowing a zero count only for LB and UB. If it is
>> required by the standard, we can easily support it (the line in the
>> patch just has to move a little further down in the code).
>>
>
> ROMIO's type flattening code is primitive: the zero-length blocks for UB
> and LB were the only way to encode the extent of the type, without calling
> back into the MPI implementation's type-inquiry routines.
>
>
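For illustration, a minimal sketch of the idiom Rob describes: in the
old MPI-1 style, zero-count MPI_LB/MPI_UB blocks inside an
MPI_Type_struct are what record the type's bounds, so a flattener
cannot drop zero-count blocks unconditionally. The 16-byte extent below
is an invented example, not a value from this thread:

    #include <mpi.h>

    /* Build an int type whose extent is padded to 16 bytes using the
     * deprecated MPI-1 markers; both marker blocks have count 0. */
    static MPI_Datatype padded_int(void)
    {
        int          blocklens[3] = { 0, 1, 0 };
        MPI_Aint     displs[3]    = { 0, 0, 16 };
        MPI_Datatype types[3]     = { MPI_LB, MPI_INT, MPI_UB };
        MPI_Datatype newtype;

        MPI_Type_struct(3, blocklens, displs, types, &newtype);
        MPI_Type_commit(&newtype);
        return newtype;   /* extent is 16 bytes, not sizeof(int) */
    }
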
> *I* don't care how OpenMPI deals with UB and LB. It was *you* who
> suggested one might need to look a bit more closely at how OpenMPI's
> type processing handles those markers:
>
> http://www.open-mpi.org/community/lists/users/2014/05/24325.php


I have absolutely no issue with this approach. I was basically trying to
figure out whether the ticket was closed too early.

  George.



>
>
> ==rob
>
>
>>    George.
>>
>>
>>
>> On Mon, Aug 11, 2014 at 9:44 AM, Rob Latham <r...@mcs.anl.gov> wrote:
>>
>>
>>
>>     On 08/10/2014 07:32 PM, Mohamad Chaarawi wrote:
>>
>>         Update:
>>
>>         George suggested that I try the 1.8.2 rc3, and that one
>>         resolves the hindexed_block segfault that I was seeing with
>>         ompi. The I/O part now works with ompio, but needs the patches
>>         from Rob in ROMIO to work correctly.
>>
>>         The second issue, with collective I/O where some processes
>>         participate with 0-sized datatypes created with hindexed and
>>         hvector, is still unresolved.
>>
>>
>>     I think this ticket was closed a bit too early:
>>
>>     https://svn.open-mpi.org/trac/ompi/ticket/4597
>>
>>     I don't know OpenMPI's type processing at all, but if it's like
>>     ROMIO, you cannot simply zap blocks of zero length: some
>>     zero-length blocks indicate the upper bound and lower bound.
>>
>>     Or maybe it's totally unrelated. There was a flurry of datatype
>>     bugs reported against both MPICH and OpenMPI in May of this year,
>>     and I am sure I am confusing several issues.
>>
>>     ==rob
>>
>>
>>         Thanks,
>>         Mohamad
>>
>>         On 8/6/2014 11:50 AM, Mohamad Chaarawi wrote:
>>
>>             Hi all,
>>
>>             I'm seeing some problems with derived datatype construction
>>             and I/O with OpenMPI 1.8.1.
>>
>>             I have replicated them in the attached program.
>>             The first issue is that MPI_Type_create_hindexed_block()
>>             always segfaults. Usage of this routine is commented out in
>>             the program. (I have a separate email thread with George
>>             and Edgar about this.)
>>
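For context, the failing call looks roughly like the sketch below.
NUM_BLOCKS appears later in this mail; the block length, displacements,
and the helper function are placeholders, not the attached program:

    #include <mpi.h>

    #define NUM_BLOCKS 4   /* placeholder value */

    /* NUM_BLOCKS blocks of 2 ints each at absolute byte displacements,
     * built with the MPI-3 routine that segfaults in OpenMPI 1.8.1. */
    static void build_filetype(MPI_Datatype *filetype)
    {
        MPI_Aint displs[NUM_BLOCKS] = { 0, 64, 128, 192 };

        MPI_Type_create_hindexed_block(NUM_BLOCKS, 2, displs, MPI_INT,
                                       filetype);
        MPI_Type_commit(filetype);
    }
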
>>             The other issue is a segfault in MPI_File_set_view when
>>             ranks > 0 create the derived datatypes with count 0 and
>>             rank 0 creates a derived datatype of count NUM_BLOCKS. If I
>>             use MPI_Type_contiguous to create the 0-sized file and
>>             memory datatypes (instead of hindexed and hvector), it
>>             works fine.
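A sketch of that pattern, with placeholder block sizes, displacements,
and helper name: rank 0 builds a real hindexed type, every other rank
builds an empty one, and all ranks then hand the result to the
collective MPI_File_set_view:

    #include <mpi.h>

    #define NUM_BLOCKS 4   /* placeholder value */

    static MPI_Datatype make_filetype(int rank)
    {
        int          blocklens[NUM_BLOCKS] = { 1, 1, 1, 1 };
        MPI_Aint     displs[NUM_BLOCKS]    = { 0, 16, 32, 48 };
        /* ranks > 0 contribute an empty type; a count of 0 is legal */
        int          count = (rank == 0) ? NUM_BLOCKS : 0;
        MPI_Datatype ftype;

        MPI_Type_create_hindexed(count, blocklens, displs, MPI_INT,
                                 &ftype);
        /* workaround noted above: building the empty type with
         * MPI_Type_contiguous(0, MPI_INT, &ftype) avoids the crash */
        MPI_Type_commit(&ftype);
        return ftype;
    }
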
>>             To replicate, run the program with 2 or more procs:
>>
>>             mpirun -np 2 ./hindexed_io mpi_test_file
>>
>>             [jam:15566] *** Process received signal ***
>>             [jam:15566] Signal: Segmentation fault (11)
>>             [jam:15566] Signal code: Address not mapped (1)
>>             [jam:15566] Failing at address: (nil)
>>             [jam:15566] [ 0] [0xfcd440]
>>             [jam:15566] [ 1] /scr/chaarawi/install/ompi/lib/libmpi.so.1(ADIOI_Flatten_datatype+0x17a)[0xc80f2a]
>>             [jam:15566] [ 2] /scr/chaarawi/install/ompi/lib/libmpi.so.1(ADIO_Set_view+0x1c1)[0xc72a6d]
>>             [jam:15566] [ 3] /scr/chaarawi/install/ompi/lib/libmpi.so.1(mca_io_romio_dist_MPI_File_set_view+0x69b)[0xc8d11b]
>>             [jam:15566] [ 4] /scr/chaarawi/install/ompi/lib/libmpi.so.1(mca_io_romio_file_set_view+0x7c)[0xc4f7c5]
>>             [jam:15566] [ 5] /scr/chaarawi/install/ompi/lib/libmpi.so.1(PMPI_File_set_view+0x1e6)[0xb32f7e]
>>             [jam:15566] [ 6] ./hindexed_io[0x8048aa6]
>>             [jam:15566] [ 7] /lib/libc.so.6(__libc_start_main+0xdc)[0x7d5ebc]
>>             [jam:15566] [ 8] ./hindexed_io[0x80487e1]
>>             [jam:15566] *** End of error message ***
>>
>>             If I use --mca io ompio with 2 or more procs, the program
>>             segfaults in write_at_all (regardless of which routine is
>>             used to construct the 0-sized datatype):
>>
>>             [jam:15687] *** Process received signal ***
>>             [jam:15687] Signal: Floating point exception (8)
>>             [jam:15687] Signal code: Integer divide-by-zero (1)
>>             [jam:15687] Failing at address: 0x3e29b7
>>             [jam:15687] [ 0] [0xe56440]
>>             [jam:15687] [ 1] /scr/chaarawi/install/ompi/lib/libmpi.so.1(ompi_io_ompio_set_explicit_offset+0x9d)[0x3513bc]
>>             [jam:15687] [ 2] /scr/chaarawi/install/ompi/lib/libmpi.so.1(ompio_io_ompio_file_write_at_all+0x3e)[0x35869a]
>>             [jam:15687] [ 3] /scr/chaarawi/install/ompi/lib/libmpi.so.1(mca_io_ompio_file_write_at_all+0x66)[0x358650]
>>             [jam:15687] [ 4] /scr/chaarawi/install/ompi/lib/libmpi.so.1(MPI_File_write_at_all+0x1b3)[0x1f46f3]
>>             [jam:15687] [ 5] ./hindexed_io[0x8048b07]
>>             [jam:15687] [ 6] /lib/libc.so.6(__libc_start_main+0xdc)[0x7d5ebc]
>>             [jam:15687] [ 7] ./hindexed_io[0x80487e1]
>>             [jam:15687] *** End of error message ***
>>
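For reference, the two failure sites correspond roughly to the sequence
sketched below; the function and offsets are placeholders, not the
attached program:

    #include <mpi.h>

    static void write_with_view(MPI_Comm comm, const char *path,
                                MPI_Datatype memtype,
                                MPI_Datatype filetype,
                                const void *buf, int count)
    {
        MPI_File fh;

        MPI_File_open(comm, (char *)path,
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        /* ROMIO: segfaults here when some ranks pass a zero-count
         * hindexed/hvector filetype */
        MPI_File_set_view(fh, 0, MPI_INT, filetype, "native",
                          MPI_INFO_NULL);
        /* ompio: integer divide-by-zero here regardless of how the
         * zero-sized datatype was built */
        MPI_File_write_at_all(fh, 0, (void *)buf, count, memtype,
                              MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
    }
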
>>             If I use mpich 3.1.2, I don't see those issues.
>>
>>             Thanks,
>>             Mohamad
>>
>>
>>
>>
>>     --
>>     Rob Latham
>>     Mathematics and Computer Science Division
>>     Argonne National Lab, IL USA
>>
>>
>>
>>
>>
>>
>>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>
