George,

The BLACS test code was actually calling MPI_Pack to pack the data into a contiguous buffer, and then calling MPI_Isend with a datatype of MPI_PACKED. So the convertor used by the PML/BTLs treated this as contiguous data, and allowed the PML/BTL to split it however they liked.
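To make the splitting concrete, here is a minimal sketch (not Open MPI code) that treats the packed message as a flat run of bytes and checks whether each fragment boundary falls on an element boundary. The function name `internal_boundaries` is hypothetical; the 16-byte element size and the 956/3604 fragment lengths are taken from the receiver-side trace quoted below, and the check assumes the packed stream holds nothing but back-to-back 16-byte elements.

```python
ELEMENT_SIZE = 16  # assumption: one long double, as on the machine in the trace


def internal_boundaries(fragment_lengths, element_size=ELEMENT_SIZE):
    """Return (offset, aligned) for every boundary between two fragments.

    `aligned` is True when the boundary offset in the packed stream is a
    multiple of the element size, i.e. no element is cut in two.
    """
    boundaries = []
    offset = 0
    for length in fragment_lengths[:-1]:
        offset += length
        boundaries.append((offset, offset % element_size == 0))
    return boundaries


# Fragment lengths from the two unpack calls in the trace.
print(internal_boundaries([956, 3604]))  # [(956, False)]
```

The boundary at byte 956 is 12 bytes past a multiple of 16, which matches the 12 pending bytes reported in the trace.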
Your fix should correct this, as a single convertor is used on each side for pack/unpack. This will also help with the buffered send case, which essentially did the same thing.

Thanks!
Tim

> I fixed the problem we had with BLACS. As it looked like everybody
> believed it was a data-type issue, I fixed it in the DDT engine. However,
> as I explained this morning on the phone conference (and nobody believed
> it), the problem was triggered by the way the convertor was used. For
> me it's an easy fix at the DDT layer that will allow BTL developers
> to pay less attention to the way they pack/unpack data ... but it is
> not the way the DDT was designed.
>
> Here is the explanation of what was wrong inside:
> BLACS creates a triangular matrix using an indexed type. The memory
> layout of this data-type is composed of several contiguous buffers
> with some gaps in between. The problem we had was the following:
> 1. On the sender side, pack was called with a buffer large enough to
>    hold all the data.
> 2. On the receiver side, unpack was called twice with different
>    iovecs. Even though the total length of the 2 iovecs was correct,
>    the first one happened to be too short, making the convertor stop
>    in the middle of a basic type. And that was not the way the
>    convertor was designed to work.
>
> Here is the output of the DDT engine for SM.
>
> First the pack side:
>
> [applebasket.cs.utk.edu:16760] ompi_convertor_generic_simple_pack ( 0xbfffc104, {0x2811430, 4560}, 1 )
> [applebasket.cs.utk.edu:16760] unpack start pos_desc 0 count_desc 6 disp 0
> stack_pos 0 pos_desc -1 count_desc 1 disp 0
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811430, 0xac650, 96 ) => space 4560
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811490, 0xac7e0, 112 ) => space 4464
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811500, 0xac970, 128 ) => space 4352
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811580, 0xacb00, 144 ) => space 4224
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811610, 0xacc90, 160 ) => space 4080
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28116b0, 0xace20, 176 ) => space 3920
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811760, 0xacfb0, 192 ) => space 3744
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811820, 0xad140, 208 ) => space 3552
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28118f0, 0xad2d0, 224 ) => space 3344
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28119d0, 0xad460, 240 ) => space 3120
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811ac0, 0xad5f0, 256 ) => space 2880
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811bc0, 0xad780, 272 ) => space 2624
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811cd0, 0xad910, 288 ) => space 2352
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811df0, 0xadaa0, 304 ) => space 2064
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811f20, 0xadc30, 320 ) => space 1760
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812060, 0xaddc0, 336 ) => space 1440
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28121b0, 0xadf50, 352 ) => space 1104
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812310, 0xae0e0, 368 ) => space 752
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812480, 0xae270, 384 ) => space 384
> [applebasket.cs.utk.edu:16760] pack end_loop count 1 stack_pos 0 pos_desc 19 disp 0 space 0
>
> As you can see there is one pack operation with a buffer of 4560
> bytes ... exactly the size of the whole data. Even though pack is
> careful not to cut a basic type in the middle, in this particular
> case it has enough room to do its job correctly.
>
> The receiver side looks a little bit different:
>
> [applebasket.cs.utk.edu:16758] ompi_convertor_generic_simple_unpack ( 0x280bf04, {0x229e15c, 956}, 1 )
> [applebasket.cs.utk.edu:16758] unpack start pos_desc 0 count_desc 6 disp 0
> stack_pos 0 pos_desc -1 count_desc 1 disp 0
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac650, 0x229e15c, 96 ) => space 956
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac7e0, 0x229e1bc, 112 ) => space 860
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac970, 0x229e22c, 128 ) => space 748
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacb00, 0x229e2ac, 144 ) => space 620
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacc90, 0x229e33c, 160 ) => space 476
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xace20, 0x229e3dc, 176 ) => space 316
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacfb0, 0x229e48c, 128 ) => space 140
> [applebasket.cs.utk.edu:16758] Losing 12 bytes !!!
> [applebasket.cs.utk.edu:16758] unpack save stack stack_pos 1 pos_desc 6 count_desc 4 disp 128
> [applebasket.cs.utk.edu:16758] ompi_convertor_generic_simple_unpack ( 0x280bf04, {0x229e158, 3604}, 1 )
> [applebasket.cs.utk.edu:16758] unpack start pos_desc 6 count_desc 4 disp 128
> stack_pos 0 pos_desc -1 count_desc 1 disp 0
> [applebasket.cs.utk.edu:16758] unpack pending from the last unpack 12 out of 16 bytes
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xad030, 0x280bf4c, 16 ) => space 16
> ... (skipped)
>
> We can see the trace of 2 unpack operations, one with a size of 956
> bytes and the other with 3604. In the middle of the trace above you
> can notice the "Losing 12 bytes !!!" message. The basic type here is
> a long double (16 bytes on this machine), so we definitely stopped in
> the middle of a basic type.
>
> A correct usage of the convertor could prevent such problems.
> Anyway, now the convertor will remember this kind of error and will
> automatically correct it (the cost is just an if in the critical
> path and some extra memory in the convertor struct).
>
> george.
>
> "Half of what I say is meaningless; but I say it so that the other
> half may reach you"
> Kahlil Gibran
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
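
For anyone following along, the "remember and correct" behavior described above can be sketched as a small simulation. This is only an illustration of the idea, not the convertor's actual implementation: `CarryOverUnpacker` and `unpack_in_fragments` are hypothetical names, the block lengths (96 to 384 bytes in steps of 16) and the 956/3604 split come from the traces, and the 400-byte stride is read off the differences between the memcpy addresses (0xac7e0 - 0xac650 = 0x190).

```python
ELEMENT_SIZE = 16  # long double size on the machine in the trace
STRIDE = 400       # distance between block starts, read off the memcpy addresses
# 19 contiguous blocks of 96, 112, ..., 384 bytes, as in the pack-side trace.
BLOCKS = [(STRIDE * i, 96 + ELEMENT_SIZE * i) for i in range(19)]
PACKED_SIZE = sum(length for _, length in BLOCKS)  # 4560 bytes
DEST_SIZE = BLOCKS[-1][0] + BLOCKS[-1][1]


class CarryOverUnpacker:
    """Unpacks a packed byte stream into (displacement, length) blocks.

    Bytes of an incomplete element are kept in `pending` and prepended to
    the next fragment, so a fragment may end anywhere in the stream.
    """

    def __init__(self, dest, blocks, element_size):
        self.dest = dest
        self.blocks = blocks
        self.element_size = element_size
        self.block_index = 0   # block currently being filled
        self.block_offset = 0  # bytes already written into that block
        self.pending = b""     # tail of an element cut by a fragment boundary

    def unpack(self, fragment):
        data = self.pending + bytes(fragment)
        # Only whole elements are copied; the remainder waits for more data.
        usable = len(data) - len(data) % self.element_size
        self.pending = data[usable:]
        pos = 0
        while pos < usable:
            disp, length = self.blocks[self.block_index]
            n = min(length - self.block_offset, usable - pos)
            start = disp + self.block_offset
            self.dest[start:start + n] = data[pos:pos + n]
            pos += n
            self.block_offset += n
            if self.block_offset == length:
                self.block_index += 1
                self.block_offset = 0
        return usable


def unpack_in_fragments(packed, fragment_lengths):
    """Feed `packed` to a fresh unpacker in pieces of the given lengths."""
    dest = bytearray(DEST_SIZE)
    unpacker = CarryOverUnpacker(dest, BLOCKS, ELEMENT_SIZE)
    offset = 0
    for length in fragment_lengths:
        unpacker.unpack(packed[offset:offset + length])
        offset += length
    return dest, unpacker.pending


packed = bytes(i % 251 for i in range(PACKED_SIZE))
one_shot, _ = unpack_in_fragments(packed, [PACKED_SIZE])
two_step, leftover = unpack_in_fragments(packed, [956, 3604])
print(one_shot == two_step, len(leftover))  # True 0
```

In this sketch the first 956-byte fragment yields 944 usable bytes: six full blocks (816 bytes) plus 128 bytes of the seventh, with 12 bytes held back. Those numbers line up with the `memcpy( ..., 128 )`, `disp 128` and "Losing 12 bytes" entries in the receiver trace. The carry-over costs one extra length check per call and a small buffer, which is in the spirit of the "an if in the critical path and some extra memory" cost mentioned above.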