[O-MPI devel] Open MPI @SC2005

2005-11-08 Thread Jeff Squyres

Greetings all!

The Open MPI Team will be at SC this year.  If you're attending, please 
feel free to drop by any of the following booths to say hi and discuss 
Open MPI's current status, future directions, and opportunities for 
collaboration:


- Indiana University (booth #202)
- Los Alamos National Laboratory (booth #312)
- Oak Ridge National Laboratory (booth #2226, look for the U. Tennessee section)
- HLRS (booth #2209, #2239)

I am co-hosting a BOF entitled "Why MPI Makes You Scream!  And how can 
we simplify parallel debugging?" at 12:15pm on Thursday (see 
http://sc05.supercomputing.org/schedule/event_detail.php?evid=5240).  
Please feel free to stop by and join in the discussion.


We're also giving several short talks about Open MPI at the Indiana 
University booth:

"Introduction and Overview of Open MPI"
Jeff Squyres (Indiana University)
Mon 8:00pm / Tue 2:00pm

"Advanced Point-to-Point Architecture in Open MPI"
Tim Woodall (Los Alamos National Laboratory)
Mon 8:15pm / Tue 2:15pm

"Datatypes, Fault Tolerance, and Other Cool Stuff in Open MPI"
George Bosilca (University of Tennessee)
Mon 8:30pm / Tue 2:30pm

"Tuning Collective Communication: Managing the Choices"
Graham Fagg (University of Tennessee)
Mon 8:45pm / Tue 2:45pm

Finally, I'll also be giving a short talk about Open MPI in conjunction 
with AIST at their booth (#723) at 3pm on Wednesday.


Hope to see you there!

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/



[O-MPI devel] data-type engine

2005-11-08 Thread George Bosilca
I fixed the problem we had with BLACS. Since everybody seemed to
believe it was a data-type issue, I fixed it in the DDT engine. However,
as I explained this morning on the phone conference (and nobody believed
it), the problem was triggered by the way the convertor was used. For
me it's an easy fix at the DDT layer that will allow BTL developers
to pay less attention to the way they pack/unpack data ... but it is
not the way the DDT was designed.


Here is an explanation of what went wrong internally:
BLACS creates a triangular matrix using an indexed type. The memory
layout of this data-type consists of several contiguous buffers
with some gaps in between. The problem we had was the following:
1. On the sender side, pack was called with a buffer large enough to
hold all the data.
2. On the receiver side, unpack was called twice with different
iovecs. Even though the total length of the 2 iovecs was correct,
the first one happened to be too short, forcing the convertor to
stop in the middle of a basic type. And that was not the way the
convertor was designed to work.
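
The failure mode described above can be illustrated with a toy model in
Python. This is a sketch under stated assumptions, not Open MPI code: the
function names are hypothetical, the 16-byte element size and the
956/3604-byte split are taken from the receiver trace later in this
message, and the packed data is just a dummy byte pattern. It contrasts an
unpack that only handles whole elements per call with one that carries the
partial element over to the next call.

```python
# Toy model (NOT Open MPI code): unpack a packed byte stream that arrives in
# fragments when a fragment boundary falls inside a basic element.

ELEM = 16  # assumed basic-type size, from "12 out of 16 bytes" in the trace


def unpack_whole_elements_only(fragments):
    """Copy only whole elements from each fragment; any partial tail is dropped."""
    out = bytearray()
    for frag in fragments:
        usable = len(frag) - len(frag) % ELEM
        out += frag[:usable]
    return bytes(out)


def unpack_with_carry(fragments):
    """Keep the partial tail of a fragment and complete it with the next one."""
    out = bytearray()
    pending = b""
    for frag in fragments:
        data = pending + frag
        usable = len(data) - len(data) % ELEM
        out += data[:usable]
        pending = data[usable:]
    return bytes(out), len(pending)


# Dummy payload of 4560 bytes, split as in the trace: 956 + 3604 bytes.
packed = bytes(i % 251 for i in range(4560))
fragments = [packed[:956], packed[956:]]

lossy = unpack_whole_elements_only(fragments)
complete, leftover = unpack_with_carry(fragments)

print("bytes dropped without carry-over:", len(packed) - len(lossy))
print("carry-over reconstructs the stream:", complete == packed)
print("bytes still pending after the last fragment:", leftover)
```

In this model the carry-over variant reconstructs the stream for any split
point, as long as the total length is a multiple of the element size.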


Here is the output of the DDT engine for SM.

First the pack side:

[applebasket.cs.utk.edu:16760] ompi_convertor_generic_simple_pack ( 0xbfffc104, {0x2811430, 4560}, 1 )
[applebasket.cs.utk.edu:16760] unpack start pos_desc 0 count_desc 6 disp 0
stack_pos 0 pos_desc -1 count_desc 1 disp 0
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811430, 0xac650, 96 ) => space 4560
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811490, 0xac7e0, 112 ) => space 4464
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811500, 0xac970, 128 ) => space 4352
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811580, 0xacb00, 144 ) => space 4224
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811610, 0xacc90, 160 ) => space 4080
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28116b0, 0xace20, 176 ) => space 3920
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811760, 0xacfb0, 192 ) => space 3744
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811820, 0xad140, 208 ) => space 3552
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28118f0, 0xad2d0, 224 ) => space 3344
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28119d0, 0xad460, 240 ) => space 3120
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811ac0, 0xad5f0, 256 ) => space 2880
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811bc0, 0xad780, 272 ) => space 2624
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811cd0, 0xad910, 288 ) => space 2352
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811df0, 0xadaa0, 304 ) => space 2064
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811f20, 0xadc30, 320 ) => space 1760
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812060, 0xaddc0, 336 ) => space 1440
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28121b0, 0xadf50, 352 ) => space 1104
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812310, 0xae0e0, 368 ) => space 752
[applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812480, 0xae270, 384 ) => space 384
[applebasket.cs.utk.edu:16760] pack end_loop count 1 stack_pos 0 pos_desc 19 disp 0 space 0


As you can see, there is one pack operation with a buffer of 4560
bytes ... exactly the size of the whole data. Although pack takes
care not to cut a basic type in the middle, in this particular
case the buffer is large enough for it to do its job correctly.
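
The bookkeeping in the pack trace can be recomputed from a handful of
numbers, which is a quick way to check the 4560-byte total. The sketch
below is illustrative Python, not part of the DDT engine; the constants
(19 blocks, 96 bytes for the first block, 16 more bytes per block, source
rows 0x190 bytes apart) are read off the trace above, and the variable
names are made up for this example.

```python
# Recompute the pack trace's addresses and remaining space (illustrative only).
# All constants are assumptions read off the trace, not values from Open MPI.

FIRST_LEN, STEP, BLOCKS = 96, 16, 19      # block lengths: 96, 112, ..., 384
SRC_BASE, SRC_STRIDE = 0xAC650, 0x190     # user buffer: rows 400 bytes apart
DST_BASE = 0x2811430                      # start of the contiguous pack buffer

lengths = [FIRST_LEN + STEP * i for i in range(BLOCKS)]
total = sum(lengths)

space = total          # bytes still free in the pack buffer
dst = DST_BASE         # next write position in the pack buffer
rows = []
for i, n in enumerate(lengths):
    rows.append((dst, SRC_BASE + i * SRC_STRIDE, n, space))
    dst += n
    space -= n

for d, s, n, sp in rows:
    print(f"memcpy( {d:#x}, {s:#x}, {n} ) => space {sp}")
print("total bytes:", total, "space left:", space)
```

The gaps in the indexed type show up as the difference between the fixed
400-byte source stride and the growing block length, while the destination
advances by exactly the number of bytes copied.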


The receiver side looks a little bit different:

[applebasket.cs.utk.edu:16758] ompi_convertor_generic_simple_unpack ( 0x280bf04, {0x229e15c, 956}, 1 )
[applebasket.cs.utk.edu:16758] unpack start pos_desc 0 count_desc 6 disp 0
stack_pos 0 pos_desc -1 count_desc 1 disp 0
[applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac650, 0x229e15c, 96 ) => space 956
[applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac7e0, 0x229e1bc, 112 ) => space 860
[applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac970, 0x229e22c, 128 ) => space 748
[applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacb00, 0x229e2ac, 144 ) => space 620
[applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacc90, 0x229e33c, 160 ) => space 476
[applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xace20, 0x229e3dc, 176 ) => space 316
[applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacfb0, 0x229e48c, 128 ) => space 140
[applebasket.cs.utk.edu:16758] Losing 12 bytes !!!
[applebasket.cs.utk.edu:16758] unpack save stack stack_pos 1 pos_desc 6 count_desc 4 disp 128
[applebasket.cs.utk.edu:16758] ompi_convertor_generic_simple_unpack ( 0x280bf04, {0x229e158, 3604}, 1 )
[applebasket.cs.utk.edu:16758] unpack start pos_desc 6 count_desc 4 disp 128
stack_pos 0 pos_desc -1 count_desc 1 disp 0
[applebasket.cs.utk.edu:16758] unpack pending from the last unpack 12 out of 16 bytes
[applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xad030, 0x280bf4c, 16 ) => space 16

... (skipped)
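
The first unpack call above can be replayed with a little arithmetic. The
following Python sketch is only a model of what the trace shows: the helper
`replay_unpack` is hypothetical, and it assumes a 16-byte basic type and a
convertor that copies only whole elements within a single call, which is
how the "128 ) => space 140" and "Losing 12 bytes" lines read.

```python
# Replay the first unpack call of the trace (a model, NOT the real convertor).

ELEM = 16                                     # assumed basic-type size in bytes
lengths = [96 + 16 * i for i in range(19)]    # block lengths seen on the pack side


def replay_unpack(space, lengths, elem=ELEM):
    """Return ([(bytes_copied, space_before), ...], leftover_bytes)."""
    copies = []
    for n in lengths:
        if space < elem:
            break                              # not even one whole element left
        take = min(n, space - space % elem)    # copy whole elements only
        copies.append((take, space))
        space -= take
        if take < n:
            break                              # block cut short: stop this call
    return copies, space


copies, leftover = replay_unpack(956, lengths)
for take, before in copies:
    print(f"memcpy of {take} bytes => space {before}")
print("bytes left over:", leftover)
```

Six blocks (816 bytes) fit entirely, the seventh block of 192 bytes is cut
to 128 bytes, and the 12 bytes left over are the start of an element that
can only be completed by the next call.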

We can see the trace of 2 unpack operations, one with a size of 9

Re: [O-MPI devel] data-type engine

2005-11-08 Thread Timothy S. Woodall
George,

As I indicated this morning, what you are describing is not the
correct behaviour of the PML/BTLs. Again, if you can provide me with a
simple test case to duplicate this, I'd be glad to look at it.

Tim




Re: [O-MPI devel] data-type engine

2005-11-08 Thread Timothy S. Woodall
George,

The BLACS test code was actually calling MPI_Pack to pack the data
into a contiguous buffer, and then called MPI_Isend w/ a datatype
of MPI_PACKED. So, the convertor used by the PML/BTLs treated this as
contiguous data, and allowed the PML/BTL to split it however they
liked...

Your fix should correct this, as a single convertor is used on each
side for pack/unpack. This will also help w/ the buffered send case,
which essentially did the same.

Thanks!
Tim

