Hi,

I've found that the following commit caused new failures with derived
datatypes. Collectives return incorrect results.

commit b531973419a056696e6f88d813769aa4f1f1aee6
Author: Jeff Squyres <jsquy...@open-mpi-git-mirror.example.com>
Date:   Tue Apr 22 19:48:56 2014 +0000

The failures do not reproduce reliably; it takes many iterations to
trigger them.
Examples are:
- gather with the short_int datatype, count 20000, with in-place enabled,
- allgather with a randomly generated derived datatype, count 20000.

It produces the following output (this is only a short fragment):

 ../../../opal/datatype/opal_datatype_position.c:72
Pointer 0x7fff8119c11d size 16 is outside [0x7fff80cd6040,0x7fff80f4702d] for
 base ptr 0x7fff80cd6040 count 40000 and data
 Datatype 0xa72080[] size 38 align 16 id 0 length 7 used 6
true_lb 0 true_ub 45 (true_extent 45) lb 0 ub 64 (extent 64)
nbElems 6 loops 0 flags 104 (commited )-c-----GD--[---][---]
   contain OPAL_INT2 OPAL_INT4 OPAL_UINT8 OPAL_FLOAT4 OPAL_FLOAT16
--C---P-D--[---][---]     OPAL_UINT8 count 1 disp 0x0 (0) extent 8 (size 8)
--C---P-D--[---][---]    OPAL_FLOAT4 count 1 disp 0xa (10) extent 4 (size 4)
--C---P-D--[---][---]      OPAL_INT4 count 1 disp 0xe (14) extent 4 (size 4)
--C---P-D--[---][---]      OPAL_INT2 count 1 disp 0x17 (23) extent 2 (size 2)
--C---P-D--[---][---]   OPAL_FLOAT16 count 1 disp 0x19 (25) extent 16 (size 16)
--C---P-D--[---][---]      OPAL_INT4 count 1 disp 0x29 (41) extent 4 (size 4)
-------G---[---][---]  OPAL_END_LOOP prev 6 elements first elem displacement 0 size of data 38
Optimized description
-cC---P-DB-[---][---]     OPAL_UINT8 count 1 disp 0x0 (0) extent 8 (size 8)
-cC---P-DB-[---][---]     OPAL_UINT1 count 8 disp 0xa (10) extent 1 (size 8)
-cC---P-DB-[---][---]     OPAL_UINT1 count 22 disp 0x17 (23) extent 1 (size 22)
-------G---[---][---]  OPAL_END_LOOP prev 3 elements first elem displacement 0 size of data 38
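
As a sanity check on the numbers in this fragment, the valid range can be
recomputed from the printed fields. This is only a sketch: the formula
(base + (count - 1) * extent + true_ub for the upper bound) is my assumption
about how the range is derived, not something taken from the Open MPI
sources, and the helper names valid_range and is_outside are made up for
illustration. With the values from the dump it does reproduce the printed
upper bound, and it confirms that the reported pointer lies well past the
end of the buffer rather than just a few bytes over.

```python
# Values copied from the debug output above.
base = 0x7fff80cd6040   # "base ptr"
count = 40000           # "count"
extent = 64             # "extent" of the datatype
true_lb = 0             # "true_lb"
true_ub = 45            # "true_ub"
ptr = 0x7fff8119c11d    # offending pointer
size = 16               # size of the access at that pointer


def valid_range(base, count, extent, true_lb, true_ub):
    """Assumed rule: from the first byte of element 0 to the last
    byte of element count-1 (hypothetical helper, not an OPAL API)."""
    lower = base + true_lb
    upper = base + (count - 1) * extent + true_ub
    return lower, upper


def is_outside(ptr, size, lower, upper):
    """True if the access [ptr, ptr + size) leaves [lower, upper]."""
    return ptr < lower or ptr + size > upper


lower, upper = valid_range(base, count, extent, true_lb, true_ub)
print("valid range:", hex(lower), hex(upper))
print("outside:", is_outside(ptr, size, lower, upper))
print("bytes past the upper bound:", ptr + size - upper)
```

The recomputed upper bound matches the 0x7fff80f4702d shown in the message,
so the bounds themselves look consistent with count 40000 and extent 64; it
is the computed position that is off.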

Best regards,
Elena
