Hah. Thanks for catching that. I will commit your patch later today.

-Nathan
________________________________________
From: devel [devel-boun...@open-mpi.org] on behalf of Gilles Gouaillardet [gilles.gouaillar...@iferc.org]
Sent: Monday, May 12, 2014 4:42 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] scif btl side effects

I wrote this too early ...

The attached program produces incorrect results when run with
--mca btl scif,vader,self

Once the most up-to-date patch of #4610 has been applied, (at least) one bug remains, and it is in the scif btl.

The attached patch fixes it.

Gilles

On 2014/05/12 16:17, Gilles Gouaillardet wrote:
> Nathan,
>
> On 2014/05/08 4:21, Hjelm, Nathan T wrote:
>> c) that being said, that should work so there is a bug
>> d) there is a regression in v1.8 and a bug that might have been always here
>> This is probably not a regression. The SCIF btl has been part of the 1.7
>> series for some time. The nightly MTTs are probably missing one of the cases
>> that causes this problem. Hopefully we can get this fixed before 1.8.2.
> As explained in #4610 (https://svn.open-mpi.org/trac/ompi/ticket/4610),
> the root cause is in the way data are unpacked.
>
> The scif btl is ok :-)
>
> When using --mca btl scif,self, fragments can be received out of order,
> and that can trigger a bug introduced by r31496.
>
> That being said, --mca btl scif,vader,self does not work with r31496
> reverted.
> The root cause is another bug in the way data are unpacked. It also
> happens when fragments are received out of order
> *and* fragments contain a subpart of a predefined datatype.
> In this case, the vader btl received a fragment of size 1325 *and* out
> of order, and that caused the bug.
>
> Gilles
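
To make the failure mode described above concrete, here is a minimal, self-contained Python sketch. It is not Open MPI's unpack code and not the attached test program; all names (split_fragments, unpack_by_offset, unpack_assuming_order) are hypothetical. It only illustrates why a 1325-byte fragment necessarily ends in the middle of an 8-byte element, and why an unpacker that relies on arrival order rather than on each fragment's byte offset goes wrong once fragments arrive out of order.

```python
import struct

# One little-endian 8-byte double, standing in for a predefined datatype.
ELEM = struct.Struct("<d")


def split_fragments(payload, frag_size):
    """Cut payload into (byte_offset, data) fragments of at most frag_size bytes."""
    return [(off, payload[off:off + frag_size])
            for off in range(0, len(payload), frag_size)]


def unpack_by_offset(fragments, total_len):
    """Place each fragment at its own byte offset, then decode.

    Arrival order does not matter, and a fragment may start or end
    in the middle of an element.
    """
    buf = bytearray(total_len)
    for off, data in fragments:
        buf[off:off + len(data)] = data
    return [value for (value,) in ELEM.iter_unpack(bytes(buf))]


def unpack_assuming_order(fragments, total_len):
    """Deliberately wrong variant: ignores offsets and trusts arrival order."""
    buf = b"".join(data for _, data in fragments)[:total_len]
    return [value for (value,) in ELEM.iter_unpack(buf)]


values = [float(i) for i in range(500)]            # 500 doubles = 4000 bytes
payload = b"".join(ELEM.pack(v) for v in values)

# 1325 is not a multiple of 8, so fragment boundaries split elements.
frags = split_fragments(payload, 1325)

# Simulate out-of-order delivery by swapping the first two fragments.
out_of_order = [frags[1], frags[0]] + frags[2:]

good = unpack_by_offset(out_of_order, len(payload))
bad = unpack_assuming_order(out_of_order, len(payload))
```

With the swapped fragments, unpack_by_offset recovers the original values, while unpack_assuming_order does not. The real bugs discussed in this thread live in Open MPI's datatype unpacking path and are tracked in #4610; the sketch only mirrors the two conditions named above (out-of-order arrival and a fragment holding part of a predefined datatype).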