Damian,

Thanks for the report!

Could you please trim your program and share it so I can have a look?

Cheers,

Gilles

On Wed, May 4, 2022 at 10:27 PM Damian Marek via devel <devel@lists.open-mpi.org> wrote:

> Hello,
>
> I have been getting intermittent memory corruptions and segmentation
> faults while using Ialltoallw in OpenMPI v4.0.3. Valgrind also reports an
> invalid read in the "ompi_coll_base_retain_datatypes_w" function defined in
> "coll_base_util.c".
>
> Running with a debug build of ompi an assertion fails as well:
>
> base/coll_base_util.c:274: ompi_coll_base_retain_datatypes_w: Assertion
> `OPAL_OBJ_MAGIC_ID == ((opal_object_t *) (stypes[i]))->obj_magic_id' failed.
>
> I think it is related to the fact that I am using a communicator created
> with 2D MPI_Cart_create followed by getting 2 subcommunicators from
> MPI_Cart_sub, in some cases one of the dimensions is 1. In
> "ompi_coll_base_retain_datatypes_w" the neighbour count is used to find
> "rcount" and "scount" at line 267. In my bug case it returns 2 for both,
> but I believe it should be 1 since that is the comm size and the amount of
> memory I have allocated for sendtypes and recvtypes. Then, an invalid read
> happens at 274 and 280.
>
> Regards,
> Damian
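
For reference, the count mismatch described in the report can be modelled without MPI. The sketch below is only an illustration under stated assumptions: the `model_*` functions are hypothetical names and are not Open MPI source. It assumes that the loop bound comes from the Cartesian neighbor count (two neighbors per dimension, as the MPI standard defines for neighborhood collectives) while the caller sized `sendtypes` and `recvtypes` by the communicator size, as the report says.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not Open MPI code: number of processes in a
 * Cartesian grid with the given dimension sizes. */
static int model_comm_size(const int *dims, int ndims)
{
    int size = 1;
    for (int i = 0; i < ndims; i++) {
        size *= dims[i];
    }
    return size;
}

/* Neighborhood collectives on a Cartesian topology use two neighbors
 * per dimension, independent of the dimension sizes. */
static int model_cart_neighbor_count(int ndims)
{
    return 2 * ndims;
}

/* Number of entries read past the end of a datatype array that was
 * sized by the communicator size, if the loop bound were the Cartesian
 * neighbor count instead (the assumption taken from the report). */
static int model_overrun(const int *dims, int ndims)
{
    assert(dims != NULL && ndims > 0);
    int allocated = model_comm_size(dims, ndims);
    int visited = model_cart_neighbor_count(ndims);
    return visited > allocated ? visited - allocated : 0;
}
```

For a 1D subcommunicator whose single dimension has size 1, this model gives a communicator size of 1 and a neighbor count of 2, so one entry beyond the allocated array would be read, which is consistent with the counts quoted in the report. Whether this is the actual cause still needs to be confirmed with a trimmed reproducer.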