Re: [OMPI devel] Possible Bug / Invalid Read in Ialltoallw

2022-05-04 Thread Gilles Gouaillardet via devel
Thanks again Damian, I think the root cause is that we call mca_topo_base_neighbor_count() instead of ompi_comm_size() here. It seems the implicit assumption is that one would call MPI_Ineighbor_alltoallw() on a Cartesian communicator ... which is obviously wrong: it is legit to call MPI_Ialltoallw(), even
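
The count mismatch described in this message can be illustrated without MPI. What follows is a minimal sketch, not Open MPI's code: entries_to_retain() and cart_neighbor_count() are hypothetical helpers invented for illustration. It assumes that a regular alltoallw call takes one datatype per rank in the communicator, while a neighbor collective takes one per topology neighbor, and that a process in a Cartesian topology with ndims dimensions has 2 * ndims neighbors.

```c
#include <stdbool.h>

/* Hypothetical helper, not an Open MPI function: how many entries of the
 * send/recv datatype arrays a nonblocking alltoallw-style call should walk.
 * Assumption: regular collectives use one entry per rank, neighbor
 * collectives use one entry per topology neighbor. */
static int entries_to_retain(bool is_neighbor_collective,
                             int comm_size, int neighbor_count)
{
    return is_neighbor_collective ? neighbor_count : comm_size;
}

/* Hypothetical helper: neighbor count of one process in a Cartesian
 * topology, which is two neighbors per dimension. */
static int cart_neighbor_count(int ndims)
{
    return 2 * ndims;
}
```

For example, on a 2-process communicator with a 2-dimensional Cartesian topology, cart_neighbor_count(2) is 4, while a regular alltoallw call only supplies 2 datatypes per array. Walking 4 entries would read past the end of the caller's arrays, which would be consistent with the invalid read reported in this thread.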

Re: [OMPI devel] Possible Bug / Invalid Read in Ialltoallw

2022-05-04 Thread Damian Marek via devel
Hello, I made an example that triggers the issue. I had to get a little creative with how to trigger the crash, since it does not appear if the memory allocated for the send and recv types happens to be set to 0 (although valgrind still reports an invalid read). Communicator is an intra-commun

Re: [OMPI devel] Possible Bug / Invalid Read in Ialltoallw

2022-05-04 Thread George Bosilca via devel
Damian, As Gilles indicated, an example would be great. Meanwhile, as you already have access to the root cause with a debugger, can you check which branch of the if on the communicator type in the ompi_coll_base_retain_datatypes_w function is taken? What is the communicator type? Intra or i

Re: [OMPI devel] Possible Bug / Invalid Read in Ialltoallw

2022-05-04 Thread Gilles Gouaillardet via devel
Damian, Thanks for the report! Could you please trim your program and share it so I can have a look? Cheers, Gilles

[OMPI devel] Possible Bug / Invalid Read in Ialltoallw

2022-05-04 Thread Damian Marek via devel
Hello, I have been getting intermittent memory corruptions and segmentation faults while using Ialltoallw in OpenMPI v4.0.3. Valgrind also reports an invalid read in the "ompi_coll_base_retain_datatypes_w" function defined in "coll_base_util.c". Running with a debug build of ompi an assertion

Re: [OMPI devel] What PMIx version(s) does v5.0.0 and main support?

2022-05-04 Thread Jeff Squyres (jsquyres) via devel
We discussed this on the OMPI call yesterday, but I do not know what the consequences are of only supporting building Open MPI main and v5.x against PMIx v4.x (vs. also supporting building Open MPI main+v5.x against PMIx v3.2.x, or even PMIx v3.x). I think we might need to talk to the PM