Jonas,

Assuming v_glob is what you expect, you will need to resize the received type with `MPI_Type_create_resized()` so that the block received from process 1 is placed at the right position (v_glob[3][1] => upper bound = (4*3+1) * sizeof(int)).
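To illustrate why the extent matters: MPI_Gather places the block from rank p at an offset of p * recvcount * extent(recvtype) in the receive buffer, and the natural extent of a vector type spans from its first to its last element. Below is a minimal sketch that models this placement without calling MPI at all, assuming the column-major layout and the sizes from the original post. The helpers `model_gather` and `natural_extent` are hypothetical names made up for this illustration. In the real code, the corresponding step would presumably be something like `MPI_Type_create_resized(rtype, 0, nloc * sizeof(int), &rtype_resized)` followed by `MPI_Type_commit(&rtype_resized)`, where `rtype_resized` is again just an assumed name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Extent (in elements) of a vector type with m blocks of nloc elements and
// stride nloc*nproc: it spans from the first element to the last one.
int natural_extent(int nloc, int nproc, int m) {
    return (m - 1) * nloc * nproc + nloc;
}

// Model of where a gather would place the data (no MPI calls are made).
// The block from rank p starts at p * extent; inside a block, column j is
// shifted by the stride ld = nloc * nproc. The values mimic the v_loc
// entries from the original post. ncols_glob only sizes the buffer.
std::vector<int> model_gather(int nloc, int nproc, int m, int extent,
                              int ncols_glob) {
    const int ld = nloc * nproc;  // leading dimension of v_glob
    std::vector<int> glob(static_cast<std::size_t>(ld) * ncols_glob, 0);
    for (int p = 0; p < nproc; ++p) {
        for (int j = 0; j < m; ++j) {
            for (int i = 0; i < nloc; ++i) {
                const int value = p * nloc + i + j * ld;  // v_loc(i,j) on rank p
                glob[p * extent + j * ld + i] = value;
            }
        }
    }
    return glob;
}
```

With nloc=3, nproc=3, m=2, the natural extent is 12 elements, so the model puts the first entry of P1 at linear index 12 (row 3, column 1 of a 9-row column-major array), which matches the v_glob printed in the original post. With the extent shrunk to nloc=3 elements, the same entry lands at index 3 and the first two columns come out contiguous.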
Cheers,

Gilles

On Thu, Dec 16, 2021 at 6:33 PM Jonas Thies via users <users@lists.open-mpi.org> wrote:

> Dear OpenMPI community,
>
> Here's a little puzzle for the Christmas holidays (although I would really
> appreciate a quick solution!).
>
> I'm stuck with the following relatively basic problem: given a local nloc
> x m matrix X_p in column-major ordering on each MPI process p, perform a
> single MPI_Gather operation to construct the matrix
>
> X_0
> X_1
> ...
> X_nproc
>
> again, in col-major ordering. My approach is to use MPI_Type_vector to
> define an stype and an rtype, where stype has stride nloc, and rtype has
> stride nproc*nloc. The observation is that there is an unexpected
> displacement of (m-1)*n*p in the result array for the part arriving from
> process p.
>
> The MFE code is attached, and I use OpenMPI 4.0.5 with GCC 11.2 (although
> other versions and even distributions seem to display the same behavior).
> Example (nloc=3, nproc=3, m=2, with some additional columns printed for the
> sake of demonstration):
>
> > mpicxx -o matrix_gather matrix_gather.cpp
> > mpirun -np 3 ./matrix_gather
>
> v_loc on P0: 3x2
> 0 9
> 1 10
> 2 11
>
> v_loc on P1: 3x2
> 3 12
> 4 13
> 5 14
>
> v_loc on P2: 3x2
> 6 15
> 7 16
> 8 17
>
> v_glob on P0: 9x4
> 0 9 0 0
> 1 10 0 0
> 2 11 0 0
> 0 3 12 0
> 0 4 13 0
> 0 5 14 0
> 0 0 6 15
> 0 0 7 16
> 0 0 8 17
>
> Any ideas?
>
> Thanks,
>
> Jonas
>
> --
> *J. Thies*
> Assistant Professor
>
> TU Delft
> Faculty Electrical Engineering, Mathematics and Computer Science
> Institute of Applied Mathematics and High Performance Computing Center
> Mekelweg 4
> 2628 CD Delft
>
> T +31 15 27 XXXX
> *j.th...@tudelft.nl <j.th...@tudelft.nl>*
>