Dear All,
My next feedback is about a problem in MPI_Gather.
MPI_Gather may fail with a truncation error when both of the following conditions hold:
1: ompi_coll_tuned_gather_intra_linear_sync is called
(the message size is over 6000 bytes).
2: One of the send and receive data types is a derived type and
the other is a predefined type.
The attached C program reproduces the truncation and produces the following output.
Output:
*** An error occurred in MPI_Gather
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TRUNCATE: message truncated
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
In this C program, "first_segment_count" (a variable in
ompi_coll_tuned_gather_intra_linear_sync) differs between the root and the
non-root processes, which causes the message to be truncated.
"first_segment_size" is not divisible by the size of the derived data type,
but it is divisible by the size of the predefined data type.
We have not fixed this problem yet.
As a workaround, we do not select linear_sync in coll_tuned_decision_fixed.c.
Best Regards,
Yuki MATSUMOTO
MPI development team,
Fujitsu
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "mpi.h"
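int main (int argc, char **argv)
{
    int sbuf[30000];
    int rbuf[30000];
    int myproc, nprocs;
    MPI_Datatype itype;   /* derived type used on the receive side */
    int n = 751;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myproc);
    if (0 == myproc)
    {
        printf("msg size:%zu\n", 2 * n * sizeof(int));
    }
    /* Two blocks of n ints each, with a stride of 2*n ints. */
    MPI_Type_vector(2, n, 2 * n, MPI_INT, &itype);
    MPI_Type_commit(&itype);
    /* Initialize all 2*n ints that are sent. */
    memset((void *)sbuf, myproc + 1, sizeof(int) * 2 * n);
    /* Send 2*n MPI_INT; receive 1 element of the derived type per rank. */
    MPI_Gather(sbuf, 2 * n, MPI_INT, rbuf, 1, itype, 0, MPI_COMM_WORLD);
    MPI_Type_free(&itype);
    MPI_Finalize();
    return 0;
}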