Thanks George,
Pierre

> On 13 Mar 2021, at 22:24, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> Hi Pierre,
>
> MPI is allowed to pipeline the collective communications. This explains why
> the MPI_Op takes the len of the buffers as an argument. Because your MPI_Op
> ignores this length it alters data outside the temporary buffer we use for
> the segment. Other versions of the MPI_Allreduce implementation might choose
> not to pipeline in which case applying the MPI_Op on the entire length of the
> buffer (as you manually did in your code) is correct.
>
> George.
>
>
> On Sat, Mar 13, 2021 at 4:47 AM Pierre Jolivet via users
> <users@lists.open-mpi.org> wrote:
>> Hello,
>> The following piece of code generates Valgrind errors with OpenMPI 4.1.0,
>> while it is Valgrind-clean with MPICH and OpenMPI 4.0.5.
>> I don’t think I’m doing anything illegal, so could this be a regression
>> introduced in 4.1.0?
>>
>> Thanks,
>> Pierre
>>
>> $ /opt/openmpi-4.1.0/bin/mpicxx ompi.cxx -g -O0 -std=c++11
>> $ /opt/openmpi-4.1.0/bin/mpirun -n 4 valgrind --log-file=dump.%p.log ./a.out
>>
>> ==528== Invalid read of size 2
>> ==528==    at 0x4011EB: main::{lambda(void*, void*, int*, ompi_datatype_t**)#1}::operator()(void*, void*, int*, ompi_datatype_t**) const (ompi.cxx:15)
>> ==528==    by 0x40127B: main::{lambda(void*, void*, int*, ompi_datatype_t**)#1}::_FUN(void*, void*, int*, ompi_datatype_t**) (ompi.cxx:19)
>> ==528==    by 0x48EFFED: ompi_coll_base_allreduce_intra_ring (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x77CD93A: ompi_coll_tuned_allreduce_intra_dec_fixed (in /opt/openmpi-4.1.0/lib/openmpi/mca_coll_tuned.so)
>> ==528==    by 0x48AAF00: PMPI_Allreduce (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x401317: main (ompi.cxx:21)
>> ==528==  Address 0x7139f74 is 0 bytes after a block of size 4 alloc'd
>> ==528==    at 0x4839809: malloc (vg_replace_malloc.c:307)
>> ==528==    by 0x48EF940: ompi_coll_base_allreduce_intra_ring (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x77CD93A: ompi_coll_tuned_allreduce_intra_dec_fixed (in /opt/openmpi-4.1.0/lib/openmpi/mca_coll_tuned.so)
>> ==528==    by 0x48AAF00: PMPI_Allreduce (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x401317: main (ompi.cxx:21)
>> ==528==
>> ==528== Invalid read of size 2
>> ==528==    at 0x40120E: main::{lambda(void*, void*, int*, ompi_datatype_t**)#1}::operator()(void*, void*, int*, ompi_datatype_t**) const (ompi.cxx:16)
>> ==528==    by 0x40127B: main::{lambda(void*, void*, int*, ompi_datatype_t**)#1}::_FUN(void*, void*, int*, ompi_datatype_t**) (ompi.cxx:19)
>> ==528==    by 0x48EFFED: ompi_coll_base_allreduce_intra_ring (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x77CD93A: ompi_coll_tuned_allreduce_intra_dec_fixed (in /opt/openmpi-4.1.0/lib/openmpi/mca_coll_tuned.so)
>> ==528==    by 0x48AAF00: PMPI_Allreduce (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x401317: main (ompi.cxx:21)
>> ==528==  Address 0x7139f76 is 2 bytes after a block of size 4 alloc'd
>> ==528==    at 0x4839809: malloc (vg_replace_malloc.c:307)
>> ==528==    by 0x48EF940: ompi_coll_base_allreduce_intra_ring (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x77CD93A: ompi_coll_tuned_allreduce_intra_dec_fixed (in /opt/openmpi-4.1.0/lib/openmpi/mca_coll_tuned.so)
>> ==528==    by 0x48AAF00: PMPI_Allreduce (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x401317: main (ompi.cxx:21)
>> ==528==
>> ==528== Invalid read of size 2
>> ==528==    at 0x401231: main::{lambda(void*, void*, int*, ompi_datatype_t**)#1}::operator()(void*, void*, int*, ompi_datatype_t**) const (ompi.cxx:18)
>> ==528==    by 0x40127B: main::{lambda(void*, void*, int*, ompi_datatype_t**)#1}::_FUN(void*, void*, int*, ompi_datatype_t**) (ompi.cxx:19)
>> ==528==    by 0x48EFFED: ompi_coll_base_allreduce_intra_ring (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x77CD93A: ompi_coll_tuned_allreduce_intra_dec_fixed (in /opt/openmpi-4.1.0/lib/openmpi/mca_coll_tuned.so)
>> ==528==    by 0x48AAF00: PMPI_Allreduce (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x401317: main (ompi.cxx:21)
>> ==528==  Address 0x7139f78 is 4 bytes after a block of size 4 alloc'd
>> ==528==    at 0x4839809: malloc (vg_replace_malloc.c:307)
>> ==528==    by 0x48EF940: ompi_coll_base_allreduce_intra_ring (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x77CD93A: ompi_coll_tuned_allreduce_intra_dec_fixed (in /opt/openmpi-4.1.0/lib/openmpi/mca_coll_tuned.so)
>> ==528==    by 0x48AAF00: PMPI_Allreduce (in /opt/openmpi-4.1.0/lib/libmpi.so.40.30.0)
>> ==528==    by 0x401317: main (ompi.cxx:21)
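
For anyone finding this thread later, George's point about the len argument can be illustrated without MPI. This is only a minimal sketch, not the code from ompi.cxx: FakeDatatype, sum_op and reduce_in_segments are hypothetical names, the short element type is an assumption suggested by the 2-byte reads in the trace, and the driver merely imitates a pipelined reduction by handing the operation one segment at a time through a temporary buffer sized to that segment.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in for MPI_Datatype so the sketch builds without <mpi.h> (assumption).
using FakeDatatype = int;

// Same shape as an MPI user reduction function: combine *len elements of
// `in` into `inout`. Only the first *len elements may be read or written,
// because the caller may pass a segment rather than the whole buffer.
void sum_op(void* in, void* inout, int* len, FakeDatatype*) {
    const short* a = static_cast<const short*>(in);
    short* b = static_cast<short*>(inout);
    for (int i = 0; i < *len; ++i)
        b[i] = static_cast<short>(a[i] + b[i]);
}

// Hypothetical driver imitating a pipelined reduction: each segment of
// `contribution` is copied into a temporary buffer of exactly `len`
// elements before the operation is applied. `segment` must be positive.
std::vector<short> reduce_in_segments(const std::vector<short>& contribution,
                                      std::vector<short> accumulator,
                                      int segment) {
    for (std::size_t offset = 0; offset < contribution.size(); offset += segment) {
        int len = static_cast<int>(
            std::min<std::size_t>(segment, contribution.size() - offset));
        std::vector<short> tmp(contribution.begin() + offset,
                               contribution.begin() + offset + len);
        sum_op(tmp.data(), accumulator.data() + offset, &len, nullptr);
    }
    return accumulator;
}
```

An operation that looped over the full application-side count instead of *len would read past the end of tmp here, which is the same kind of access Valgrind reports above as "bytes after a block of size 4".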