Dear OpenMPI developers,

I just created ticket #4531 in order to track this issue:
https://svn.open-mpi.org/trac/ompi/ticket/4531

Basically, the coll/tuned implementation of MPI_Bcast does not work when two tasks use datatypes of different sizes. For example, if the root sends two large vectors of MPI_INT and the non-root tasks receive many MPI_INT, then MPI_Bcast will crash. But if the root sends many MPI_INT and the non-root tasks receive two large vectors of MPI_INT, then MPI_Bcast will silently fail. (The Trac ticket has test cases attached.)

I believe this kind of issue could occur in all or most collectives of the coll/tuned module, so it is not limited to MPI_Bcast.

I am wondering what the best way to solve this would be.

One solution I could think of would be to generate temporary datatypes in order to send messages whose size is exactly the segment_size.

Another solution I could think of would be to have new send/recv functions. If we consider the send function:

    int mca_pml_ob1_send(void *buf, size_t count, ompi_datatype_t * datatype, int dst, int tag, mca_pml_base_send_mode_t sendmode, ompi_communicator_t * comm)

we could imagine having an xsend function:

    int mca_pml_ob1_xsend(void *buf, size_t count, ompi_datatype_t * datatype, size_t offset, size_t size, int dst, int tag, mca_pml_base_send_mode_t sendmode, ompi_communicator_t * comm)

where offset is the number of bytes that should be skipped from the beginning of buf, and size is the (max) number of bytes to be sent (e.g. the message will be "truncated" to size bytes if (count*size(datatype) - offset) > size).

Or we could use a buffer if needed, and send/recv with the MPI_PACKED datatype (this is less efficient; would it even work on heterogeneous nodes?).

Or we could simply consider this to be just a limitation of coll/tuned (coll/basic works fine) and do nothing.

Or something else I did not think of ...

Thanks in advance for your feedback,

Gilles
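
PS: to make the mismatch concrete, here is a small sketch of the arithmetic. It is a simplified model I wrote for illustration, not the actual coll/tuned code: seg_count, num_messages and xsend_bytes are hypothetical helper names, and the rule "do not segment when one element is larger than the segment" is my assumption about how a count-based segmentation behaves. xsend_bytes only models the byte window of the xsend idea described above.

```c
#include <stddef.h>

/* Hypothetical model of count-based segmentation: how many elements of
 * type_size bytes go into one segment of about seg_size bytes.
 * Assumption: if a single element is larger than the segment, or the
 * whole message already fits in one segment, the message is not split. */
static size_t seg_count(size_t seg_size, size_t type_size, size_t count)
{
    if (seg_size >= type_size && seg_size < type_size * count) {
        return seg_size / type_size;
    }
    return count;
}

/* Number of messages a task would exchange under the model above. */
static size_t num_messages(size_t seg_size, size_t type_size, size_t count)
{
    size_t sc = seg_count(seg_size, type_size, count);
    if (sc == 0) {
        return 0;               /* empty message or zero-sized type */
    }
    return (count + sc - 1) / sc;   /* ceiling division */
}

/* Byte window of the proposed xsend: skip offset bytes of the packed
 * data, then send at most size bytes. The result depends only on the
 * total number of bytes, not on how the datatype describes them. */
static size_t xsend_bytes(size_t count, size_t type_size,
                          size_t offset, size_t size)
{
    size_t total = count * type_size;
    if (offset >= total) {
        return 0;
    }
    size_t remaining = total - offset;
    return remaining > size ? size : remaining;
}
```

With a hypothetical 128 KiB segment and two vectors of 1048576 MPI_INT each (assuming 4-byte ints), num_messages(131072, 4194304, 2) gives 1 on the side using the vector datatype, while num_messages(131072, 4, 2097152) gives 64 on the side using plain MPI_INT. The two sides then disagree on the number and size of the messages, which would be consistent with the crash in one direction and the silent failure in the other. With byte windows, xsend_bytes returns 131072 for the first window on both sides, whatever the datatype.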