Hello devel list 

I've been trying to use a non-blocking MPI_Iallreduce in a CFD application I'm 
working on, but it kept segfaulting on me. I have reduced it to a simple test 
case; the full code is in this gist:
        https://gist.github.com/rupertnash/11222282
Build and run with:
        mpicc test.c -o test && mpirun -n 2 ./test

I am working on OS X Mavericks with Open MPI 1.8 built from the source tarball.

After some debugging I have narrowed the problem down to NBC_Start_round in 
ompi/mca/coll/libnbc/nbc.c, where the code switches on the type of operation 
stored in the schedule:

      case OP:
        NBC_DEBUG(5, "  OP   (offset %li) ", (long)ptr-(long)myschedule);
        NBC_GET_BYTES(ptr,opargs);
        NBC_DEBUG(5, "*buf1: %p, buf2: %p, count: %i, type: %lu)\n",
                  opargs.buf1, opargs.buf2, opargs.count,
                  (unsigned long)opargs.datatype);
        /* get buffers */
        /* SNIP */
--->    ompi_3buff_op_reduce(opargs.op, buf1, buf2, buf3, opargs.count,
                             opargs.datatype);
        break;

The line marked with the arrow (--->) is the problem. The comment describing 
ompi_3buff_op_reduce states "This function will *only* be invoked on intrinsic 
MPI_Ops." The code bears this out: it indexes into a table of function 
pointers, all of which are null for a user-defined MPI_Op such as the one in 
my test case.
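
To illustrate the failure mode outside of Open MPI, here is a minimal sketch 
in plain C. Every name in it (toy_op, toy_3buff_reduce and so on) is 
hypothetical and only models the idea of a three-buffer function pointer that 
is populated for built-in operations and null for user-defined ones; it is not 
the real Open MPI data structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Open MPI types. */
typedef void (*toy_3buf_fn)(const int *in1, const int *in2, int *out, int count);
typedef void (*toy_user_fn)(const int *in, int *inout, int count);

typedef struct {
    toy_3buf_fn three_buf; /* set for built-in ops, NULL for user-defined ops */
    toy_user_fn user;      /* set for user-defined ops, NULL for built-in ops */
} toy_op;

/* Built-in style operation: out[i] = in1[i] + in2[i]. */
static void toy_sum3(const int *in1, const int *in2, int *out, int count)
{
    for (int i = 0; i < count; ++i) {
        out[i] = in1[i] + in2[i];
    }
}

/* User-defined style operation: inout[i] = max(in[i], inout[i]). */
static void toy_user_max(const int *in, int *inout, int count)
{
    for (int i = 0; i < count; ++i) {
        if (in[i] > inout[i]) {
            inout[i] = in[i];
        }
    }
}

/* Three-buffer reduce that checks the pointer before calling through it.
 * Returns 0 on success and -1 when the op has no three-buffer function.
 * Without the check, the call on a user-defined op would jump through NULL. */
static int toy_3buff_reduce(const toy_op *op, const int *in1, const int *in2,
                            int *out, int count)
{
    assert(op != NULL);
    if (op->three_buf == NULL) {
        return -1;
    }
    op->three_buf(in1, in2, out, count);
    return 0;
}
```

With an op initialised as {toy_sum3, NULL} the call succeeds; with 
{NULL, toy_user_max} it returns -1 instead of crashing, which is the case the 
unguarded call in NBC_Start_round appears to hit.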

Presumably the fix will be to replace the use of the three-buffer version with 
the usual ompi_op_reduce, at least for non-intrinsic operations. I have made a 
temporary patch by replacing the arrowed line with the following:
        if (0 != (opargs.op->o_flags & OMPI_OP_FLAGS_INTRINSIC)) {
          ompi_3buff_op_reduce(opargs.op, buf1, buf2, buf3, opargs.count,
                               opargs.datatype);
        } else {
          ompi_op_reduce(opargs.op, buf1, buf3, opargs.count, opargs.datatype);
          ompi_op_reduce(opargs.op, buf2, buf3, opargs.count, opargs.datatype);
        }
However, this is the first time I've looked under the hood of Open MPI, so I 
hope you can patch it properly soon.
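
For reference, a standalone sketch of one way to build a three-buffer result 
from a two-buffer reduce. It assumes the two-buffer call accumulates into its 
target (target = source op target); that is an assumption about 
ompi_op_reduce, not something checked against the source here. Under that 
assumption the output has to be seeded with one operand first, otherwise 
whatever it held beforehand is folded into the result. All names are 
hypothetical and nothing here is Open MPI code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical two-buffer callback: inout[i] = in[i] (op) inout[i]. */
typedef void (*toy_reduce_fn)(const int *in, int *inout, int count);

/* Example operation: element-wise maximum. */
static void toy_max(const int *in, int *inout, int count)
{
    for (int i = 0; i < count; ++i) {
        if (in[i] > inout[i]) {
            inout[i] = in[i];
        }
    }
}

/* Computes out = in1 (op) in2 using only the two-buffer callback.
 * Assumes out does not alias in1; out may be the same buffer as in2. */
static void toy_reduce_3buf(toy_reduce_fn fn, const int *in1, const int *in2,
                            int *out, int count)
{
    assert(fn != NULL && count >= 0);
    /* Seed the output with the second operand, discarding its old contents. */
    memmove(out, in2, (size_t)count * sizeof *out);
    /* Now out = in1 (op) out, which equals in1 (op) in2. */
    fn(in1, out, count);
}
```

Seeding with in2 and reducing in1 into it keeps the operand order 
in1 (op) in2, which matters for operations that are not commutative.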

Best wishes,

Rupert
-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
