Hi all

I have reported this issue before, but at the time I brushed it off as
something caused by my modifications to the source tree. It now looks like
that is not the case.

Just now, I did the following:

1. Cloned a fresh copy from master.
2. Configured with the following flags, then built and installed it on my
two-node "cluster" (the exact invocations are sketched after this list):
--enable-debug --enable-debug-symbols --disable-dlopen
3. Compiled the program shown further below, mpitest.c, with these flags:
-g3 -Wall -Wextra
4. Ran it like this:
[durga@smallMPI ~]$ mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp
-mca pml ob1 ./mpitest
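
For completeness, the build and compile steps were roughly the following
(the install prefix is just a placeholder for my real path, and the output
name matches the ./mpitest used above):

./configure --prefix=<install dir> --enable-debug --enable-debug-symbols --disable-dlopen
make all install
mpicc -g3 -Wall -Wextra -o mpitest mpitest.c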

When run this way, the code hangs at MPI_Barrier() on both nodes, after
producing the following output:

Hello world from processor smallMPI, rank 0 out of 2 processors
Hello world from processor bigMPI, rank 1 out of 2 processors
smallMPI sent haha!
bigMPI received haha!
<Hangs until killed by ^C>
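
To see where it is stuck, I attached gdb to the hung mpitest process on one
node, roughly like this (the PID is just whatever ps reports for mpitest on
that node):

[durga@smallMPI ~]$ gdb -p <mpitest PID>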
This gives the following backtrace:

(gdb) bt
#0  0x00007f55b0f41c3d in poll () from /lib64/libc.so.6
#1  0x00007f55b03ccde6 in poll_dispatch (base=0x70e7b0, tv=0x7ffd1bb551c0)
at poll.c:165
#2  0x00007f55b03c4a90 in opal_libevent2022_event_base_loop (base=0x70e7b0,
flags=2) at event.c:1630
#3  0x00007f55b02f0144 in opal_progress () at runtime/opal_progress.c:171
#4  0x00007f55b14b4d8b in opal_condition_wait (c=0x7f55b19fec40
<ompi_request_cond>, m=0x7f55b19febc0 <ompi_request_lock>) at
../opal/threads/condition.h:76
#5  0x00007f55b14b531b in ompi_request_default_wait_all (count=2,
requests=0x7ffd1bb55370, statuses=0x7ffd1bb55340) at request/req_wait.c:287
#6  0x00007f55b157a225 in ompi_coll_base_sendrecv_zero (dest=1, stag=-16,
source=1, rtag=-16, comm=0x601280 <ompi_mpi_comm_world>)
    at base/coll_base_barrier.c:63
#7  0x00007f55b157a92a in ompi_coll_base_barrier_intra_two_procs
(comm=0x601280 <ompi_mpi_comm_world>, module=0x7c2630) at
base/coll_base_barrier.c:308
#8  0x00007f55b15aafec in ompi_coll_tuned_barrier_intra_dec_fixed
(comm=0x601280 <ompi_mpi_comm_world>, module=0x7c2630) at
coll_tuned_decision_fixed.c:196
#9  0x00007f55b14d36fd in PMPI_Barrier (comm=0x601280
<ompi_mpi_comm_world>) at pbarrier.c:63
#10 0x0000000000400b0b in main (argc=1, argv=0x7ffd1bb55658) at mpitest.c:26
(gdb)

Thinking that this might be a bug in the tuned collectives, since that is
what the stack shows, I ran the program like this (basically adding the
"-mca coll ^tuned" part):

[durga@smallMPI ~]$ mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp
-mca pml ob1 -mca coll ^tuned ./mpitest

It still hangs, but now with a different stack trace:
(gdb) bt
#0  0x00007f910d38ac3d in poll () from /lib64/libc.so.6
#1  0x00007f910c815de6 in poll_dispatch (base=0x1a317b0, tv=0x7fff43ee3610)
at poll.c:165
#2  0x00007f910c80da90 in opal_libevent2022_event_base_loop
(base=0x1a317b0, flags=2) at event.c:1630
#3  0x00007f910c739144 in opal_progress () at runtime/opal_progress.c:171
#4  0x00007f910db130f7 in opal_condition_wait (c=0x7f910de47c40
<ompi_request_cond>, m=0x7f910de47bc0 <ompi_request_lock>)
    at ../../../../opal/threads/condition.h:76
#5  0x00007f910db132d8 in ompi_request_wait_completion (req=0x1b07680) at
../../../../ompi/request/request.h:383
#6  0x00007f910db1533b in mca_pml_ob1_send (buf=0x0, count=0,
datatype=0x7f910de1e340 <ompi_mpi_byte>, dst=1, tag=-16,
sendmode=MCA_PML_BASE_SEND_STANDARD,
    comm=0x601280 <ompi_mpi_comm_world>) at pml_ob1_isend.c:259
#7  0x00007f910d9c3b38 in ompi_coll_base_barrier_intra_basic_linear
(comm=0x601280 <ompi_mpi_comm_world>, module=0x1b092c0) at
base/coll_base_barrier.c:368
#8  0x00007f910d91c6fd in PMPI_Barrier (comm=0x601280
<ompi_mpi_comm_world>) at pbarrier.c:63
#9  0x0000000000400b0b in main (argc=1, argv=0x7fff43ee3a58) at mpitest.c:26
(gdb)

The mpitest.c program is as follows:
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char** argv)
{
    int world_size, world_rank, name_len;
    char hostname[MPI_MAX_PROCESSOR_NAME], buf[8];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Get_processor_name(hostname, &name_len);
    printf("Hello world from processor %s, rank %d out of %d processors\n",
hostname, world_rank, world_size);
    if (world_rank == 1)
    {
    MPI_Recv(buf, 6, MPI_CHAR, 0, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("%s received %s\n", hostname, buf);
    }
    else
    {
    strcpy(buf, "haha!");
    MPI_Send(buf, 6, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
    printf("%s sent %s\n", hostname, buf);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}

The hostfile is as follows:
10.10.10.10 slots=1
10.10.10.11 slots=1

The two nodes are connected by three physical and three logical networks:
Physical: Gigabit Ethernet, 10G iWARP, 20G InfiniBand
Logical: IP (all three), PSM (QLogic InfiniBand), Verbs (iWARP and InfiniBand)
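
In case the multi-homed setup matters: I have not restricted the TCP BTL to
any particular interface. If that could be the issue, something like the
following is what I would try next (the interface name here is only an
example for my boxes):

[durga@smallMPI ~]$ mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp
-mca btl_tcp_if_include eth0 -mca pml ob1 ./mpitest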

Please note again that this is a fresh, brand-new clone of master.

Is this a bug (perhaps a side effect of --disable-dlopen) or something I am
doing wrong?

Thanks
Durga

We learn from history that we never learn from history.
