FWIW: I just tested forwarding up to 100 MBytes via stdin using the simple test 
shown below with OMPI v2.0.1rc1, and it worked fine. So I’d suggest upgrading 
when the official release comes out, or going ahead and at least testing 
2.0.1rc1 on your machine. Or you can test this program with some input file and 
let me know if it works for you.

Ralph

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include <unistd.h>
#include <mpi.h>

#define ORTE_IOF_BASE_MSG_MAX   2048

int main(int argc, char *argv[])
{
    int i, rank, size;
    int pos, msgsize, nbytes;
    bool done;
    char *msg;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    fprintf(stderr, "Rank %d has cleared MPI_Init\n", rank);

    /* Fixed-size chunk buffer, zeroed so a short first read never
       broadcasts uninitialized bytes. */
    msg = calloc(1, ORTE_IOF_BASE_MSG_MAX);
    pos = 0;
    nbytes = 0;

    if (0 == rank) {
        /* Rank 0 reads stdin in fixed-size chunks and broadcasts each chunk.
           The loop ends on EOF (read returns 0) or on a read error. */
        while (0 < (msgsize = read(0, msg, ORTE_IOF_BASE_MSG_MAX))) {
            fprintf(stderr, "Rank %d: sending blob %d\n", rank, pos);
            MPI_Bcast(msg, ORTE_IOF_BASE_MSG_MAX, MPI_BYTE, 0, MPI_COMM_WORLD);
            ++pos;
            nbytes += msgsize;
        }
        fprintf(stderr, "Rank %d: sending termination blob %d\n", rank, pos);
        memset(msg, 0, ORTE_IOF_BASE_MSG_MAX);
        MPI_Bcast(msg, ORTE_IOF_BASE_MSG_MAX, MPI_BYTE, 0, MPI_COMM_WORLD);
        MPI_Barrier(MPI_COMM_WORLD);
    } else {
        /* Non-root ranks keep receiving chunks until the all-zero
           termination chunk arrives. */
        while (1) {
            MPI_Bcast(msg, ORTE_IOF_BASE_MSG_MAX, MPI_BYTE, 0, MPI_COMM_WORLD);
            fprintf(stderr, "Rank %d: recvd blob %d\n", rank, pos);
            ++pos;
            /* Check whether this chunk is the all-zero terminator. */
            done = true;
            for (i = 0; i < ORTE_IOF_BASE_MSG_MAX; i++) {
                if (0 != msg[i]) {
                    done = false;
                    break;
                }
            }
            if (done) {
                break;
            }
        }
        fprintf(stderr, "Rank %d: recv done\n", rank);
        MPI_Barrier(MPI_COMM_WORLD);
    }

    fprintf(stderr, "Rank %d has completed bcast\n", rank);
    MPI_Finalize();
    return 0;
}
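
If you want to try it, something along these lines should work (the file name 
stdin_test.c and the process count here are just examples):

    mpicc stdin_test.c -o stdin_test
    mpirun -np 4 ./stdin_test < some_input_file

Each rank reports the blobs it receives on stderr, so you can see where things 
stop if it hangs.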



> On Aug 22, 2016, at 3:40 PM, Jingchao Zhang <zh...@unl.edu> wrote:
> 
> This might be a thin argument, but we have had many users running mpirun this 
> way for years with no problem until this recent upgrade. And some home-brewed 
> MPI codes do not even have a standard way to read input files. Last time 
> I checked, the Open MPI manual still claims it supports stdin 
> (https://www.open-mpi.org/doc/v2.0/man1/mpirun.1.php#sect14). Maybe I missed 
> it, but the v2.0 release notes did not mention any change to the behavior of 
> stdin either.
> 
> We can tell our users to run mpirun in the suggested way, but I do hope 
> someone can look into the issue and fix it.
> 
> Dr. Jingchao Zhang
> Holland Computing Center
> University of Nebraska-Lincoln
> 402-472-6400
> From: users <users-boun...@lists.open-mpi.org> on behalf of r...@open-mpi.org
> Sent: Monday, August 22, 2016 3:04:50 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] stdin issue with openmpi/2.0.0
>  
> Well, I can try to find time to take a look. However, I will reiterate what 
> Jeff H said - it is very unwise to rely on IO forwarding. It is much better to 
> just read the file directly, unless that file is simply unavailable on the 
> node where rank=0 is running.
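> 
> In case it is useful, here is a rough sketch of that approach: rank 0 reads 
> the whole input file and broadcasts it, so nothing depends on stdin 
> forwarding. Error handling is omitted, the file name is just the one from 
> your command line, and it assumes the file fits in memory and in an int-sized 
> broadcast count:
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <mpi.h>
> 
> int main(int argc, char *argv[])
> {
>     int rank;
>     long len = 0;
>     char *buf = NULL;
> 
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> 
>     if (0 == rank) {
>         /* rank 0 reads the entire input file into memory
>            ("in.snr" is just the example file name) */
>         FILE *fp = fopen("in.snr", "rb");
>         fseek(fp, 0, SEEK_END);
>         len = ftell(fp);
>         rewind(fp);
>         buf = malloc(len);
>         fread(buf, 1, len, fp);
>         fclose(fp);
>     }
> 
>     /* everyone learns the size, then receives the contents */
>     MPI_Bcast(&len, 1, MPI_LONG, 0, MPI_COMM_WORLD);
>     if (0 != rank) {
>         buf = malloc(len);
>     }
>     MPI_Bcast(buf, (int)len, MPI_BYTE, 0, MPI_COMM_WORLD);
> 
>     /* ... every rank can now parse buf as the input ... */
> 
>     free(buf);
>     MPI_Finalize();
>     return 0;
> }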
> 
>> On Aug 22, 2016, at 1:55 PM, Jingchao Zhang <zh...@unl.edu> wrote:
>> 
>> Here you can find the source code for the LAMMPS input parser: 
>> https://github.com/lammps/lammps/blob/r13864/src/input.cpp
>> Based on the gdb output, rank 0 is stuck at line 167,
>> if (fgets(&line[m],maxline-m,infile) == NULL)
>> and the rest of the ranks are stuck at line 203,
>> MPI_Bcast(&n,1,MPI_INT,0,world);
>> 
>> So rank 0 appears to hang in the fgets() call.
>> 
>> Here are the whole backtrace information:
>> $ cat master.backtrace worker.backtrace
>> #0  0x0000003c37cdb68d in read () from /lib64/libc.so.6
>> #1  0x0000003c37c71ca8 in _IO_new_file_underflow () from /lib64/libc.so.6
>> #2  0x0000003c37c737ae in _IO_default_uflow_internal () from /lib64/libc.so.6
>> #3  0x0000003c37c67e8a in _IO_getline_info_internal () from /lib64/libc.so.6
>> #4  0x0000003c37c66ce9 in fgets () from /lib64/libc.so.6
>> #5  0x00000000005c5a43 in LAMMPS_NS::Input::file() () at ../input.cpp:167
>> #6  0x00000000005d4236 in main () at ../main.cpp:31
>> #0  0x00002b1635d2ace2 in poll_dispatch () from 
>> /util/opt/openmpi/2.0.0/gcc/6.1.0/lib/libopen-pal.so.20
>> #1  0x00002b1635d1fa71 in opal_libevent2022_event_base_loop ()
>>    from /util/opt/openmpi/2.0.0/gcc/6.1.0/lib/libopen-pal.so.20
>> #2  0x00002b1635ce4634 in opal_progress () from 
>> /util/opt/openmpi/2.0.0/gcc/6.1.0/lib/libopen-pal.so.20
>> #3  0x00002b16351b8fad in ompi_request_default_wait () from 
>> /util/opt/openmpi/2.0.0/gcc/6.1.0/lib/libmpi.so.20
>> #4  0x00002b16351fcb40 in ompi_coll_base_bcast_intra_generic ()
>>    from /util/opt/openmpi/2.0.0/gcc/6.1.0/lib/libmpi.so.20
>> #5  0x00002b16351fd0c2 in ompi_coll_base_bcast_intra_binomial ()
>>    from /util/opt/openmpi/2.0.0/gcc/6.1.0/lib/libmpi.so.20
>> #6  0x00002b1644fa6d9b in ompi_coll_tuned_bcast_intra_dec_fixed ()
>>    from /util/opt/openmpi/2.0.0/gcc/6.1.0/lib/openmpi/mca_coll_tuned.so
>> #7  0x00002b16351cb4fb in PMPI_Bcast () from 
>> /util/opt/openmpi/2.0.0/gcc/6.1.0/lib/libmpi.so.20
>> #8  0x00000000005c5b5d in LAMMPS_NS::Input::file() () at ../input.cpp:203
>> #9  0x00000000005d4236 in main () at ../main.cpp:31
>> 
>> Thanks,
>> 
>> Dr. Jingchao Zhang
>> Holland Computing Center
>> University of Nebraska-Lincoln
>> 402-472-6400
>> From: users <users-boun...@lists.open-mpi.org> on behalf of r...@open-mpi.org
>> Sent: Monday, August 22, 2016 2:17:10 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] stdin issue with openmpi/2.0.0
>>  
>> Hmmm...perhaps we can break this out a bit? The stdin will be going to your 
>> rank=0 proc. It sounds like you have some subsequent step that calls 
>> MPI_Bcast?
>> 
>> Can you first verify that the input is being correctly delivered to rank=0? 
>> That will help us isolate whether the problem is in the IO forwarding or in 
>> the subsequent Bcast.
>> 
>>> On Aug 22, 2016, at 1:11 PM, Jingchao Zhang <zh...@unl.edu> wrote:
>>> 
>>> Hi all,
>>> 
>>> We compiled openmpi/2.0.0 with gcc/6.1.0 and intel/13.1.3. Both builds 
>>> behave oddly when trying to read from standard input.
>>> 
>>> For example, if we start the application LAMMPS across 4 nodes, each with 
>>> 16 cores, connected by Intel QDR InfiniBand, mpirun works fine the first 
>>> time, but always gets stuck within a few seconds thereafter.
>>> Command:
>>> mpirun ./lmp_ompi_g++ < in.snr
>>> in.snr is the LAMMPS input file; the compiler is gcc/6.1.0.
>>> 
>>> If we instead use
>>> mpirun ./lmp_ompi_g++ -in in.snr
>>> it works 100% of the time.
>>> 
>>> Some odd behaviors we have gathered so far:
>>> 1. For single-node jobs, stdin always works.
>>> 2. For multiple nodes, stdin works intermittently when the number of cores 
>>> per node is relatively small. For example, with 2/3/4 nodes and 8 cores per 
>>> node, mpirun works most of the time. But with more than 8 cores per node, 
>>> mpirun works the first time, then always gets stuck. There seems to be a 
>>> magic number beyond which it stops working.
>>> 3. We tested Quantum ESPRESSO with the intel/13 compiler and had the same 
>>> issue.
>>> 
>>> We used gdb to debug and found that when mpirun was stuck, the rest of the 
>>> processes were all waiting on an MPI broadcast from rank 0. The 
>>> LAMMPS binary, input file, and gdb core files (example.tar.bz2) can be 
>>> downloaded from this link: 
>>> https://drive.google.com/open?id=0B3Yj4QkZpI-dVWZtWmJ3ZXNVRGc
>>> 
>>> Extra information:
>>> 1. The job scheduler is Slurm.
>>> 2. configure setup:
>>> ./configure     --prefix=$PREFIX \
>>>                 --with-hwloc=internal \
>>>                 --enable-mpirun-prefix-by-default \
>>>                 --with-slurm \
>>>                 --with-verbs \
>>>                 --with-psm \
>>>                 --disable-openib-connectx-xrc \
>>>                 --with-knem=/opt/knem-1.1.2.90mlnx1 \
>>>                 --with-cma
>>> 3. openmpi-mca-params.conf contents:
>>> orte_hetero_nodes=1
>>> hwloc_base_binding_policy=core
>>> rmaps_base_mapping_policy=core
>>> opal_cuda_support=0
>>> btl_openib_use_eager_rdma=0
>>> btl_openib_max_eager_rdma=0
>>> btl_openib_flags=1
>>> 
>>> Thanks,
>>> Jingchao 
>>> 
>>> Dr. Jingchao Zhang
>>> Holland Computing Center
>>> University of Nebraska-Lincoln
>>> 402-472-6400
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
