[OMPI users] error while running mpirun
Dear Open MPI users,

I am using the following system:

  CentOS release 6.6
  Rocks 6.2

I have been trying to install openmpi-3.1.3. After installing it, I tried a test run in the examples folder and got the following error.

I ran the command mpirun -np 1 hello_c:

[user@chimera examples]$ mpirun -np 1 hello_c
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
--------------------------------------------------------------------------
As of version 3.0.0, the "sm" BTL is no longer available in Open MPI.

Efficient, high-speed same-node shared memory communication support in
Open MPI is available in the "vader" BTL. To use the vader BTL, you
can re-run your job with:

  mpirun --mca btl vader,self,... your_mpi_application
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.

  Host:      chimera.pnc.res.in
  Framework: btl
  Component: sm
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  mca_bml_base_open() failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
[chimera:04271] *** An error occurred in MPI_Init
[chimera:04271] *** reported by process [1310326785,0]
[chimera:04271] *** on a NULL communicator
[chimera:04271] *** Unknown error
[chimera:04271] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[chimera:04271] ***    and potentially your MPI job)

I am quite new to this, so please let me know if you need additional information about the system. Any advice is welcome.

Srijan

--
SRIJAN CHATTERJEE
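[A concrete instance of the re-run hint in the help text above. The trailing ",..." in the hint needs real BTL names; "tcp" here is an assumption about which BTLs were built, not something confirmed in the thread:

  # Override the BTL list for this run only, so the stale "sm" request
  # from the configuration is ignored; command-line MCA settings take
  # precedence over environment variables and parameter files.
  mpirun --mca btl vader,self,tcp -np 1 hello_c
]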
Re: [OMPI users] Any reason why I can't start an mpirun job from within an mpi process?
Doing the fork/exec with an environment that omitted the variables whose names started with either 'OMPI' or 'PMIX' did the trick.

On Sat, Jul 11, 2020 at 6:04 PM John Retterer wrote:
> Thank you, Ralph. I was afraid that was the reason it didn't work, but I
> wanted to be sure.
>
> John
>
> On Sat, Jul 11, 2020 at 5:37 PM Ralph Castain via users <
> users@lists.open-mpi.org> wrote:
>
>> You cannot cascade mpirun cmds like that - the child mpirun picks up
>> envars that cause it to break. You'd have to either use comm_spawn to
>> start the child job, or do a fork/exec where you can set the environment
>> to be some pristine set of values.
>>
>>
>> > On Jul 11, 2020, at 1:12 PM, John Retterer via users <
>> > users@lists.open-mpi.org> wrote:
>> >
>> > From the rank #0 process, I wish to start another mpi job (to create
>> > some data to be used by the parent job).
>> > I'm trying to do this with the command
>> >
>> >   istat = system( "/...path.../mpirun -np 2 /...path.../prog2 args > file.log 2>&1" )
>> >
>> > within the code executed by the rank #0 process, where ...path... is
>> > the path to the executables, prog2 is the daughter program, and args its
>> > arguments.
>> >
>> > On return, the status istat = 256, and, although the log file file.log
>> > has been created, it is empty.
>> >
>> > Does anybody have an idea why this doesn't work?
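[A minimal sketch of the scrubbed-environment launch described above, written as a shell command that could go inside the system() call. GNU/POSIX env is assumed, and the /path/to/ prefixes are hypothetical placeholders for the elided paths:

  # env -i starts the child with an empty environment, so none of the
  # OMPI_*/PMIX_* variables set by the parent mpirun reach the child mpirun.
  # PATH and LD_LIBRARY_PATH are re-added because mpirun and prog2 still
  # need them; the paths below are placeholders, not real install locations.
  env -i PATH="$PATH" LD_LIBRARY_PATH="$LD_LIBRARY_PATH" \
      /path/to/mpirun -np 2 /path/to/prog2 args > file.log 2>&1

The same effect can be had more selectively, keeping the rest of the environment and dropping only the variables whose names start with OMPI or PMIX, which is what the fork/exec approach above does in code.]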
Re: [OMPI users] error while running mpirun
Srijan,

The logs suggest you explicitly requested the btl/sm component. This typically happens via an openmpi-mca-params.conf file (containing a line such as btl = sm,openib,self) or via the OMPI_MCA_btl environment variable.

Cheers,

Gilles

On Mon, Jul 13, 2020 at 1:50 AM Srijan Chatterjee via users <
users@lists.open-mpi.org> wrote:
> Dear Open MPI users,
>
> I am using the following system:
>
>   CentOS release 6.6
>   Rocks 6.2
>
> I have been trying to install openmpi-3.1.3. After installing it, I tried
> a test run in the examples folder and got the following error.
>
> I ran the command mpirun -np 1 hello_c:
>
> [user@chimera examples]$ mpirun -np 1 hello_c
> libibverbs: Warning: no userspace device-specific driver found for
> /sys/class/infiniband_verbs/uverbs0
> libibverbs: Warning: no userspace device-specific driver found for
> /sys/class/infiniband_verbs/uverbs0
> --------------------------------------------------------------------------
> As of version 3.0.0, the "sm" BTL is no longer available in Open MPI.
>
> Efficient, high-speed same-node shared memory communication support in
> Open MPI is available in the "vader" BTL. To use the vader BTL, you
> can re-run your job with:
>
>   mpirun --mca btl vader,self,... your_mpi_application
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> A requested component was not found, or was unable to be opened. This
> means that this component is either not installed or is unable to be
> used on your system (e.g., sometimes this means that shared libraries
> that the component requires are unable to be found/loaded). Note that
> Open MPI stopped checking at the first component that it did not find.
>
>   Host:      chimera.pnc.res.in
>   Framework: btl
>   Component: sm
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
>   mca_bml_base_open() failed
>   --> Returned "Not found" (-13) instead of "Success" (0)
> --------------------------------------------------------------------------
> [chimera:04271] *** An error occurred in MPI_Init
> [chimera:04271] *** reported by process [1310326785,0]
> [chimera:04271] *** on a NULL communicator
> [chimera:04271] *** Unknown error
> [chimera:04271] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
> will now abort,
> [chimera:04271] ***    and potentially your MPI job)
>
> I am quite new to this, so please let me know if you need additional
> information about the system.
> Any advice is welcome.
>
> Srijan
>
> --
> SRIJAN CHATTERJEE
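[For reference, the usual places a btl setting like the one Gilles describes can come from, listed here in the order Open MPI gives them precedence; the system-wide path assumes a default /usr/local install prefix, and the per-user file may not exist:

  # 1. Environment variable (highest of the three):
  env | grep OMPI_MCA_btl
  # 2. Per-user parameter file (if present):
  grep '^btl' "$HOME/.openmpi/mca-params.conf"
  # 3. System-wide parameter file under the install prefix:
  grep '^btl' /usr/local/etc/openmpi-mca-params.conf

Removing sm from whichever line turns up, or replacing it with vader, should let MPI_Init complete.]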