Hi,
after following the instructions in the error message, in other words
running it like this:

#!/bin/bash
#PBS -l nodes=1:ppn=32
#PBS -N mc_cond_0_h3
#PBS -o mc_cond_0_h3.o
#PBS -e mc_cond_0_h3.e

PATH=$HOME/libraries/compiled_with_gcc-7.3.0/openmpi-4.0.2/bin:$PATH
LD_LIBRARY_PATH=/share/apps/gcc-7.3.0/lib64:$LD_LIBRARY_PATH
cd $PBS_O_WORKDIR
mpirun --mca btl vader,self -np 32 ./flash4

I get the following error messages:

[mpiexec@compute-0-34.local] match_arg (utils/args/args.c:159): unrecognized argument mca
[mpiexec@compute-0-34.local] HYDU_parse_array (utils/args/args.c:174): argument matching returned error
[mpiexec@compute-0-34.local] parse_args (ui/mpich/utils.c:1596): error parsing input array
[mpiexec@compute-0-34.local] HYD_uii_mpx_get_parameters (ui/mpich/utils.c:1648): unable to parse user arguments
[mpiexec@compute-0-34.local] main (ui/mpich/mpiexec.c:149): error parsing parameters
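
(For what it's worth, the file paths in those messages, e.g. ui/mpich/utils.c, look like they come from MPICH's mpiexec rather than from Open MPI's mpirun, so possibly the job is not picking up the openmpi-4.0.2 build I put on PATH. A minimal check I could add to the same job script, assuming the same PATH line, would be:

which mpirun
mpirun --version
ompi_info | head

to confirm which installation is actually being used.)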

Am I running it incorrectly?
Cheers,

On Tue, Dec 10, 2019 at 15:40, Guido granda muñoz (<guidogra...@gmail.com>) wrote:

> Hello,
> I have now compiled the application using openmpi-4.0.2. These are the shared libraries it links against:
>
>  linux-vdso.so.1 =>  (0x00007fffb23ff000)
> libhdf5.so.103 => /home/guido/libraries/compiled_with_gcc-7.3.0/hdf5-1.10.5_serial/lib/libhdf5.so.103 (0x00002b3cd188c000)
> libz.so.1 => /lib64/libz.so.1 (0x00002b3cd1e74000)
> libmpi_usempif08.so.40 => /home/guido/libraries/compiled_with_gcc-7.3.0/openmpi-4.0.2/lib/libmpi_usempif08.so.40 (0x00002b3cd208a000)
> libmpi_usempi_ignore_tkr.so.40 => /home/guido/libraries/compiled_with_gcc-7.3.0/openmpi-4.0.2/lib/libmpi_usempi_ignore_tkr.so.40 (0x00002b3cd22c0000)
> libmpi_mpifh.so.40 => /home/guido/libraries/compiled_with_gcc-7.3.0/openmpi-4.0.2/lib/libmpi_mpifh.so.40 (0x00002b3cd24c7000)
> libmpi.so.40 => /home/guido/libraries/compiled_with_gcc-7.3.0/openmpi-4.0.2/lib/libmpi.so.40 (0x00002b3cd2723000)
> libgfortran.so.4 => /share/apps/gcc-7.3.0/lib64/libgfortran.so.4 (0x00002b3cd2a55000)
> libm.so.6 => /lib64/libm.so.6 (0x00002b3cd2dc3000)
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002b3cd3047000)
> libquadmath.so.0 => /share/apps/gcc-5.4.0/lib64/libquadmath.so.0 (0x00002b3cd325e000)
> libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b3cd349c000)
> libc.so.6 => /lib64/libc.so.6 (0x00002b3cd36b9000)
> librt.so.1 => /lib64/librt.so.1 (0x00002b3cd3a4e000)
> libdl.so.2 => /lib64/libdl.so.2 (0x00002b3cd3c56000)
> libopen-rte.so.40 => /home/guido/libraries/compiled_with_gcc-7.3.0/openmpi-4.0.2/lib/libopen-rte.so.40 (0x00002b3cd3e5b000)
> libopen-pal.so.40 => /home/guido/libraries/compiled_with_gcc-7.3.0/openmpi-4.0.2/lib/libopen-pal.so.40 (0x00002b3cd4110000)
> libudev.so.0 => /lib64/libudev.so.0 (0x00002b3cd4425000)
> libutil.so.1 => /lib64/libutil.so.1 (0x00002b3cd4634000)
> /lib64/ld-linux-x86-64.so.2 (0x00002b3cd166a000)
>
> and ran it like this:
>
> #!/bin/bash
> #PBS -l nodes=1:ppn=32
> #PBS -N mc_cond_0_h3
> #PBS -o mc_cond_0_h3.o
> #PBS -e mc_cond_0_h3.e
>
> PATH=$HOME/libraries/compiled_with_gcc-7.3.0/openmpi-4.0.2/bin:$PATH
> LD_LIBRARY_PATH=/share/apps/gcc-7.3.0/lib64:$LD_LIBRARY_PATH
> cd $PBS_O_WORKDIR
> mpirun -np 32 ./flash4
>
> and now I'm getting these error messages:
>
> --------------------------------------------------------------------------
> As of version 3.0.0, the "sm" BTL is no longer available in Open MPI.
>
> Efficient, high-speed same-node shared memory communication support in
> Open MPI is available in the "vader" BTL. To use the vader BTL, you
> can re-run your job with:
>
> mpirun --mca btl vader,self,... your_mpi_application
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> A requested component was not found, or was unable to be opened. This
> means that this component is either not installed or is unable to be
> used on your system (e.g., sometimes this means that shared libraries
> that the component requires are unable to be found/loaded). Note that
> Open MPI stopped checking at the first component that it did not find.
>
> Host: compute-0-34.local
> Framework: btl
> Component: sm
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> mca_bml_base_open() failed
> --> Returned "Not found" (-13) instead of "Success" (0)
> --------------------------------------------------------------------------
> [compute-0-34:16915] *** An error occurred in MPI_Init
> [compute-0-34:16915] *** reported by process [3776708609,5]
> [compute-0-34:16915] *** on a NULL communicator
> [compute-0-34:16915] *** Unknown error
> [compute-0-34:16915] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [compute-0-34:16915] *** and potentially your MPI job)
> [compute-0-34.local:16902] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 2147
> [the line above was repeated 20 times in total]
> [compute-0-34.local:16902] 31 more processes have sent help message help-mpi-btl-sm.txt / btl sm is dead
> [compute-0-34.local:16902] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [compute-0-34.local:16902] 31 more processes have sent help message help-mca-base.txt / find-available:not-valid
> [compute-0-34.local:16902] 31 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:internal-failure
> [compute-0-34.local:16902] 31 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
>
> /var/spool/torque/mom_priv/jobs/4110.mouruka.crya.privado.SC: line 11: /home/guido: is a directory
>
>
> Do you know what could cause this error?
>
>
> Thank you,
>
>
>
> On Fri, Dec 6, 2019 at 12:13, Jeff Squyres (jsquyres) (<jsquy...@cisco.com>) wrote:
>
>> On Dec 6, 2019, at 1:03 PM, Jeff Squyres (jsquyres) via users <
>> users@lists.open-mpi.org> wrote:
>> >
>> >> I get the same error when running on a single node. I will try to use the latest version. Is there a way to check if different versions of Open MPI were used on different nodes?
>> >
>> > mpirun -np 2 ompi_info | head
>> >
>> > Or something like that.  With 1.10, I don't know/remember the mpirun CLI option to make one process per node (when ppn>1); you may have to check that.  Or just "mpirun -np 33 ompi_info | head" and examine the output carefully to find the 33rd output and see if it's different.
>>
>> Poor quoting on my part.  The intent was to see just the first few lines
>> from running `ompi_info` on each node.
>>
>> So maybe something like:
>>
>> ------
>> $ cat foo.sh
>> #!/bin/sh
>> ompi_info | head
>> $ mpirun -np 2 foo.sh
>> ------
>>
>> Or "mprun -np 33 foo.sh", ....etc.
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>>
>>
>
> --
> Guido
>


-- 
Guido
