Good afternoon,

I'm getting an error message that I'm not sure how to use to debug an
issue. I'll try to give you all of the pertinent information about the
setup, but I neither built the system nor installed the software. It's an
NVIDIA SuperPOD system with Base Command Manager 10.0.

I'm building IOR but I'm really interested in mdtest. "module list" says
I'm using the following modules:

gcc/64/4.1.5a1
ucx/1.10.1
openmpi4/gcc/4.1.5

There are no problems building the code.

I'm running mdtest through Slurm with a script. The output from the script
and Slurm follows (the command used to run it is included).


/cm/shared/apps/openmpi4/gcc/4.1.5/bin/mpirun --mca btl '^openib' -np 1 -map-by ppr:1:node --allow-run-as-root --mca btl_openib_warn_default_gid_prefix 0 --mca btl_openib_if_exclude mlx5_0,mlx5_5,mlx5_6 --mca plm_base_verbose 0 --mca plm rsh /home/bcm/bin/bin/mdtest -i 3 -I 4 -z 3 -b 8 -u -u -d /raid/bcm/mdtest
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      dgx-14
Framework: pml
Component: ucx
--------------------------------------------------------------------------
[dgx-14:4055623] [[42340,0],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/show_help.c at line 501
[dgx-14:4055632] *** An error occurred in MPI_Init
[dgx-14:4055632] *** reported by process [2774794241,0]
[dgx-14:4055632] *** on a NULL communicator
[dgx-14:4055632] *** Unknown error
[dgx-14:4055632] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[dgx-14:4055632] ***    and potentially your MPI job)


Any pointers or help would be greatly appreciated.

Thanks!

Jeff



