Howard,

I don’t know where that ^X following the hostname came from. The node is
definitely named n001. I will try to create a reproducer.
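
For reference, a minimal reproducer would presumably look something like the sketch below. It assumes the parent selects the target node through an MPI_Info "host" key and spawns a single child; the worker program name "./worker" is a placeholder, not the actual code.

/* spawn_repro.c -- hypothetical minimal reproducer (assumes the target
 * node is selected via an MPI_Info "host" key; "./worker" is a
 * placeholder).  Build with mpicc and run inside a Slurm allocation
 * that includes n001. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Info info;
    MPI_Info_create(&info);
    /* If this string carried a trailing control character (e.g. 0x18 / ^X),
     * it would no longer match any host in the allocation. */
    MPI_Info_set(info, "host", "n001");

    MPI_Comm intercomm;
    int errcode;
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 1, info, 0,
                   MPI_COMM_SELF, &intercomm, &errcode);

    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}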

Thanks,
Kurt

From: Pritchard Jr., Howard <howa...@lanl.gov>
Sent: Monday, July 1, 2024 11:03 AM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mcc...@nasa.gov>
Subject: Re: [EXTERNAL] [OMPI users] Slurm or OpenMPI error?

Hello Kurt,

The host name looks a little odd.  Do you by chance have a reproducer and 
instructions on how you’re running it that we could try?

Howard

From: users <users-boun...@lists.open-mpi.org> on behalf of "Mccall, Kurt E. (MSFC-EV41) via users" <users@lists.open-mpi.org>
Reply-To: Open MPI Users <users@lists.open-mpi.org>
Date: Monday, July 1, 2024 at 9:36 AM
To: "OpenMpi User List (users@lists.open-mpi.org)" <users@lists.open-mpi.org>
Cc: "Mccall, Kurt E. (MSFC-EV41)" <kurt.e.mcc...@nasa.gov>
Subject: [EXTERNAL] [OMPI users] Slurm or OpenMPI error?

Using Open MPI 5.0.3 and Slurm 20.11.8.

Is this error message issued by Slurm or by Open MPI?  A Google search on the
error message yielded nothing.

--------------------------------------------------------------------------
At least one of the requested hosts is not included in the current
allocation.

   Missing requested host: n001^X

Please check your allocation or your request.
--------------------------------------------------------------------------



Following that error, MPI_Comm_spawn failed on the named node, n001.


[n001:00000] *** An error occurred in MPI_Comm_spawn
[n001:00000] *** reported by process [595787777,0]
[n001:00000] *** on communicator MPI_COMM_SELF
[n001:00000] *** MPI_ERR_UNKNOWN: unknown error
[n001:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[n001:00000] ***    and MPI will try to terminate your MPI job as well)
^@1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
^@1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal

Thanks,
Kurt
