Howard,

I don’t know where that ^X following the hostname came from. The node is
definitely named n001. I will try to create a reproducer.
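
For reference, ^X is caret notation for the ASCII control byte 0x18 (CAN), so the requested host string most likely carried one extra byte after "n001". The following is only a minimal sketch of how a host string could be inspected for such bytes before it is handed to the spawn call. The function names and the sample value are hypothetical, and nothing here is taken from the actual application or from Open MPI.

```python
def caret_notation(text):
    """Render ASCII control characters in caret notation (0x18 -> ^X)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20:
            # Control bytes 0x00-0x1F map to '^' plus the character 0x40 higher.
            out.append("^" + chr(code + 0x40))
        elif code == 0x7F:
            # DEL is conventionally written as ^?
            out.append("^?")
        else:
            out.append(ch)
    return "".join(out)


def find_control_chars(hostname):
    """Return (index, byte value) for each ASCII control character found."""
    return [
        (i, ord(ch))
        for i, ch in enumerate(hostname)
        if ord(ch) < 0x20 or ord(ch) == 0x7F
    ]


if __name__ == "__main__":
    # Hypothetical value that reproduces the name shown in the error message.
    suspect = "n001\x18"
    print(caret_notation(suspect))      # n001^X
    print(find_control_chars(suspect))  # [(4, 24)]
```

An empty list from find_control_chars would suggest the string was clean at the point of the check, which would point the search elsewhere.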

Thanks,
Kurt

From: Pritchard Jr., Howard <[email protected]>
Sent: Monday, July 1, 2024 11:03 AM
To: Open MPI Users <[email protected]>
Cc: Mccall, Kurt E. (MSFC-EV41) <[email protected]>
Subject: Re: [EXTERNAL] [OMPI users] Slurm or OpenMPI error?

Hello Kurt,

The host name looks a little odd.  Do you by chance have a reproducer and 
instructions on how you’re running it that we could try?

Howard

From: users <[email protected]> on behalf of "Mccall, Kurt E. (MSFC-EV41) via users" <[email protected]>
Reply-To: Open MPI Users <[email protected]>
Date: Monday, July 1, 2024 at 9:36 AM
To: "OpenMpi User List ([email protected])" <[email protected]>
Cc: "Mccall, Kurt E. (MSFC-EV41)" <[email protected]>
Subject: [EXTERNAL] [OMPI users] Slurm or OpenMPI error?

Using OpenMPI 5.0.3 and Slurm 20.11.8.

Is this error message issued by Slurm or by OpenMPI?  A Google search on the 
error message yielded nothing.

--------------------------------------------------------------------------
At least one of the requested hosts is not included in the current
allocation.

   Missing requested host: n001^X

Please check your allocation or your request.
--------------------------------------------------------------------------



Following that error, MPI_Comm_spawn failed on the named node, n001.


[n001:00000] *** An error occurred in MPI_Comm_spawn
[n001:00000] *** reported by process [595787777,0]
[n001:00000] *** on communicator MPI_COMM_SELF
[n001:00000] *** MPI_ERR_UNKNOWN: unknown error
[n001:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[n001:00000] ***    and MPI will try to terminate your MPI job as well)
^@1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
^@1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal

Thanks,
Kurt
