Yeah, it looks like orte-clean is busted in 4.0.x and 4.1.x.  I have filed 
https://github.com/open-mpi/ompi/issues/9171 to track the issue.

FWIW, orte-clean shouldn't be necessary for normal Open MPI runs.  We should 
fix it (or remove it), of course, but this shouldn't be a blocker for whatever 
you're trying to do with Open MPI.
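
A minimal sketch, assuming version strings shaped like the one reported below (4.0.3-0ubuntu1), of how an admin might check whether an installed package falls in the affected 4.0.x / 4.1.x series. The function name and the parsing rule are hypothetical and are not part of Open MPI:

```python
import re

# Release series reported as affected in this thread (4.0.x and 4.1.x).
AFFECTED_SERIES = {(4, 0), (4, 1)}


def is_affected(version: str) -> bool:
    """Return True if `version` belongs to one of the affected series.

    Only the leading "major.minor" part is inspected, so distro suffixes
    such as "-0ubuntu1" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) in AFFECTED_SERIES


if __name__ == "__main__":
    # The Ubuntu 20.04 package version quoted in the original report.
    print(is_affected("4.0.3-0ubuntu1"))
```

This only classifies a version string. It does not query the installed package or run orte-clean.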


On Jul 19, 2021, at 5:32 PM, Sage Imel via users 
<users@lists.open-mpi.org> wrote:

Hello,
Whenever I try to run orte-clean on any Ubuntu 20.04 system it errors out with 
an "OPAL ERROR: Unreachable" error. This is using the system packages for 
openmpi, which is version 4.0.3-0ubuntu1. The full error message is provided at 
the end of this email. I've seen this both on the fully managed servers in my 
environment, and on freshly loaded Ubuntu x86_64 VMs.

Am I missing something? Is there some specific configuration that is needed 
to make this work, or is this known/expected behavior? I'm not really a user 
of Open MPI myself, but I'm a sysadmin supporting researchers who use this 
command to clean up failed/finished jobs on their servers.

Any hints you can provide would be much appreciated.

--
Sage Imel <s...@cat.pdx.edu>

[REDACTED:1918375] OPAL ERROR: Unreachable in file ext3x_client.c at line 252
[REDACTED:1918375] [[INVALID],INVALID] ORTE_ERROR_LOG: Unreachable in file base/ess_base_std_tool.c at line 142
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_pmix.init failed
  --> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[REDACTED:1918375] [[INVALID],INVALID] ORTE_ERROR_LOG: Unreachable in file ess_tool_module.c at line 129
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_init failed
  --> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------


--
Jeff Squyres
jsquy...@cisco.com


