Hi Jeff,
Unfortunately the workaround with "--mca regex naive" does not change
the behaviour. I'm going to investigate the Open MPI source files as
suggested by Gilles.

Patrick

On 16/06/2022 at 17:43, Jeff Squyres (jsquyres) wrote:
Ah; this is a slightly different error than what Gilles was guessing from your
prior description. This is what you're running into:
https://github.com/open-mpi/ompi/blob/v4.0.x/orte/mca/regx/fwd/regx_fwd.c#L130-L134
Try running with:
mpirun --mca regex naive ...
Specifically: the "fwd" regex component is selected by default, but it has certain
expectations about the format of hostnames. Try using the "naive" regex component,
instead.
--
Jeff Squyres
jsquy...@cisco.com
________________________________________
From: Patrick Begou <patrick.be...@univ-grenoble-alpes.fr>
Sent: Thursday, June 16, 2022 9:48 AM
To: Jeff Squyres (jsquyres); Open MPI Users
Subject: Re: [OMPI users] OpenMPI and names of the nodes in a cluster
Hi Gilles and Jeff,
@Gilles I will have a look at these files, thanks.
@Jeff this is the error message (screen dump attached), and of course the node
names do not comply with the standard.
Patrick
On 16/06/2022 at 14:30, Jeff Squyres (jsquyres) wrote:
What exactly is the error that is occurring?
--
Jeff Squyres
jsquy...@cisco.com
________________________________________
From: users <users-boun...@lists.open-mpi.org> on
behalf of Patrick Begou via users <users@lists.open-mpi.org>
Sent: Thursday, June 16, 2022 3:21 AM
To: Open MPI Users
Cc: Patrick Begou
Subject: [OMPI users] OpenMPI and names of the nodes in a cluster
Hi all,
we are facing a serious problem with Open MPI (4.0.2) that we have
deployed on a cluster. We do not manage this large cluster, and the names
of the nodes do not comply with Internet standards for host names: they
contain a "_" (underscore) character.
So Open MPI complains about this and does not run.
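
To illustrate the naming rule in question, here is a minimal sketch of a
host name check based on the usual RFC 952 / RFC 1123 convention (labels made
of letters, digits and hyphens only). The function name and the sample node
names are hypothetical, and this is not the check that Open MPI itself
performs; it only shows why a name containing "_" is considered non-standard.

```python
import re

# One DNS label per RFC 1123: letters, digits and hyphens only,
# 1 to 63 characters, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_standard_hostname(name: str) -> bool:
    """Return True if every dot-separated label of `name` follows the rule above."""
    if not name or len(name) > 253:
        return False
    return all(_LABEL.match(label) for label in name.split("."))


# Hypothetical node names, for illustration only.
for node in ("node-01", "node_01"):
    print(node, is_standard_hostname(node))
```

With these sample names, the hyphenated one passes and the one with an
underscore is rejected.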
I've tried to use IP addresses instead of host names in the host file, without any
success.
Is there a known workaround for this? Asking the administrators to
change the node names on this large cluster may be difficult.
Thanks
Patrick