Ah; this is a slightly different error than what Gilles was guessing from your 
prior description.  This is what you're running into: 
https://github.com/open-mpi/ompi/blob/v4.0.x/orte/mca/regx/fwd/regx_fwd.c#L130-L134

Try running with:

mpirun --mca regx naive ...

Specifically: the "fwd" regx component is selected by default, but it has 
certain expectations about the format of hostnames.  Try using the "naive" 
regx component instead.
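
If passing the option on every mpirun command line is inconvenient, the same 
selection can usually be made through Open MPI's other standard MCA parameter 
mechanisms.  A minimal sketch, assuming the framework is named "regx" (as in 
the source path linked above); the application name and process count in the 
comments are placeholders, not taken from this thread:

```shell
# Option 1: per invocation, on the command line.
#   mpirun --mca regx naive -np 4 ./my_app      (./my_app is a placeholder)

# Option 2: environment variable, picked up by later mpirun calls in this shell.
export OMPI_MCA_regx=naive

# Option 3: per-user parameter file; add this single line to
# $HOME/.openmpi/mca-params.conf:
#   regx = naive

# To see which regx components your installation provides:
#   ompi_info | grep regx
```

The command-line form is the easiest to test first; the environment variable 
or the parameter file only saves retyping once the workaround is confirmed.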

-- 
Jeff Squyres
[email protected]

________________________________________
From: Patrick Begou <[email protected]>
Sent: Thursday, June 16, 2022 9:48 AM
To: Jeff Squyres (jsquyres); Open MPI Users
Subject: Re: [OMPI users] OpenMPI and names of the nodes in a cluster

Hi Gilles and Jeff,

@Gilles I will have a look at these files, thanks.

@Jeff this is the error message (screen dump attached) and of course the node 
names do not agree with the standard.

Patrick

On 16/06/2022 at 14:30, Jeff Squyres (jsquyres) wrote:

What exactly is the error that is occurring?

--
Jeff Squyres
[email protected]

________________________________________
From: users <[email protected]> on behalf of Patrick Begou via users 
<[email protected]>
Sent: Thursday, June 16, 2022 3:21 AM
To: Open MPI Users
Cc: Patrick Begou
Subject: [OMPI users] OpenMPI and names of the nodes in a cluster

Hi all,

we are facing a serious problem with OpenMPI (4.0.2) that we have
deployed on a cluster. We do not manage this large cluster and the names
of the nodes do not agree with Internet standards for protocols: they
contain a "_" (underscore) character.
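
For reference, the hostname rules in question (RFC 952 / RFC 1123) allow only 
letters, digits and hyphens in each dot-separated label, with no leading or 
trailing hyphen and at most 63 characters per label.  A minimal bash sketch of 
that check; the function name and the sample host names are hypothetical, and 
the overall 253-character limit is not enforced:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the name follows the RFC 952 / RFC 1123
# hostname syntax (letters, digits, hyphens; labels separated by dots).
is_rfc_hostname() {
    # One label: starts and ends with a letter or digit, 1 to 63 characters.
    local label='[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?'
    local re="^${label}(\.${label})*$"
    [[ $1 =~ $re ]]
}

# Sample names (made up for illustration); the one with "_" is rejected.
for h in node01 node_01 compute-3.example.org; do
    if is_rfc_hostname "$h"; then
        echo "$h: ok"
    else
        echo "$h: not RFC-compliant"
    fi
done
```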

So OpenMPI complains about this and does not run.

I've tried to use IP addresses instead of host names in the host file without
any success.

Is there a known workaround for this? Asking the administrators to
change the node names on this large cluster may be difficult.

Thanks

Patrick