e network is welcome.
Patrick
On 30/09/2024 at 18:41, Patrick Begou via users wrote:
Hi Nathan
Thanks for this suggestion. I had understood that everything is now managed
by the UCX layer. Am I wrong?
These options do not seem to work with my OpenMPI 5.0.5 build. But
I've built OpenMPI on
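(A minimal way to check which transport layer Open MPI actually selects at
run time; "./a.out" below is just a placeholder for the application.)

  # show whether the UCX PML was compiled in
  ompi_info | grep -i ucx

  # force the UCX PML and print the selection decision at run time
  mpirun --mca pml ucx --mca pml_base_verbose 10 -np 2 ./a.out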
On Sep 30, 2024, at 10:18 AM, Patrick Begou via users
wrote:
Hi,
I'm working on refreshing an old cluster with Almalinux 9 (instead of
CentOS6 😕) and building a fresh OpenMPI 5.0.5 environment. I've reached
the step where OpenMPI begins to work with UCX 1.17 and PMIx 5.0.3 but
not totally. Nodes are using a QLogic QDR HBA with a managed QLogic
switch (
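(A rough sketch of what the configure line for such a build could look like;
the install prefixes are placeholders to adjust to your site.)

  ./configure --prefix=/opt/openmpi-5.0.5 \
              --with-ucx=/opt/ucx-1.17 \
              --with-pmix=/opt/pmix-5.0.3
  make -j 8 && make install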
On 16/06/2022 at 14:30, Jeff Squyres (jsquyres) wrote:
What exactly is the error that is occurring?
--
Jeff Squyres
jsquy...@cisco.com
From: users on behalf of Patrick Begou via
users
Sent: Thursday, June 16, 2022 3:21 AM
To: Open MPI Users
Cc: Patrick Begou
Subject: [OMPI users] OpenMPI and names of the nodes in a cluster
Hi all,
we are facing a serious problem with OpenMPI (4.0.2) that we have
deployed on a cluster. We do not manage this large cluster, and the names
of the nodes do not conform to the Internet hostname standards: they
contain a "_" (underscore) character.
So OpenMPI complains about this and d
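(One common workaround sketch, not taken from this thread: refer to the nodes
by IP address, or by an RFC-compliant alias, in the hostfile instead of the
underscore names; the addresses and slot counts below are placeholders.)

  # hostfile: avoid the non-compliant hostnames entirely
  192.168.1.10 slots=16
  192.168.1.11 slots=16

  mpirun --hostfile hostfile -np 32 ./my_app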
Are you using UCX or the legacy openib btl?
If the former, is it built with multi-threading support?
If the latter, I suggest you give UCX - built with multi-threading
support - a try and see how it goes.
Cheers,
Gilles
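(A sketch of what "UCX built with multi-threading support" can look like; the
install prefixes are placeholders.)

  # build UCX with multi-threading enabled
  ./configure --prefix=/opt/ucx --enable-mt && make -j 8 && make install

  # then point the Open MPI build at it
  ./configure --prefix=/opt/openmpi --with-ucx=/opt/ucx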
On Thu, Mar 24, 2022 at 5:43 PM Patrick Begou via users
wrote:
On 28/02/2022 at 17:56, Patrick Begou via users wrote:
Hi,
I am facing a performance problem with OpenMPI on my cluster. In some
situations my parallel code is really slow (the same binary running on a
different mesh).
To investigate, the Fortran code is built with the profiling option
(mpifort -p -O3) and launched on 91 cores.
One mon.out file per process is produced.
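(If the per-process mon.out files end up clobbering each other, one sketch,
assuming a small wrapper script, is to give each rank its own working
directory; the script and program names are illustrative.)

  #!/bin/bash
  # per_rank.sh: run each MPI rank in its own directory so the mon.out
  # files do not overwrite each other
  rank=${OMPI_COMM_WORLD_RANK:-0}
  mkdir -p prof_rank_${rank}
  cd prof_rank_${rank}
  exec "$@"

  # launched for example as:
  #   mpirun -np 91 ./per_rank.sh $PWD/my_code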
> Debian, so I can't be much
> more help
>
> If I had to guess, totally pulling junk from the air, there's probably
> something incompatible with PSM and OPA when running specifically on Debian
> (likely due to library versioning). I don't know how common that is, so
> not sure if it's supposed to stop at some point.
>
> I'm running RHEL7, gcc 10.1, OpenMPI 4.0.5rc2, with-ofi,
> without-{psm,ucx,verbs}
>
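(A guess at what that configure line might have looked like, expanded from the
shorthand above; treat it as illustrative, not the poster's actual command.)

  ./configure --with-ofi --without-psm --without-ucx --without-verbs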
> On Tue, Jan 26, 2021 at 3:44 PM Patrick Begou via users
> wrote:
> >
> >
MPI app that reproduces
> the problem? I can’t think of another way I can give you more help
> without being able to see what’s going on. It’s always possible
> there’s a bug in the PSM2 MTL but it would be surprising at this point.
>
> Sent from my iPad
>
>> On Jan 26, 20
Hi all,
I ran many tests today. I saw that an older 4.0.2 version of OpenMPI
packaged with Nix was running using openib. So I added the --with-verbs
option to set up this module.
What I can see now is that with:
mpirun -hostfile $OAR_NODEFILE --mca mtl psm -mca btl_openib_allow_ib true
- the
07 but expect 4007
but it fails too.
Patrick
On 25/01/2021 at 19:34, Ralph Castain via users wrote:
> I think you mean add "--mca mtl ofi" to the mpirun cmd line
>
>
>> On Jan 25, 2021, at 10:18 AM, Heinz, Michael William via users
>> wrote:
>>
>>
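(For reference, a sketch of how that suggestion might look on the command
line; the cm PML is the usual companion of an MTL, and the hostfile and
executable names are placeholders.)

  mpirun -hostfile $OAR_NODEFILE --mca pml cm --mca mtl ofi -np 16 ./my_code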
Hi Howard and Michael,
thanks for your feedback. I did not want to write a too-long mail with
non-pertinent information, so I just showed how the two different builds
give different results. I'm using a small test case based on my large
code, the same one used to show the memory leak with mpi_Alltoallv c
Hi,
I'm trying to deploy OpenMPI 4.0.5 on the university's supercomputer:
* Debian GNU/Linux 9 (stretch)
* Intel Corporation Omni-Path HFI Silicon 100 Series [discrete] (rev 11)
and for several days I have been hitting a bug (wrong results when using
MPI_Alltoallw) on this server when using Omni-Path.
Running
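(Not from the thread itself, but a common way to check whether the wrong
results are tied to the Omni-Path path is to pin the transport explicitly and
compare; the executable name is a placeholder.)

  # force the PSM2 MTL explicitly (Omni-Path)
  mpirun --mca pml cm --mca mtl psm2 -np 16 ./my_test

  # or bypass Omni-Path entirely and compare against plain TCP
  mpirun --mca pml ob1 --mca btl tcp,self,vader -np 16 ./my_test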