Re: [OMPI users] Another OpenMPI 5.0.1. Installation Failure

2024-04-21 Thread Gilles Gouaillardet via users
Hi, Is there any reason why you do not build the latest 5.0.2 package? Anyway, the issue could be related to an unknown filesystem. Do you get a meaningful error if you manually run /.../test/util/opal_path_nfs? If not, can you share the output of mount | cut -f3,5 -d' ' Cheers, Gilles

Re: [OMPI users] "MCW rank 0 is not bound (or bound to all available processors)" when running multiple jobs concurrently

2024-04-15 Thread Gilles Gouaillardet via users
test"? Though the warning might be ignored, SIGILL is definitely an issue. I encourage you to have your app dump a core in order to figure out where this is coming from Cheers, Gilles On Tue, Apr 16, 2024 at 5:20 AM Greg Samonds via users < users@lists.open-mpi.org> wrote: > H

Re: [OMPI users] Subject: Clarification about mpirun behavior in Slurm jobs

2024-02-24 Thread Gilles Gouaillardet via users
Christopher, I do not think Open MPI explicitly asks SLURM which cores have been assigned on each node. So if you are planning to run multiple jobs on the same node, your best bet is probably to have SLURM use cpusets. Cheers, Gilles On Sat, Feb 24, 2024 at 7:25 AM Christopher Daley via users

Re: [OMPI users] Seg error when using v5.0.1

2024-01-31 Thread Gilles Gouaillardet via users
, are you able to craft a reproducer that causes the crash? How many nodes and MPI tasks are needed in order to evidence the crash? Cheers, Gilles On Wed, Jan 31, 2024 at 10:09 PM afernandez via users < users@lists.open-mpi.org> wrote: > Hello Joseph, > Sorry for the delay but I didn

Re: [OMPI users] OpenMPI 5.0.1 Installation Failure

2024-01-26 Thread Gilles Gouaillardet via users
Hi, Please open a GitHub issue at https://github.com/open-mpi/ompi/issues and provide the requested information Cheers, Gilles On Sat, Jan 27, 2024 at 12:04 PM Kook Jin Noh via users < users@lists.open-mpi.org> wrote: > Hi, > > > > I’m installing OpenMPI 5.0.1 on Archli

Re: [OMPI users] Binding to thread 0

2023-09-08 Thread Gilles Gouaillardet via users
Luis, you can pass the --bind-to hwthread option in order to bind on the first thread of each core Cheers, Gilles On Fri, Sep 8, 2023 at 8:30 PM Luis Cebamanos via users < users@lists.open-mpi.org> wrote: > Hello, > > Up to now, I have been using numerous ways of bindin

Re: [OMPI users] MPI_Init_thread error

2023-07-25 Thread Gilles Gouaillardet via users
MPI was built with SLURM support (e.g. configure --with-slurm ...) and then srun --mpi=pmi2 ... Cheers, Gilles On Tue, Jul 25, 2023 at 5:07 PM Aziz Ogutlu via users < users@lists.open-mpi.org> wrote: > Hi there all, > We're using Slurm 21.08 on Redhat 7.9 HPC cluster with OpenMPI

Re: [OMPI users] libnuma.so error

2023-07-19 Thread Gilles Gouaillardet via users
, Gilles On Wed, Jul 19, 2023 at 11:36 PM Luis Cebamanos via users < users@lists.open-mpi.org> wrote: > Hello, > > I was wondering if anyone has ever seen the following runtime error: > > mpirun -np 32 ./hello > . > [LOG_CAT_SBGP] libnuma.so: cannot open share

Re: [OMPI users] OpenMPI crashes with TCP connection error

2023-06-16 Thread Gilles Gouaillardet via users
options. Cheers, Gilles On Sat, Jun 17, 2023 at 2:53 AM Mccall, Kurt E. (MSFC-EV41) via users < users@lists.open-mpi.org> wrote: > Joachim, > > > > Sorry to make you resort to divination. My sbatch command is as follows: > > > > sbatch --ntasks-per-node=24 --

Re: [OMPI users] Issue with Running MPI Job on CentOS 7

2023-05-31 Thread Gilles Gouaillardet via users
vendors and/or very different versions. Cheers, Gilles On Thu, Jun 1, 2023 at 10:27 AM 深空探测 via users wrote: > Hi all, > > I am writing to seek assistance regarding an issue I encountered while > running an MPI job on CentOS 7 virtual machine . > > To provide some context, I

Re: [OMPI users] psec warning when launching with srun

2023-05-20 Thread Gilles Gouaillardet via users
test -z "$with_pmix" || test "$with_pmix" = "yes") then : if test "$opal_external_pmix_version" != "3x" and replace the last line with if test $opal_external_pmix_version_major -lt 3 Cheers, Gilles On Sat, May 20, 2023 at 6:13 PM christo

Re: [OMPI users] Issue with unexpected IP address in OpenMPI

2023-03-27 Thread Gilles Gouaillardet via users
should see the expected ip of the second node. If not, there is NAT somewhere and that does not fly well with Open MPI Cheers, Gilles On 3/28/2023 8:53 AM, Todd Spencer via users wrote: OpenMPI Users, I hope this email finds you all well. I am writing to bring to your attention an issue

Re: [OMPI users] What is the best choice of pml and btl for intranode communication

2023-03-05 Thread Gilles Gouaillardet via users
node, you might be able to get the best performances by forcing pml/ob1. Bottom line, I think it is best for you to benchmark your application and pick the combination that leads to the best performances, and you are more than welcome to share your conclusions. Cheers, Gilles On Mon, Mar 6

Re: [OMPI users] Open MPI 4.0.3 outside as well as inside a SimpleFOAM container: step creation temporarily disabled, retrying Requested nodes are busy

2023-02-28 Thread Gilles Gouaillardet via users
instead of mpirun ... Cheers, Gilles On 3/1/2023 12:44 PM, Rob Kudyba via users wrote: Singularity 3.5.3 on RHEL 7 cluster w/ OpenMPI 4.0.3 lives inside a SimpleFOAM version 10 container. I've confirmed the OpenMPI versions are the same. Perhaps this is a question for Singularity users

Re: [OMPI users] ucx configuration

2023-01-11 Thread Gilles Gouaillardet via users
You can pick one test, make it standalone, and open an issue on GitHub. How does (vanilla) Open MPI compare to your vendor Open MPI based library? Cheers, Gilles On Wed, Jan 11, 2023 at 10:20 PM Dave Love via users < users@lists.open-mpi.org> wrote: > Gilles Gouaillardet via user

Re: [OMPI users] ucx configuration

2023-01-07 Thread Gilles Gouaillardet via users
Dave, If there is a bug you would like to report, please open an issue at https://github.com/open-mpi/ompi/issues and provide all the required information (in this case, it should also include the UCX library you are using and how it was obtained or built). Cheers, Gilles On Fri, Jan 6, 2023

Re: [OMPI users] Question about "mca" parameters

2022-11-29 Thread Gilles Gouaillardet via users
Hi, Simply add btl = tcp,self If the openib error message persists, try also adding osc_rdma_btls = ugni,uct,ucp or simply osc = ^rdma Cheers, Gilles On 11/29/2022 5:16 PM, Gestió Servidors via users wrote: Hi, If I run “mpirun --mca btl tcp,self --mca allow_ib 0 -n 12

Re: [OMPI users] CephFS and striping_factor

2022-11-28 Thread Gilles Gouaillardet via users
omponent can be used as an inspiration. I cannot commit to do this, but if you are willing to take a crack at it, I can create such a component so you can go directly to implementing the callback without spending too much time on some Open MPI internals (e.g. component creation). Cheers, Gi

Re: [OMPI users] Run on dual-socket system

2022-11-26 Thread Gilles Gouaillardet via users
it is bound. Cheers, Gilles On Sat, Nov 26, 2022 at 5:38 PM Arham Amouei via users < users@lists.open-mpi.org> wrote: > Hi > > If I run a code with > > mpirun -n 28 ./code > > Is it guaranteed that Open MPI and/or OS give equal number of processes to > each socket

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-14 Thread Gilles Gouaillardet via users
FWIW, I cannot reproduce this error. What is the netmask on both hosts? Cheers, Gilles On 11/15/2022 1:32 PM, timesir via users wrote: *(py3.9) ➜  /share   mpirun -n 2 --machinefile hosts --mca rmaps_base_verbose 100 --mca ras_base_verbose 100  which mpirun* [computer01:39342] mca

Re: [OMPI users] [OMPI devel] There are not enough slots available in the system to satisfy the 2, slots that were requested by the application

2022-11-13 Thread Gilles Gouaillardet via users
There is a typo in your command line. You should use --mca (minus minus) instead of -mca Also, you can try --machinefile instead of -machinefile Cheers, Gilles There are not enough slots available in the system to satisfy the 2 slots that were requested by the application: –mca On Mon, Nov
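
For reference, a corrected invocation along the lines suggested above might look like this (the hostfile name, MCA parameter, and executable are only placeholders, not taken from the original thread); note the double minus in front of both options:

    mpirun --machinefile hosts --mca <param> <value> -np 2 ./my_app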

Re: [OMPI users] lots of undefined symbols compiling a hello-world

2022-11-05 Thread Gilles Gouaillardet via users
Chris, Did you double check libopen-rte.so.40 and libopen-pal.so.40 are installed in /mnt/software/o/openmpi/4.1.4-ct-test/lib? If they are not present, it means your install is busted and you should try to reinstall it. Cheers, Gilles On Sat, Nov 5, 2022 at 3:42 AM Chris Taylor via users

Re: [OMPI users] ifort and openmpi

2022-09-15 Thread Gilles Gouaillardet via users
to support Intel Fortran on OSX. I am confident a Pull Request that does fix this issue will be considered for inclusion in future Open MPI releases. Cheers, Gilles On Fri, Sep 16, 2022 at 11:20 AM Volker Blum via users < users@lists.open-mpi.org> wrote: > Hi all, > > This issue

Re: [OMPI users] Hardware topology influence

2022-09-13 Thread Gilles Gouaillardet via users
in a machine file (e.g. mpirun -machinefile ...) or the command line (e.g. mpirun --hosts host0:96,host1:96 ...) c) if none of the above is set, the number of detected cores on the system Cheers, Gilles On Tue, Sep 13, 2022 at 9:23 PM Lucas Chaloyard via users < users@lists.o

Re: [OMPI users] Using MPI in Toro unikernel

2022-07-24 Thread Gilles Gouaillardet via users
slower. Cheers, Gilles On Sun, Jul 24, 2022 at 8:11 PM Matias Ezequiel Vara Larsen via users < users@lists.open-mpi.org> wrote: > Hello everyone, > > I have started to play with MPI and unikernels and I have recently > implemented a minimal set of MPI APIs on top of Toro Uni

Re: [OMPI users] Intercommunicator issue (any standard about communicator?)

2022-06-24 Thread Gilles Gouaillardet via users
to the wrong mailing list Cheers, Gilles On Fri, Jun 24, 2022 at 9:06 PM Guillaume De Nayer via users < users@lists.open-mpi.org> wrote: > Hi Gilles, > > MPI_COMM_WORLD is positive (4400). > > In a short code I wrote I have something like that: > > MPI_Comm_dup(MPI_COMM

Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-24 Thread Gilles Gouaillardet via users
Sorry if I did not make my intent clear. I was basically suggesting to hack the Open MPI and PMIx wrappers to hostname() and remove the problematic underscores to make the regx components a happy panda again. Cheers, Gilles - Original Message - > I think the files suggested by Gil

Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-16 Thread Gilles Gouaillardet via users
Patrick, you will likely also need to apply the same hack to opal_net_get_hostname() in opal/util/net.c Cheers, Gilles On Thu, Jun 16, 2022 at 7:30 PM Gilles Gouaillardet < gilles.gouaillar...@gmail.com> wrote: > Patrick, > > I am not sure Open MPI can do that out of the

Re: [OMPI users] OpenMPI and names of the nodes in a cluster

2022-06-16 Thread Gilles Gouaillardet via users
Patrick, I am not sure Open MPI can do that out of the box. Maybe hacking pmix_net_get_hostname() in opal/mca/pmix/pmix3x/pmix/src/util/net.c can do the trick. Cheers, Gilles On Thu, Jun 16, 2022 at 4:24 PM Patrick Begou via users < users@lists.open-mpi.org> wrote: > Hi al

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Gilles Gouaillardet via users
not match the hostname Cheers, Gilles On Thu, May 5, 2022 at 5:06 AM Scott Sayres via users < users@lists.open-mpi.org> wrote: > foo.sh is executable, again hangs without output. > I command c x2 to return to shell, then > > ps auxwww | egrep 'mpirun|foo.sh' > output shown

Re: [OMPI users] mpi-test-suite shows errors on openmpi 4.1.x

2022-05-03 Thread Gilles Gouaillardet via users
command with mpirun --mca pml ob1 --mca btl tcp,self ... Cheers, Gilles On Tue, May 3, 2022 at 7:08 PM Alois Schlögl via users < users@lists.open-mpi.org> wrote: > > Within our cluster (debian10/slurm16, debian11/slurm20), with > infiniband, and we have several instances of op

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Gilles Gouaillardet via users
You can first double check you call MPI_Init_thread(..., MPI_THREAD_MULTIPLE, ...) and that the provided level is MPI_THREAD_MULTIPLE as you requested. Cheers, Gilles On Fri, Apr 22, 2022, 21:45 Angel de Vicente via users < users@lists.open-mpi.org> wrote: > Hello, > > I'm runni
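
A minimal C sketch of that check (an illustration, not code from the original thread):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE) {
            /* the library granted less than requested, e.g. MPI_THREAD_SERIALIZED */
            fprintf(stderr, "MPI_THREAD_MULTIPLE not provided (got %d)\n", provided);
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        /* ... hybrid MPI + OpenMP work ... */
        MPI_Finalize();
        return 0;
    }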

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-21 Thread Gilles Gouaillardet via users
to the Intel compiler, I strongly encourage you to run on Intel/AMD processors. Otherwise, use a native compiler for aarch64, and in this case, brew is not a bad option. Cheers, Gilles On Thu, Apr 21, 2022 at 6:36 PM Cici Feng via users < users@lists.open-mpi.org> wrote: > Hi there, > &

Re: [OMPI users] Is there a MPI routine that returns the value of "npernode" being used?

2022-04-02 Thread Gilles Gouaillardet via users
Ernesto, Not directly. But you can use MPI_Comm_split_type(..., MPI_COMM_TYPE_SHARED, ...) and then MPI_Comm_size(...) on the "returned" communicator. Cheers, Gilles On Sun, Apr 3, 2022 at 5:52 AM Ernesto Prudencio via users < users@lists.open-mpi.org> wrote: > Than
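
A short C sketch of that approach (illustration only): it reports the number of ranks sharing the local node, which matches the npernode value when ranks are spread evenly across nodes.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Comm node_comm;
        int on_node;

        MPI_Init(&argc, &argv);
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &node_comm);
        MPI_Comm_size(node_comm, &on_node);   /* ranks sharing this node */
        printf("ranks on this node: %d\n", on_node);
        MPI_Comm_free(&node_comm);
        MPI_Finalize();
        return 0;
    }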

Re: [OMPI users] 101 question on MPI_Bcast()

2022-04-02 Thread Gilles Gouaillardet via users
Ernesto, MPI_Bcast() has no barrier semantic. It means the root rank can return after the message is sent (kind of eager send) and before it is received by other ranks. Cheers, Gilles On Sat, Apr 2, 2022, 09:33 Ernesto Prudencio via users < users@lists.open-mpi.org> wrote: &g
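
A small C illustration of the point (not from the original thread): the root may leave MPI_Bcast before any receiver has entered it, so an explicit barrier is needed only if synchronization is actually required.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) value = 42;

        /* the root may return here before other ranks have received the data */
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

        /* add this only if all ranks really must be synchronized at this point */
        MPI_Barrier(MPI_COMM_WORLD);

        printf("rank %d has value %d\n", rank, value);
        MPI_Finalize();
        return 0;
    }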

Re: [OMPI users] Need help for troubleshooting OpenMPI performances

2022-03-24 Thread Gilles Gouaillardet via users
suggest you give UCX - built with multi threading support - a try and see how it goes Cheers, Gilles On Thu, Mar 24, 2022 at 5:43 PM Patrick Begou via users < users@lists.open-mpi.org> wrote: > Le 28/02/2022 à 17:56, Patrick Begou via users a écrit : > > Hi, > > > > I

Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

2022-03-14 Thread Gilles Gouaillardet via users
in order to exclude the coll/tuned component: mpirun --mca coll ^tuned ... Cheers, Gilles On Mon, Mar 14, 2022 at 5:37 PM Ernesto Prudencio via users < users@lists.open-mpi.org> wrote: > Thanks for the hint on “mpirun ldd”. I will try it. The problem is that I > am running

Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

2022-03-14 Thread Gilles Gouaillardet via users
under the hood with matching but different signatures. Cheers, Gilles On Mon, Mar 14, 2022 at 4:09 PM Ernesto Prudencio via users < users@lists.open-mpi.org> wrote: > Thanks, Gilles. > > > > In the case of the application I am working on, all ranks call MPI with > th

Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

2022-03-14 Thread Gilles Gouaillardet via users
PETSc doing something different but legit that evidences a bug in Open MPI. If you have time, you can also try - Intel compilers - MPICH (or a derivative such as Intel MPI) - PETSc 3.16.5 => a success would strongly point to Open MPI Cheers, Gilles On Mon, Mar 14, 2022 at 2:56 PM Ernesto

Re: [OMPI users] Trouble compiling OpenMPI with Infiniband support

2022-02-17 Thread Gilles Gouaillardet via users
Angel, Infiniband detection likely fails before checking expanded verbs. Please compress and post the full configure output Cheers, Gilles On Fri, Feb 18, 2022 at 12:02 AM Angel de Vicente via users < users@lists.open-mpi.org> wrote: > Hi, > > I'm trying to compile the latest

Re: [OMPI users] libmpi_mpifh.so.40 - error

2022-01-30 Thread Gilles Gouaillardet via users
Hari, What does ldd solver.exe (or whatever your clever exe file is called) reports? Cheers, Gilles On Mon, Jan 31, 2022 at 2:09 PM Hari Devaraj via users < users@lists.open-mpi.org> wrote: > Hello, > > I am trying to run a FEA solver exe file. > I get this error messag

Re: [OMPI users] RES: OpenMPI - Intel MPI

2022-01-27 Thread Gilles Gouaillardet via users
xotic" interconnects (that might not be supported natively by Open MPI or abstraction layers) and/or with an uncommon topology (for which collective communications are not fully optimized by Open MPI). In the latter case, using the system/vendor MPI is the best option performance wise. Cheers, Gill

Re: [OMPI users] RES: OpenMPI - Intel MPI

2022-01-26 Thread Gilles Gouaillardet via users
Fair enough Ralph! I was implicitly assuming a "build once / run everywhere" use case, my bad for not making my assumption clear. If the container is built to run on a specific host, there are indeed other options to achieve near native performances. Cheers, Gilles On Thu, Jan 27,

Re: [OMPI users] RES: OpenMPI - Intel MPI

2022-01-26 Thread Gilles Gouaillardet via users
ot; at run time with Intel MPI (MPICH based and ABI compatible). Cheers, Gilles On Thu, Jan 27, 2022 at 1:07 PM Brian Dobbins via users < users@lists.open-mpi.org> wrote: > > Hi Ralph, > > Thanks for the explanation - in hindsight, that makes perfect sense, > since e

Re: [OMPI users] Creating An MPI Job from Procs Launched by a Different Launcher

2022-01-25 Thread Gilles Gouaillardet via users
so your custom launcher would "only" have to implement a limited number of callbacks. Cheers, Gilles - Original Message - Any pointers? On Tue, Jan 25, 2022 at 12:55 PM Ralph Castain via users wrote: Short answer is yes, but it is a bit complicated to do.

Re: [OMPI users] Open MPI + Slurm + lmod

2022-01-25 Thread Gilles Gouaillardet via users
Could this be the root cause? What is the PMIx library version used by SLURM? Ralph, do you see something wrong on why Open MPI and SLURM cannot communicate via PMIx? Cheers, Gilles On Tue, Jan 25, 2022 at 5:47 PM Matthias Leopold < matthias.leop...@meduniwien.ac.at> wrote: > Hi Gilles

Re: [OMPI users] Open MPI + Slurm + lmod

2022-01-24 Thread Gilles Gouaillardet via users
: srun --mpi=list will list the PMI flavors provided by SLURM a) if PMIx is not supported, contact your sysadmin and ask for it b) if PMIx is supported but is not the default, ask for it, for example with srun --mpi=pmix_v3 ... Cheers, Gilles On Tue, Jan 25, 2022 at 12:30 AM

Re: [OMPI users] unexpected behavior when combining MPI_Gather and MPI_Type_vector

2021-12-16 Thread Gilles Gouaillardet via users
rank, n, m, v_glob); and also resize rtype so the second element starts at v_glob[3][0] => upper bound = (3*sizeof(int)) By the way, since this question is not Open MPI specific, sites such as Stack Overflow are a better fit. Cheers, Gilles On Thu, Dec 16, 2021 at 6:46 PM Gilles

Re: [OMPI users] unexpected behavior when combining MPI_Gather and MPI_Type_vector

2021-12-16 Thread Gilles Gouaillardet via users
Jonas, Assuming v_glob is what you expect, you will need to `MPI_Type_create_resized()` the received type so the block received from process 1 will be placed at the right position (v_glob[3][1] => upper bound = ((4*3+1) * sizeof(int)) Cheers, Gilles On Thu, Dec 16, 2021 at 6:33 PM Jo
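
A self-contained C sketch of the resize idea with made-up dimensions (each rank contributes one column of n ints and the root gathers them side by side into v_glob); it only illustrates MPI_Type_create_resized and is not the original reproducer from the thread:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        const int n = 4;                       /* rows per column (made-up value) */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        int *col = malloc(n * sizeof(int));    /* the column owned by this rank */
        for (int i = 0; i < n; i++) col[i] = rank * 100 + i;

        int *v_glob = NULL;
        if (rank == 0) v_glob = malloc(n * nprocs * sizeof(int));

        /* one column of v_glob: n ints with a stride of nprocs ints ... */
        MPI_Datatype coltype, col_resized;
        MPI_Type_vector(n, 1, nprocs, MPI_INT, &coltype);
        /* ... resized so the block from rank i starts sizeof(int) after rank i-1 */
        MPI_Type_create_resized(coltype, 0, sizeof(int), &col_resized);
        MPI_Type_commit(&col_resized);

        MPI_Gather(col, n, MPI_INT, v_glob, 1, col_resized, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < nprocs; j++) printf("%5d", v_glob[i * nprocs + j]);
                printf("\n");
            }
            free(v_glob);
        }
        MPI_Type_free(&col_resized);
        MPI_Type_free(&coltype);
        free(col);
        MPI_Finalize();
        return 0;
    }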

Re: [OMPI users] Reserving slots and filling them after job launch with MPI_Comm_spawn

2021-11-03 Thread Gilles Gouaillardet via users
... Cheers, Gilles On Wed, Nov 3, 2021 at 6:06 PM Mccall, Kurt E. (MSFC-EV41) via users < users@lists.open-mpi.org> wrote: > I’m using OpenMPI 4.1.1 compiled with Nvidia’s nvc++ 20.9, and compiled > with Torque support. > > > > I want to reserve multiple slots on each node,

Re: [OMPI users] Newbie Question.

2021-11-01 Thread Gilles Gouaillardet via users
Hi Ben, have you tried export OMPI_MCA_common_ucx_opal_mem_hooks=1 Cheers, Gilles On Mon, Nov 1, 2021 at 9:22 PM bend linux4ms.net via users < users@lists.open-mpi.org> wrote: > Ok, I a am newbie supporting the a HPC project and learning about MPI. > > I have the fo

Re: [OMPI users] Cannot build working Open MPI 4.1.1 with NAG Fortran/clang on macOS (but I could before!)

2021-10-28 Thread Gilles Gouaillardet via users
ot; simply manually remove "-dynamiclib" here and see if it helps Cheers, Gilles On Fri, Oct 29, 2021 at 12:30 AM Matt Thompson via users < users@lists.open-mpi.org> wrote: > Dear Open MPI Gurus, > > This is a...confusing one. For some reason, I cannot build a working

Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, build hints?

2021-09-30 Thread Gilles Gouaillardet via users
is not happy about it. If you can have all these 3 macros defined by the nvhpc compilers, that would be great! Otherwise, I will let George decide if and how Open MPI addresses this issue Cheers, Gilles On Thu, Sep 30, 2021 at 11:33 PM Carl Ponder via users < users@lists.open-mpi.org>

Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, buildhints?

2021-09-30 Thread Gilles Gouaillardet via users
Ray, there is a typo, the configure option is --enable-mca-no-build=op-avx Cheers, Gilles - Original Message - Added --enable-mca-no-build=op-avx to the configure line. Still dies in the same place. CCLD mca_op_avx.la ./.libs/liblocal_ops_avx512

Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, build hints?

2021-09-29 Thread Gilles Gouaillardet via users
defined by at least GNU and LLVM compilers), I will take a look at it when I get some time (you won't face this issue if you use GNU compilers for C/C++) Cheers, Gilles On Thu, Sep 30, 2021 at 2:31 AM Ray Muno via users wrote: > > Tried this > > configure CC='nvc -fPIC' CXX='nvc

Re: [OMPI users] (Fedora 34, x86_64-pc-linux-gnu, openmpi-4.1.1.tar.gz): PML ucx cannot be selected

2021-09-13 Thread Gilles Gouaillardet via users
Jorge, I am not that familiar with UCX, but I hope that will help: The changes I mentioned were introduced by https://github.com/open-mpi/ompi/pull/8549 I suspect mpirun --mca pml_ucx_tls any --mca pml_ucx_devices --mca pml ucx ... will do what you expect Cheers, Gilles On Mon, Sep 13

Re: [OMPI users] cross-compilation documentation seems to be missing

2021-09-07 Thread Gilles Gouaillardet via users
. Then you can grep ^ompi_cv_fortran_ config.cache to generate the file you can pass to --with-cross when cross compiling on your x86 system Cheers, Gilles On 9/7/2021 7:35 PM, Jeff Hammond via users wrote: I am attempting to cross-compile Open-MPI for RISC-V on an x86 system.  I get this error
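
Pieced together from the advice above, the workflow might look roughly like this (the target triplet and file names are made up; only the grep pattern and the --with-cross option come from the message, and the first step assumes you can run configure once natively or under emulation):

    # on (or emulating) a RISC-V system, run configure once with a cache file
    ./configure -C ...
    # keep only the Fortran results
    grep '^ompi_cv_fortran_' config.cache > fortran-values.cache
    # then, on the x86 build host
    ./configure --host=riscv64-unknown-linux-gnu --build=x86_64-pc-linux-gnu \
                --with-cross=fortran-values.cache ...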

Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-26 Thread Gilles Gouaillardet via users
. Cheers, Gilles On 8/26/2021 2:42 PM, Broi, Franco via users wrote: Thanks Gilles but no go... /usr/lib64/openmpi/bin/mpirun -c 1 --mca pml ^ucx /home/franco/spawn_example 47 I'm the parent on fsc07 Starting 47 children   Process 1 ([[48649,2],32]) is on host: fsc08   Process 2 ([[48649,1

Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-25 Thread Gilles Gouaillardet via users
... and see if it helps Cheers, Gilles On Thu, Aug 26, 2021 at 2:13 PM Broi, Franco via users < users@lists.open-mpi.org> wrote: > Hi, > > I have 2 example progs that I found on the internet (attached) that > illustrate a problem we are having launching multiple node jobs wi

Re: [OMPI users] vectorized reductions

2021-07-20 Thread Gilles Gouaillardet via users
to evaluate them and report the performance numbers. On 7/20/2021 11:00 PM, Dave Love via users wrote: Gilles Gouaillardet via users writes: One motivation is packaging: a single Open MPI implementation has to be built, that can run on older x86 processors (supporting only SSE) and the latest

Re: [OMPI users] vectorized reductions

2021-07-19 Thread Gilles Gouaillardet via users
One motivation is packaging: a single Open MPI implementation has to be built, that can run on older x86 processors (supporting only SSE) and the latest ones (supporting AVX512). The op/avx component will select at runtime the most efficient implementation for vectorized reductions. On Mon, Jul

Re: [OMPI users] how to suppress "libibverbs: Warning: couldn't load driver ..." messages?

2021-06-23 Thread Gilles Gouaillardet via users
Hi Jeff, Assuming you did **not** explicitly configure Open MPI with --disable-dlopen, you can try mpirun --mca pml ob1 --mca btl vader,self ... Cheers, Gilles On Thu, Jun 24, 2021 at 5:08 AM Jeff Hammond via users < users@lists.open-mpi.org> wrote: > I am running on a single no

Re: [OMPI users] (Fedora 34, x86_64-pc-linux-gnu, openmpi-4.1.1.tar.gz): PML ucx cannot be selected

2021-05-28 Thread Gilles Gouaillardet via users
anymore. if you really want to use pml/ucx on your notebook, you need to manually re-enable these providers. That being said, your best choice here is really not to force any pml, and let Open MPI use pml/ob1 (that has support for both TCP and shared memory) Cheers, Gilles On Sat, May 29, 2021

Re: [OMPI users] [EXTERNAL] Linker errors in Fedora 34 Docker container

2021-05-25 Thread Gilles Gouaillardet via users
should not happen if building from an official tarball though. Cheers, Gilles - Original Message - Hi John, I don’t think an external dependency is going to fix this. In your build area, do you see any .lo files in opal/util/keyval ? Which compiler are you using

Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04?

2021-04-08 Thread Gilles Gouaillardet via users
of a compiler bug. Anyway, I issued https://github.com/open-mpi/ompi/pull/8789 and asked for a review. Cheers, Gilles - Original Message - > Dear Gilles, > As per your suggestion, I tried the inline patch as discussed in https://github.com/open-mpi/ompi/pull/8622#issuec

Re: [OMPI users] Building Open-MPI with Intel C

2021-04-07 Thread Gilles Gouaillardet via users
). IIRC, there is also an option in the Intel compiler to statically link to the runtime. Cheers, Gilles On Wed, Apr 7, 2021 at 9:00 AM Heinz, Michael William via users wrote: > > I’m having a heck of a time building OMPI with Intel C. Compilation goes > fine, installation goes fine, compi

Re: [OMPI users] HWLOC icc error

2021-03-23 Thread Gilles Gouaillardet via users
Luis, this file is never compiled when an external hwloc is used. Please open a github issue and include all the required information Cheers, Gilles On Tue, Mar 23, 2021 at 5:44 PM Luis Cebamanos via users wrote: > > Hello, > > Compiling OpenMPI 4.0.5 with Intel 2020 I

Re: [OMPI users] [External] Help with MPI and macOS Firewall

2021-03-18 Thread Gilles Gouaillardet via users
Matt, you can either mpirun --mca btl self,vader ... or export OMPI_MCA_btl=self,vader mpirun ... you may also add btl = self,vader in your /etc/openmpi-mca-params.conf and then simply mpirun ... Cheers, Gilles On Fri, Mar 19, 2021 at 5:44 AM Matt Thompson via users wrote: > > Pr

Re: [OMPI users] config: gfortran: "could not run a simple Fortran program"

2021-03-07 Thread Gilles Gouaillardet via users
Anthony, Did you make sure you can compile a simple fortran program with gfortran? and gcc? Please compress and attach both openmpi-config.out and config.log, so we can diagnose the issue. Cheers, Gilles On Mon, Mar 8, 2021 at 6:48 AM Anthony Rollett via users wrote: > > I am

Re: [OMPI users] Stable and performant openMPI version for Ubuntu20.04 ?

2021-03-04 Thread Gilles Gouaillardet via users
On top of XPMEM, try to also force btl/vader with mpirun --mca pml ob1 --mca btl vader,self, ... On Fri, Mar 5, 2021 at 8:37 AM Nathan Hjelm via users wrote: > > I would run the v4.x series and install xpmem if you can > (http://github.com/hjelmn/xpmem). You will need to build with >

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
yes, you need to (re)build Open MPI from source in order to try this trick. On 2/26/2021 3:55 PM, LINUS FERNANDES via users wrote: No change. What do you mean by running configure? Are you expecting me to build OpenMPI from source? On Fri, 26 Feb 2021, 11:16 Gilles Gouaillardet via users

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
>>> >>>> Errno==13 is EACCESS, which generically translates to "permission denied". >>>> Since you're running as root, this suggests that something outside of >>>> your local environment (e.g., outside of that immediate layer of

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
s which I > obviously can't on Termux since it doesn't support OpenJDK. > > On Thu, 25 Feb 2021, 13:37 Gilles Gouaillardet via users, > wrote: >> >> Is SELinux running on ArchLinux under Termux? >> >> On 2/25/2021 4:36 PM, LINUS FERNANDES via users wrote: >> &g

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-25 Thread Gilles Gouaillardet via users
Is SELinux running on ArchLinux under Termux? On 2/25/2021 4:36 PM, LINUS FERNANDES via users wrote: Yes, I did not receive this in my inbox since I set to receive digest. ifconfig output: dummy0: flags=195 mtu 1500 inet6 fe80::38a0:1bff:fe81:d4f5 prefixlen 64 scopeid 0x20

Re: [OMPI users] MPI executable fails on ArchLinux on Termux

2021-02-24 Thread Gilles Gouaillardet via users
Can you run ifconfig or ip addr in both Termux and ArchLinux for Termux? On 2/25/2021 2:00 PM, LINUS FERNANDES via users wrote: Why do I see the following error messages when executing |mpirun| on ArchLinux for Termux? The same program executes on Termux without any glitches.

Re: [OMPI users] weird mpi error report: Type mismatch between arguments

2021-02-17 Thread Gilles Gouaillardet via users
Diego, IIRC, you now have to build your gfortran 10 apps with -fallow-argument-mismatch Cheers, Gilles - Original Message - Dear OPENMPI users, i'd like to notify you a strange issue that arised right after installing a new up-to-date version of Linux (Kubuntu 20.10, with gcc
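
For example (an illustration only, assuming the application is built with the Open MPI Fortran wrapper):

    mpifort -fallow-argument-mismatch -o my_app my_app.f90

For configure-based builds, the equivalent is typically passing the flag via FCFLAGS.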

Re: [OMPI users] GROMACS with openmpi

2021-02-11 Thread Gilles Gouaillardet via users
This is not an Open MPI question, and hence not a fit for this mailing list. But here we go: first, try cmake -DGMX_MPI=ON ... if it fails, try cmake -DGMX_MPI=ON -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx . .. Cheers, Gilles - Original Message - Hi, MPI

Re: [OMPI users] OpenMPI 4.1.0 misidentifies x86 capabilities

2021-02-10 Thread Gilles Gouaillardet via users
containing AVX512 code. That being said, several changes were made to the op/avx component, so if you are experiencing some crashes, I do invite you to give a try to the latest nightly snapshot for the v4.1.x branch. Cheers, Gilles On Wed, Feb 10, 2021 at 10:43 PM Max R. Dechantsreiter via users wrote

Re: [OMPI users] OMPI 4.1 in Cygwin packages?

2021-02-04 Thread Gilles Gouaillardet via users
of master and worker? Cheers, Gilles On Fri, Feb 5, 2021 at 5:59 AM Martín Morales via users wrote: > > Hello all, > > > > Gilles, unfortunately, the result is the same. Attached the log you ask me. > > > > Jeff, some time ago I tried with OMPI 4.1.0 (Lin

Re: [OMPI users] OMPI 4.1 in Cygwin packages?

2021-02-04 Thread Gilles Gouaillardet via users
btl_tcp_base_verbose 20 ... and then compress and post the logs so we can have a look Cheers, Gilles On Thu, Feb 4, 2021 at 9:33 PM Martín Morales via users wrote: > > Hi Marcos, > > > > Yes, I have a problem with spawning to a “worker” host (on localhost, works). > There are just

Re: [OMPI users] Debugging a crash

2021-01-29 Thread Gilles Gouaillardet via users
(or other filesystem if you use a non standard $TMPDIR Cheers, Gilles On Fri, Jan 29, 2021 at 10:50 PM Diego Zuccato via users wrote: > > Hello all. > > I'm having a problem with a job: if it gets scheduled on a specific node > of our cluster, it f

Re: [OMPI users] Error with building OMPI with PGI

2021-01-19 Thread Gilles Gouaillardet via users
/.../libmpi_mpifh.so | grep igatherv and confirm the symbol does indeed exists Cheers, Gilles On Tue, Jan 19, 2021 at 7:39 PM Passant A. Hafez via users wrote: > > Hello Gus, > > > Thanks for your reply. > > Yes I've read multiple threads for very old versions of OMPI and PGI before >

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-18 Thread Gilles Gouaillardet via users
hold for this kind of BS. However, I have a much lower threshold for your gross mischaracterization of the Open MPI community, its values, and how the work gets done. Cheers, Gilles

Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-11 Thread Gilles Gouaillardet via users
Daniel, the test works in my environment (1 node, 32 GB memory) with all the mentioned parameters. Did you check the memory usage on your nodes and make sure the oom killer did not shoot any process? Cheers, Gilles On Tue, Jan 12, 2021 at 1:48 AM Daniel Torres via users wrote: > &
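
A quick way to check for that on each node (standard Linux tooling, nothing Open MPI specific):

    dmesg -T | grep -i 'out of memory'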

Re: [OMPI users] Confusing behaviour of compiler wrappers

2021-01-09 Thread Gilles Gouaillardet via users
, Gilles On Sat, Jan 9, 2021 at 8:40 AM Sajid Ali via users wrote: > > Hi OpenMPI-community, > > This is a cross post from the following spack issue : > https://github.com/spack/spack/issues/20756 > > In brief, when I install openmpi@4.1.0 with ucx and xpmem fabrics, the > b

Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-08 Thread Gilles Gouaillardet via users
Daniel, Can you please post the full error message and share a reproducer for this issue? Cheers, Gilles On Fri, Jan 8, 2021 at 10:25 PM Daniel Torres via users wrote: > > Hi all. > > Actually I'm implementing an algorithm that creates a process grid and > divides it into

Re: [OMPI users] MPMD hostfile: executables on same hosts

2020-12-21 Thread Gilles Gouaillardet via users
Vineet, probably *not* what you expect, but I guess you can try $ cat host-file host1 slots=3 host2 slots=3 host3 slots=3 $ mpirun -hostfile host-file -np 2 ./EXE1 : -np 1 ./EXE2 : -np 2 ./EXE1 : -np 1 ./EXE2 : -np 2 ./EXE1 : -np 1 ./EXE2 Cheers, Gilles On Mon, Dec 21, 2020 at 10:26 PM

Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Gilles Gouaillardet via users
Hi Patrick, Glad to hear you are now able to move forward. Please keep in mind this is not a fix but a temporary workaround. At first glance, I did not spot any issue in the current code. It turned out that the memory leak disappeared when doing things differently Cheers, Gilles On Mon, Dec

Re: [OMPI users] MPI_type_free question

2020-12-10 Thread Gilles Gouaillardet via users
) but not with the v3.1.x branch   (this suggests there could be an error in the latest Open MPI ... or in the code)  - the attached patch seems to have a positive effect, can you please give it a try? Cheers, Gilles On 12/7/2020 6:15 PM, Patrick Bégou via users wrote: Hi, I've written a small piece

Re: [OMPI users] MPI_type_free question

2020-12-04 Thread Gilles Gouaillardet via users
10 --mca mtl_base_verbose 10 --mca btl_base_verbose 10 ... will point you to the component(s) used. The output is pretty verbose, so feel free to compress and post it if you cannot decipher it Cheers, Gilles On 12/4/2020 4:32 PM, Patrick Bégou via users wrote: Hi George and Gilles

Re: [OMPI users] MPI_type_free question

2020-12-03 Thread Gilles Gouaillardet via users
will greatly help us debugging this issue. Cheers, Gilles On 12/4/2020 7:20 AM, George Bosilca via users wrote: Patrick, I'm afraid there is no simple way to check this. The main reason being that OMPI use handles for MPI objects, and these handles are not tracked by the library

Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Gilles Gouaillardet via users
will be very much appreciated in order to improve ompio Cheers, Gilles On Thu, Dec 3, 2020 at 6:05 PM Patrick Bégou via users wrote: > > Thanks Gilles, > > this is the solution. > I will set OMPI_MCA_io=^ompio automatically when loading the parallel > hdf5 module on the cluster.

Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Gilles Gouaillardet via users
Patrick, In recent Open MPI releases, the default component for MPI-IO is ompio (and no more romio) unless the file is on a Lustre filesystem. You can force romio with mpirun --mca io ^ompio ... Cheers, Gilles On 12/3/2020 4:20 PM, Patrick Bégou via users wrote: Hi, I'm using

Re: [OMPI users] Unable to run complicated MPI Program

2020-11-28 Thread Gilles Gouaillardet via users
Dean, That typically occurs when some nodes have multiple interfaces, and several nodes have a similar IP on a private/unused interface. I suggest you explicitly restrict the interface Open MPI should be using. For example, you can mpirun --mca btl_tcp_if_include eth0 ... Cheers, Gilles

Re: [OMPI users] 4.0.5 on Linux Pop!_OS

2020-11-07 Thread Gilles Gouaillardet via users
but you should first ask yourself if you really want to run 12 MPI tasks on your machine. Cheers, Gilles On Sun, Nov 8, 2020 at 11:14 AM Paul Cizmas via users wrote: > > Hello: > > I just installed OpenMPI 4.0.5 on a Linux machine running Pop!_OS (made by > System76). The worksta

Re: [OMPI users] ompe support for filesystems

2020-10-31 Thread Gilles Gouaillardet via users
ll Request to get some help fixing the missing bits. Cheers, Gilles On Sun, Nov 1, 2020 at 12:18 PM Ognen Duzlevski via users wrote: > > Hello! > > If I wanted to support a specific filesystem in open mpi, how is this > done? What code in the source tree does it? > > Thanks! > Ognen

Re: [OMPI users] Anyone try building openmpi 4.0.5 w/ llvm 11

2020-10-22 Thread Gilles Gouaillardet via users
Alan, thanks for the report, I addressed this issue in https://github.com/open-mpi/ompi/pull/8116 As a temporary workaround, you can apply the attached patch. FWIW, f18 (shipped with LLVM 11.0.0) is still in development and uses gfortran under the hood. Cheers, Gilles On Wed, Oct 21, 2020

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Gilles Gouaillardet via users
Hi Jorge, If a firewall is running on your nodes, I suggest you disable it and try again Cheers, Gilles On Wed, Oct 21, 2020 at 5:50 AM Jorge SILVA via users wrote: > > Hello, > > I installed kubuntu20.4.1 with openmpi 4.0.3-0ubuntu in two different > computers in the standard

Re: [OMPI users] Issue with shared memory arrays in Fortran

2020-08-24 Thread Gilles Gouaillardet via users
ucx but manually change the bcast algo mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 1 ... /* you can replace the bcast algorithm with any value between 1 and 7 included */ Cheers, Gilles On Mon, Aug 24, 2020 at 10:58 PM Patrick McNally via users wrote: >

Re: [OMPI users] ORTE HNP Daemon Error - Generated by Tweaking MTU

2020-08-09 Thread Gilles Gouaillardet via users
and a multithreaded BLAS (e.g. PxQ = 2x4 and 4 OpenMP threads per MPI task) Cheers, Gilles On Mon, Aug 10, 2020 at 3:31 AM John Duffy via users wrote: > > Hi > > I have generated this problem myself by tweaking the MTU of my 8 node > Raspberry Pi 4 cluster to 9000 bytes, but I would be g
