Hi,
Angel de Vicente via users writes:
> I have tried:
> + /etc/pmix-mca-params.conf
> + /usr/lib/x86_64-linux-gnu/pmix2/etc/pmix-mca.params.conf
> but no luck.
Never mind, /etc/openmpi/pmix-mca-params.conf was the right one.
Cheers,
--
Ángel de Vicente
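For reference, that file takes plain "key = value" MCA parameter lines. A minimal sketch; the specific value below is the workaround suggested in the linked GitHub issue (open-mpi/ompi#7516), not something confirmed in this thread, so verify it against your PMIx version:

```
# /etc/openmpi/pmix-mca-params.conf
# Force the hash gds component instead of ds12 (assumed workaround
# from open-mpi/ompi#7516; verify for your PMIx version).
gds = hash
```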
Hello,
with our current setup of OpenMPI and Slurm on an Ubuntu 22.04 server,
when we submit MPI jobs we get the message:
PMIX ERROR: ERROR in file
../../../../../../src/mca/gds/ds12/gds_ds12_lock_pthread.c at line 169
Following https://github.com/open-mpi/ompi/issues/7516, I tried setting
PMIX_
Hello,
thanks for your help and suggestions.
In the end it was not an issue with OpenMPI or any other system
component, but rather a single line in our code. I thought I was
running the tests with the -fbounds-check option, but it turns out I
was not, arrrghh!! At some point I was writing outside one
Hello,
"Keller, Rainer" writes:
> You’re using MPI_Probe() with Threads; that’s not safe.
> Please consider using MPI_Mprobe() together with MPI_Mrecv().
many thanks for the suggestion. I will try with the M variants, though I
was under the impression that mpi_probe() was OK as long as one made
Hello Jeff,
"Jeff Squyres (jsquyres)" writes:
> With THREAD_FUNNELED, it means that there can only be one thread in
> MPI at a time -- and it needs to be the same thread as the one that
> called MPI_INIT_THREAD.
>
> Is that the case in your app?
the master rank (i.e. 0) never creates threads,
Thanks Gilles,
Gilles Gouaillardet via users writes:
> You can first double check you
> MPI_Init_thread(..., MPI_THREAD_MULTIPLE, ...)
my code uses "mpi_thread_funneled" and OpenMPI was compiled with
MPI_THREAD_MULTIPLE support:
,----
| ompi_info | grep -i thread
| Thread support: p
Hello,
I'm running out of ideas, and wonder if someone here might have some
tips on how to debug a segmentation fault I'm having with my
application [due to the nature of the problem I suspect it could be
with OpenMPI itself rather than my app, though at this point I'm not
leaning strong
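Typical first steps for hunting a segfault in an MPI run, as a sketch (flags and core-file naming vary by system):

```
# Allow core dumps, reproduce, then inspect the backtrace:
ulimit -c unlimited
mpirun -np 4 ./app          # afterwards: gdb ./app core.<pid>, then "bt"
# Or run every rank under valgrind to catch out-of-bounds accesses:
mpirun -np 4 valgrind --error-exitcode=1 ./app
```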
Hello,
Joshua Ladd writes:
> These are very, very old versions of UCX and HCOLL installed in your
> environment. Also, MXM was deprecated years ago in favor of UCX. What
> version of MOFED is installed (run ofed_info -s)? What HCA generation
> is present (run ibstat).
MOFED is: MLNX_OFED_LINUX
Hello,
John Hearns via users writes:
> Stupid answer from me. If latency/bandwidth numbers are bad then check
> that you are really running over the interface that you think you
> should be. You could be falling back to running over Ethernet.
I'm quite out of my depth here, so all answers are h
Hello,
"Jeff Squyres (jsquyres)" writes:
> I'd recommend against using Open MPI v3.1.0 -- it's quite old. If you
> have to use Open MPI v3.1.x, I'd at least suggest using v3.1.6, which
> has all the rolled-up bug fixes on the v3.1.x series.
>
> That being said, Open MPI v4.1.2 is the most curre
Hello,
Gilles Gouaillardet via users writes:
> Infiniband detection likely fails before checking expanded verbs.
thanks for this. In the end, after playing a bit with different options,
I managed to install OpenMPI 3.1.0 OK in our cluster using UCX (I wanted
4.1.1, but that would not compile cl
Hi,
I'm trying to compile the latest OpenMPI version with Infiniband support
on our local cluster, but didn't get very far (since I'm installing this
via Spack, I also asked in their support group).
Spack is issuing the following configure step (see the opti
Hi,
Joshua Ladd writes:
> This is an ancient version of HCOLL. Please upgrade to the latest
> version (you can do this by installing HPC-X
> https://www.mellanox.com/products/hpc-x-toolkit)
Just to close the circle and inform that all seems OK now.
I don't have root permission in this machine
Hi,
Joshua Ladd writes:
> We cannot reproduce this. On four nodes 20 PPN with and w/o hcoll it
> takes exactly the same 19 secs (80 ranks).
>
> What version of HCOLL are you using? Command line?
Thanks for having a look at this.
According to ompi_info, our OpenMPI (version 3.0.1) was config
Hi,
George Bosilca writes:
> If I'm not mistaken, hcoll is playing with the opal_progress in a way
> that conflicts with the blessed usage of progress in OMPI and prevents
> other components from advancing and timely completing requests. The
> impact is minimal for sequential applications using
Hi,
in one of our codes, we want to create a log of events that happen in
the MPI processes, where the number of these events and their timing is
unpredictable.
So I implemented a simple test code, where process 0
creates a thread that is just busy-waiting for messages from any
process, and which
Brice Goglin writes:
> Ok, that's a very old kernel on a very old POWER processor, it's
> expected that hwloc doesn't get much topology information, and it's
> then expected that OpenMPI cannot apply most binding policies.
Just in case it can add anything, I tried with an older OpenMPI version
(
Brice Goglin wrote:
> What's this machine made of? (processor, etc)
> What kernel are you running ?
>
> Getting no "socket" or "package" at all is quite rare these days.
>
> Brice
>
> On 09/03/2017 15:28, Angel de Vicente wrote:
Hi again,
thanks for your help. I installed the latest OpenMPI (2.0.2).
lstopo output:
,----
| lstopo --version
| lstopo 1.11.2
|
| lstopo
| Machine (7861MB)
| L2 L#0 (1024KB) + L1d L#0 (32KB) + L1i L#0 (64KB) + Core L#0 + PU L#0
| (P#0)
| L2 L#1 (1024KB) + L1d L#1 (32KB) + L1i L#1 (64KB
Hi,
Gilles Gouaillardet writes:
> Can you run
> lstopo
> in your machine, and post the output ?
no lstopo in my machine. This is part of hwloc, right?
> can you also try
> mpirun --map-by socket --bind-to socket ...
> and see if it helps ?
same issue.
Perhaps I need to compile hwloc as well?
Hi,
Gilles Gouaillardet writes:
> which version of ompi are you running ?
2.0.1
> this error can occur on systems with no NUMA object (e.g. single
> socket with hwloc < 2)
> as a workaround, you can
> mpirun --map-by socket ...
with --map-by socket I get exactly the same issue (both in the log
Hi,
I'm trying to get OpenMPI running on a new machine, and I came across
an error message that I hadn't seen before.
,----
| can@login1:> mpirun -np 1 ./code config.txt
| --
| No objects of the specified type were found on
Hi,
Reuti writes:
> At first I thought you want to run a queuing system inside a queuing
> system, but this looks like you want to replace the resource manager.
yes, if this could work reasonably well, we could do without the
resource manager.
> Under which user account the DVM daemons will run
Hi,
"r...@open-mpi.org" writes:
>> With the DVM, is it possible to keep these jobs in some sort of queue,
>> so that they will be executed when the cores get free?
>
> It wouldn’t be hard to do so - as long as it was just a simple FIFO
> scheduler. I wouldn’t want it to get too complex.
a simpl
Hi,
"r...@open-mpi.org" writes:
> You might want to try using the DVM (distributed virtual machine)
> mode in ORTE. You can start it on an allocation using the “orte-dvm”
> cmd, and then submit jobs to it with “mpirun --hnp <foo>”, where foo
> is either the contact info printed out by orte-dvm, or the
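The DVM workflow described above can be sketched as follows (flag names from ORTE-era Open MPI; the URI file name is illustrative):

```
# Start the DVM once on the allocation, writing its contact info:
orte-dvm --report-uri dvm-uri.txt &
# Then submit any number of jobs against the running DVM:
mpirun --hnp file:dvm-uri.txt -np 16 ./job1
mpirun --hnp file:dvm-uri.txt -np 16 ./job2
```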
Hi,
"Jeff Squyres (jsquyres)" writes:
> The list of names in the hostfile specifies the servers that will be used,
> not the network interfaces. Have a look at the TCP portion of the FAQ:
>
> http://www.open-mpi.org/faq/?category=tcp
Thanks a lot for this.
Now it works OK if I run it lik
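The relevant knob from that FAQ section is the TCP BTL interface list; a sketch, with interface names as examples only:

```
# Restrict Open MPI's TCP BTL to a specific interface:
mpirun --mca btl_tcp_if_include eth1 -np 4 ./code
# ...or exclude interfaces it should never use:
mpirun --mca btl_tcp_if_exclude lo,virbr0 -np 4 ./code
```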
Hi again,
Angel de Vicente writes:
> yes, that's just what I did with orted. I saw the port that it was
> trying to connect and telnet to it, and I got "No route to host", so
> that's why I was going the firewall path. Hopefully the sysadmins can
> disable the f
Hi,
"Jeff Squyres (jsquyres)" writes:
>>> I'm starting to think that perhaps is a firewall issue? I don't have
>>> root access in these machines but I'll try to investigate.
> A simple test is to try any socket-based server app between the two
> machines that opens a random listening socket. Tr
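Such a socket test can be done with netcat; a sketch, with an arbitrary port number:

```
# On machine A, listen on an unused port:
nc -l 12345
# On machine B, try to connect; a quick "No route to host" or a
# silent timeout suggests a firewall rather than an MPI problem:
nc machineA 12345
```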
Hi,
Ralph Castain writes:
> On May 4, 2013, at 4:54 PM, Angel de Vicente wrote:
>>
>> Is there any way to dump details of what OpenMPI is trying to do in each
>> node, so I can see if it is looking for different libraries in each
>> node, or something similar?
Hi,
I have used OpenMPI before without any troubles, and have configured
MPICH, MPICH2 and OpenMPI on many different machines before, but recently we
upgraded the OS to Fedora 17, and now I'm having trouble running an MPI
code in two of our machines connected via a switch.
I thought perhaps the old in