Re: [OMPI users] MPI_type_free question

2020-12-15 Thread Patrick Bégou via users
Issue #8290 reported. Thanks all for your help and the workaround provided. Patrick. On 14/12/2020 at 17:40, Jeff Squyres (jsquyres) wrote: > Yes, opening an issue would be great -- thanks! > > >> On Dec 14, 2020, at 11:32 AM, Patrick Bégou via users

Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Patrick Bégou via users
und. > At first glance, I did not spot any issue in the current code. > It turned out that the memory leak disappeared when doing things > differently > > Cheers, > > Gilles > > On Mon, Dec 14, 2020 at 7:11 PM Patrick Bégou via users > <users@lists.open-mpi.org>

Re: [OMPI users] MPI_type_free question

2020-12-14 Thread Patrick Bégou via users
a try? > > > Cheers, > > > Gilles > > > > On 12/7/2020 6:15 PM, Patrick Bégou via users wrote: >> Hi, >> >> I've written a small piece of code to show the problem. Based on my >> application but 2D and using integer arrays for testing. >

Re: [OMPI users] MPI_type_free question

2020-12-10 Thread Patrick Bégou via users
efore be used to convert back >>> into a valid datatype pointer, until OMPI completely releases the >>> datatype. Look into the ompi_datatype_f_to_c_table table to see the >>> datatypes that exist and get their pointers, and then use these >>> pointers as arguments to ompi_da

Re: [OMPI users] MPI_type_free question

2020-12-07 Thread Patrick Bégou via users
but deeper in the code I think. Patrick. On 04/12/2020 at 19:20, George Bosilca wrote: > On Fri, Dec 4, 2020 at 2:33 AM Patrick Bégou via users > <users@lists.open-mpi.org> wrote: > > Hi George and Gilles, > > Thanks George for your suggestion. Is it valua

Re: [OMPI users] MPI_type_free question

2020-12-07 Thread Patrick Bégou via users
refore be used to convert back into a >> valid datatype pointer, until OMPI completely releases the datatype. >> Look into the ompi_datatype_f_to_c_table table to see the datatypes >> that exist and get their pointers, and then use these pointers as >> arguments to ompi_datatype

Re: [OMPI users] MPI_type_free question

2020-12-04 Thread Patrick Bégou via users
are used? > > > mpirun --mca pml_base_verbose 10 --mca mtl_base_verbose 10 --mca btl_base_verbose 10 ... > > will point you to the component(s) used. > > The output is pretty verbose, so feel free to compress and post it if > you cannot decipher it > > > Cheers, > >

Re: [OMPI users] MPI_type_free question

2020-12-03 Thread Patrick Bégou via users
ntil OMPI completely releases the datatype. >> Look into the ompi_datatype_f_to_c_table table to see the datatypes >> that exist and get their pointers, and then use these pointers as >> arguments to ompi_datatype_dump() to see if any of these existing >> datatypes are the ones you d
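George's suggestion above relies on Open MPI internals rather than the public MPI API. A rough sketch of that inspection is given below; it assumes building against the Open MPI source tree, and header paths and exact declarations may differ between versions, so treat it only as a starting point.

/* Rough sketch of the inspection suggested in this thread, using Open MPI
 * *internal* symbols (ompi_datatype_f_to_c_table, ompi_datatype_dump).
 * These are not part of the public MPI API; names and headers are taken
 * from the discussion and may vary between Open MPI versions. */
#include "ompi/datatype/ompi_datatype.h"
#include "opal/class/opal_pointer_array.h"
#include <stdio.h>

static void dump_known_datatypes(void)
{
    int n = opal_pointer_array_get_size(&ompi_datatype_f_to_c_table);

    for (int i = 0; i < n; i++) {
        ompi_datatype_t *dt = (ompi_datatype_t *)
            opal_pointer_array_get_item(&ompi_datatype_f_to_c_table, i);
        if (NULL == dt) {
            continue;                 /* slot already released */
        }
        printf("f2c index %d:\n", i);
        ompi_datatype_dump(dt);       /* prints the datatype description */
    }
}

Calling something like dump_known_datatypes() (a hypothetical helper name) from the application at a few time steps shows whether derived datatypes accumulate in the table instead of being released.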

[OMPI users] MPI_type_free question

2020-12-03 Thread Patrick Bégou via users
Hi, I'm trying to solve a memory leak since my new implementation of communications based on MPI_Alltoallw and MPI_Type_create_subarray calls. Arrays of subarray types are created/destroyed at each time step and used for communications. On my laptop the code runs fine (running for 15000
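For context when browsing the archive: the communication pattern under discussion looks roughly like the C sketch below (the original reproducer is Fortran and 2D; the array sizes, decomposition and step count here are illustrative assumptions, not the poster's code). One subarray type per peer is created at every time step, used in MPI_Alltoallw, and then freed again.

/* Minimal sketch of the create/use/free pattern described in this thread.
 * Assumes nprocs divides 64; sizes are arbitrary. Compile with mpicc. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int sizes[2]    = {64, 64};               /* local 2D array */
    int subsizes[2] = {64, 64 / nprocs};      /* one column slice per peer */

    int *sendbuf = calloc(64 * 64, sizeof(int));
    int *recvbuf = calloc(64 * 64, sizeof(int));
    int *counts  = malloc(nprocs * sizeof(int));
    int *displs  = malloc(nprocs * sizeof(int));
    MPI_Datatype *types = malloc(nprocs * sizeof(MPI_Datatype));

    for (int step = 0; step < 1000; step++) {     /* one iteration per time step */
        for (int p = 0; p < nprocs; p++) {
            int starts[2] = {0, p * subsizes[1]};
            MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                     MPI_ORDER_C, MPI_INT, &types[p]);
            MPI_Type_commit(&types[p]);
            counts[p] = 1;
            displs[p] = 0;                        /* offsets live in the datatype */
        }

        MPI_Alltoallw(sendbuf, counts, displs, types,
                      recvbuf, counts, displs, types, MPI_COMM_WORLD);

        for (int p = 0; p < nprocs; p++)
            MPI_Type_free(&types[p]);             /* freed every step */
    }

    free(sendbuf); free(recvbuf); free(counts); free(displs); free(types);
    MPI_Finalize();
    return 0;
}

Even though MPI_Type_free is called on every handle at every step, the thread reports memory that keeps growing in some configurations, which is what eventually led to issue #8290 being opened.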

Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Patrick Bégou via users
> Sharing a reproducer will be very much appreciated in order to improve ompio > > Cheers, > > Gilles > > On Thu, Dec 3, 2020 at 6:05 PM Patrick Bégou via users > wrote: >> Thanks Gilles, >> >> this is the solution. >> I will set OMPI_MCA_io=^ompio aut

Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Patrick Bégou via users
. > > > You can force romio with > > mpirun --mca io ^ompio ... > > > Cheers, > > > Gilles > > On 12/3/2020 4:20 PM, Patrick Bégou via users wrote: >> Hi, >> >> I'm using an old (but required by the codes) version of hdf5 (1.8.12) in >>

[OMPI users] Parallel HDF5 low performance

2020-12-02 Thread Patrick Bégou via users
Hi, I'm using an old (but required by the codes) version of hdf5 (1.8.12) in parallel mode in two Fortran applications. It relies on MPI-IO. The storage is NFS mounted on the nodes of a small cluster. With OpenMPI 1.7 it runs fine but using modern OpenMPI 3.1 or 4.0.5 the I/Os are 10x to 100x
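One way to separate HDF5 from the underlying MPI-IO layer is a minimal collective write test such as the sketch below (file name, buffer size and timing are arbitrary illustrative choices, not part of the report). Running it once with the default io component and once with ompio disabled, as suggested elsewhere in this thread (mpirun --mca io ^ompio ...), indicates whether the slowdown comes from the io component or from HDF5 itself.

/* Minimal MPI-IO collective write test, independent of HDF5.
 * Run it in the same NFS-mounted directory with and without
 * "mpirun --mca io ^ompio" and compare the timings. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int count = 4 * 1024 * 1024;               /* 4 Mi doubles per rank */
    double *buf = malloc(count * sizeof(double));
    for (int i = 0; i < count; i++) buf[i] = (double) rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "mpiio_test.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    double t0 = MPI_Wtime();
    MPI_Offset offset = (MPI_Offset) rank * count * (MPI_Offset) sizeof(double);
    MPI_File_write_at_all(fh, offset, buf, count, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("wrote %.1f MiB per rank on %d ranks in %.3f s\n",
               count * sizeof(double) / 1048576.0, nprocs, t1 - t0);

    free(buf);
    MPI_Finalize();
    return 0;
}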

Re: [OMPI users] mpirun only work for 1 processor

2020-06-04 Thread Patrick Bégou via users
> the mpirun --version on all 3 nodes is identical, openmpi 2.1.1, and > in the same place when testing with "whereis mpirun" > So is there any problem with mpirun causing it not to launch on other > nodes? > > Regards > HaChi > > On Thu, 4 Jun 2020 at 14:35, Patric

Re: [OMPI users] mpirun only work for 1 processor

2020-06-04 Thread Patrick Bégou via users
Hi Ha Chi, do you use a batch scheduler with Rocks Cluster or do you log on the node with ssh? If ssh, can you check that you can ssh from one node to the other without a password? Ping just says the network is alive, not that you can connect. Patrick. On 04/06/2020 at 09:06, Hà Chi Nguyễn Nhật

[OMPI users] OpenMPI 4.3 without ucx

2020-05-10 Thread Patrick Bégou via users
Hi all, I've built OpenMPI 4.3.0 with GCC 9.3.0 but ucx was not available on the server when I set --with-ucx. I removed this option and it compiles fine without ucx. However I see a strange behavior: when using mpirun I must explicitly remove ucx to avoid an error; in my module file I have to

Re: [OMPI users] can't open /dev/ipath, network down (err=26)

2020-05-09 Thread Patrick Bégou via users
-and-i-o/fabric-products/OFED_Host_Software_UserGuide_G91902_06.pdf#page72> > > TS is the same hardware as the old QLogic QDR HCAs so the manual might > be helpful to you in the future. > > Sent from my iPad > >> On May 9, 2020, at 9:52 AM, Patrick Bégou via users >> wro

Re: [OMPI users] can't open /dev/ipath, network down (err=26)

2020-05-09 Thread Patrick Bégou via users
On 08/05/2020 at 21:56, Prentice Bisbal via users wrote: > > We often get the following errors when more than one job runs on the > same compute node. We are using Slurm with OpenMPI. The IB cards are > QLogic using PSM: > > 10698ipath_userinit: assign_context command failed: Network is down >

Re: [OMPI users] Can't start jobs with srun.

2020-05-07 Thread Patrick Bégou via users
so the problem was not critical for my future. Patrick > > On Sun, 26 Apr 2020 at 18:09, Patrick Bégou via users > <users@lists.open-mpi.org> wrote: > > I also have this problem on servers I'm benchmarking at DELL's lab with > OpenMPI-4.0.3. I've tr

Re: [OMPI users] Can't start jobs with srun.

2020-04-26 Thread Patrick Bégou via users
I also have this problem on servers I'm benchmarking at DELL's lab with OpenMPI-4.0.3. I've tried a new build of OpenMPI with "--with-pmi2". No change. Finally my workaround in the slurm script was to launch my code with mpirun. As mpirun was only finding one slot per node I have used

Re: [OMPI users] opal_path_nfs freeze

2020-04-23 Thread Patrick Bégou via users
of saying: make sure that you have no other Open > MPI installation findable in your PATH / LD_LIBRARY_PATH and then try > running `make check` again. > > >> On Apr 21, 2020, at 2:37 PM, Patrick Bégou via users >> <users@lists.open-mpi.org> wrote: >> >> Hi Ope

[OMPI users] opal_path_nfs freeze

2020-04-21 Thread Patrick Bégou via users
Hi OpenMPI maintainers, I have temporary access to servers with AMD Epyc processors running RHEL7. I'm trying to deploy OpenMPI with several setups but each time "make check" fails on *opal_path_nfs*. This test freezes forever, consuming no CPU resources. After nearly one hour I have killed the

Re: [OMPI users] file/process write speed is not scalable

2020-04-14 Thread Patrick Bégou via users
Hi David, could you specify which version of OpenMPI you are using? I also have some parallel I/O trouble with one code but have not investigated yet. Thanks, Patrick. On 13/04/2020 at 17:11, Dong-In Kang via users wrote: > > Thank you for your suggestion. > I am more concerned about the poor

Re: [OMPI users] OpenMPI 3 without network connection

2019-01-29 Thread Patrick Bégou
>> >> The root cause is we do not include the localhost interface by >> default for OOB communications. >> >> >> You should be able to run with >> >> mpirun --mca oob_tcp_if_include lo -np 4 hostname >> >> >> Cheers, >>

Re: [OMPI users] OpenMPI 3 without network connection

2019-01-28 Thread Patrick Bégou
> Does “no network is available” mean the lo interface (localhost > 127.0.0.1) is not even available? > > Cheers, > > Gilles > > On Monday, January 28, 2019, Patrick Bégou > <patrick.be...@legi.grenoble-inp.fr> wrote: > > Hi, > >

[OMPI users] OpenMPI 3 without network connection

2019-01-28 Thread Patrick Bégou
Hi, I ran into a strange problem with OpenMPI 3.1 installed on a CentOS7 laptop. If no network is available I cannot launch a local MPI job on the laptop: bash-4.2$ mpirun -np 4 hostname -- No network interfaces were found

Re: [OMPI users] [version 2.1.5] invalid memory reference

2018-10-12 Thread Patrick Bégou
I have downloaded the nightly snapshot tarball of October 10th 2018 for the 3.1 version and it solves the memory problem. I ran my test case on 1, 2, 4, 10, 16, 20, 32, 40, and 64 cores successfully. This version also allows me to compile my prerequisite libraries, so we can use it out of the box to

Re: [OMPI users] memory per core/process

2013-03-30 Thread Patrick Bégou
ory he has to request 2 cores, even if he uses a sequential code. This avoids crashing jobs of other users on the same node with memory requirements. But this is not configured on your node. Duke Nguyen wrote: On 3/30/13 3:13 PM, Patrick Bégou wrote: I do not know about your code but: 1) did

Re: [OMPI users] memory per core/process

2013-03-30 Thread Patrick Bégou
I do not know about your code but: 1) did you check stack limitations? Typically Intel Fortran codes need a large amount of stack when the problem size increases. Check ulimit -a 2) does your node use cpusets and memory limitations like fake NUMA to set the maximum amount of memory available
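For reference, the limits mentioned in points 1) and 2) can also be checked from inside a program; the small C sketch below is purely illustrative and only mirrors what ulimit -s and ulimit -v report.

/* Illustrative check of the soft stack and address-space limits,
 * equivalent to "ulimit -s" and "ulimit -v" on Linux. */
#include <stdio.h>
#include <sys/resource.h>

static void show(const char *name, int resource)
{
    struct rlimit rl;
    if (getrlimit(resource, &rl) == 0) {
        if (rl.rlim_cur == RLIM_INFINITY)
            printf("%-14s soft limit: unlimited\n", name);
        else
            printf("%-14s soft limit: %llu kB\n", name,
                   (unsigned long long) (rl.rlim_cur / 1024));
    }
}

int main(void)
{
    show("stack", RLIMIT_STACK);          /* what "ulimit -s" reports */
    show("address space", RLIMIT_AS);     /* what "ulimit -v" reports */
    return 0;
}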