Re: [OMPI users] openib/mpi_alloc_mem pathology [#20160912-1315]

2017-10-20 Thread Paul Kapinos
On 10/20/2017 12:24 PM, Dave Love wrote: > Paul Kapinos writes: >> Hi all, sorry for the long long latency - this message was buried in my mailbox for months. >> On 03/16/2017 10:35 AM, Alfio Lazzaro wrote: >>> Hello Dave and others, we are jumping into the discussion as CP2K developers...

Re: [OMPI users] openib/mpi_alloc_mem pathology [#20160912-1315]

2017-10-20 Thread Dave Love
Paul Kapinos writes: > Hi all, sorry for the long long latency - this message was buried in my mailbox for months. > On 03/16/2017 10:35 AM, Alfio Lazzaro wrote: >> Hello Dave and others, we are jumping into the discussion as CP2K developers. >> We would like to ask you which version of CP2K...

Re: [OMPI users] openib/mpi_alloc_mem pathology [#20160912-1315]

2017-10-19 Thread Paul Kapinos
Hi all, sorry for the long long latency - this message was buried in my mailbox for months. On 03/16/2017 10:35 AM, Alfio Lazzaro wrote: > Hello Dave and others, we are jumping into the discussion as CP2K developers. > We would like to ask you which version of CP2K you are using in your tests...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-21 Thread Dave Love
I wrote: > But it works OK with libfabric (ofi mtl). Is there a problem with libfabric? Apparently there is, or at least with ompi 1.10. I've now realized that IMB pingpong latency on a QDR IB system with ompi 1.10.6+libfabric is ~2.5μs, which it isn't with ompi 1.6 openib.
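
For context, the figure above comes from the IMB PingPong test. A minimal sketch of that style of measurement in C, assuming exactly two ranks; the message size and iteration count here are illustrative choices, not the IMB defaults:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char buf[8] = {0};        /* small message, so the loop is latency-bound */
    const int iters = 10000;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)   /* one-way latency is half the round-trip time */
        printf("latency: %.2f us\n", (t1 - t0) / iters / 2 * 1e6);

    MPI_Finalize();
    return 0;
}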

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-16 Thread Jeff Squyres (jsquyres)
Ok. I talked with Nathan about this a bit. Here's what we think we should do: 1. Add an MCA param to disable (de)registration as part of ALLOC/FREE_MEM. Because that's just the Open MPI way (moar MCA paramz!). 2. If memory hooks are enabled, default to not (de)registering as part of ALLOC/FREE_MEM...
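
Once such a parameter exists it would normally be set with "mpirun --mca <name> <value>" or an OMPI_MCA_<name> environment variable; it could also be flipped programmatically through the standard MPI_T control-variable interface. A hedged sketch of the MPI_T route follows; the variable name in it is invented, since the parameter was only being proposed at this point in the thread:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, index, count, value = 0;   /* 0 = don't (de)register */
    MPI_T_cvar_handle handle;

    /* MPI_T may be initialized before MPI_Init, which is the right time
     * to write control variables that affect startup behavior. */
    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);

    /* Hypothetical name for the parameter proposed above. */
    if (MPI_T_cvar_get_index("mpi_alloc_mem_register", &index) == MPI_SUCCESS) {
        MPI_T_cvar_handle_alloc(index, NULL, &handle, &count);
        MPI_T_cvar_write(handle, &value);
        MPI_T_cvar_handle_free(&handle);
    } else {
        fprintf(stderr, "cvar not found (expected: the name is hypothetical)\n");
    }

    MPI_Init(&argc, &argv);
    /* ... MPI_Alloc_mem/MPI_Free_mem would now skip registration ... */
    MPI_Finalize();
    MPI_T_finalize();
    return 0;
}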

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-16 Thread Paul Kapinos
Jeff, I confirm: your patch did it. (Tried on 1.10.6 - did not even need to rebuild cp2k.popt, just load another Open MPI version compiled with Jeff's patch.) (On Intel OmniPath: the same speed as with --mca btl ^tcp,openib.) On 03/16/17 01:03, Jeff Squyres (jsquyres) wrote: > It looks like there were 3 separate threads...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-16 Thread Jingchao Zhang
...Jingchao. From: users on behalf of Jeff Squyres (jsquyres). Sent: Thursday, March 16, 2017 8:46:30 AM. To: Open MPI User's List. Subject: Re: [OMPI users] openib/mpi_alloc_mem pathology. On Mar 16, 2017, at 10:37 AM, Jingchao Zhang wrote: > One of my earlier replies includes the backtraces...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-16 Thread Jeff Squyres (jsquyres)
On Mar 16, 2017, at 10:37 AM, Jingchao Zhang wrote: > One of my earlier replies includes the backtraces of the cp2k.popt process and the problem points to MPI_ALLOC_MEM/MPI_FREE_MEM. > https://mail-archive.com/users@lists.open-mpi.org/msg30587.html Yep -- saw it. That -- paired with the profil...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-16 Thread Jingchao Zhang
...backtrace? --Jingchao. From: users on behalf of Jeff Squyres (jsquyres). Sent: Wednesday, March 15, 2017 6:42:44 PM. To: Open MPI User's List. Subject: Re: [OMPI users] openib/mpi_alloc_mem pathology. On Mar 15, 2017, at 8:25 PM, Jeff Hammond wrote: > I couldn't find the docs on mpool_hints...

Re: [OMPI users] openib/mpi_alloc_mem pathology [#20160912-1315]

2017-03-16 Thread Paul Kapinos
Hi, On 03/16/17 10:35, Alfio Lazzaro wrote: > We would like to ask you which version of CP2K you are using in your tests -- Release 4.1 -- > and if you can share with us your input file and output log -- the question goes to Mr Mathias Schumacher, on CC. Best, Paul Kapinos (Our internal ticketing system...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-16 Thread Alfio Lazzaro
Hello Dave and others, we are jumping into the discussion as CP2K developers. We would like to ask you which version of CP2K you are using in your tests and if you can share with us your input file and output log. Some clarifications on the way we use MPI allocate/free: 1) only buffers used for MPI communication...
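
To make that usage concrete: a minimal sketch of the allocate/communicate/free pattern described above, with an invented helper name. With the openib BTL, each alloc/free pair may (de)register the buffer, which is the per-call cost this thread is chasing:

#include <mpi.h>

/* Invented helper, sketching the pattern: a communication buffer is
 * requested from MPI, used for one exchange, and returned. */
void exchange_step(MPI_Comm comm, int peer, MPI_Aint nbytes)
{
    void *buf;
    MPI_Alloc_mem(nbytes, MPI_INFO_NULL, &buf);   /* may register memory */
    MPI_Sendrecv_replace(buf, (int)nbytes, MPI_BYTE, peer, 0,
                         peer, 0, comm, MPI_STATUS_IGNORE);
    MPI_Free_mem(buf);                            /* may deregister memory */
}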

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-15 Thread Jeff Hammond
On Wed, Mar 15, 2017 at 5:44 PM Jeff Squyres (jsquyres) wrote: > On Mar 15, 2017, at 8:25 PM, Jeff Hammond wrote: >> I couldn't find the docs on mpool_hints, but shouldn't there be a way to disable registration via MPI_Info rather than patching the source? > Yes; that's what I was thinking...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-15 Thread Jeff Squyres (jsquyres)
On Mar 15, 2017, at 8:25 PM, Jeff Hammond wrote: > I couldn't find the docs on mpool_hints, but shouldn't there be a way to disable registration via MPI_Info rather than patching the source? Yes; that's what I was thinking, but wanted to get the data point first. Specifically: if this test...
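
A sketch of what the MPI_Info route could look like. The standard lets an implementation interpret, or silently ignore, info keys passed to MPI_Alloc_mem; the key/value spelling below is an assumption, since - as noted above - the mpool_hints docs were not to be found:

#include <mpi.h>

/* Hedged sketch: ask the implementation to skip registration via an
 * info hint.  The key/value pair ("mpool_hints" = "none") is assumed,
 * not documented behavior; unknown keys are ignored per the standard,
 * so this degrades gracefully. */
void *alloc_maybe_unregistered(MPI_Aint nbytes)
{
    void *buf;
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "mpool_hints", "none");    /* assumed spelling */
    MPI_Alloc_mem(nbytes, info, &buf);
    MPI_Info_free(&info);
    return buf;   /* release later with MPI_Free_mem(buf) */
}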

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-15 Thread Jeff Hammond
I couldn't find the docs on mpool_hints, but shouldn't there be a way to disable registration via MPI_Info rather than patching the source? Jeff. PS: Jeff Squyres: ;-) ;-) ;-) On Wed, Mar 15, 2017 at 5:03 PM, Jeff Squyres (jsquyres) wrote: > It looks like there were 3 separate threads on this CP2K issue...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-15 Thread Jeff Squyres (jsquyres)
It looks like there were 3 separate threads on this CP2K issue, but I think we developers got sidetracked because there was a bunch of talk in the other threads about PSM, non-IB(verbs) networks, etc. So: the real issue is that an app is experiencing a lot of slowdown when calling MPI_ALLOC_MEM/MPI_FREE_MEM...
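
The slowdown is easy to see in isolation: time a tight MPI_ALLOC_MEM/MPI_FREE_MEM loop and compare runs with and without the openib BTL (e.g. with and without --mca btl ^openib). A minimal sketch; buffer size and iteration count are arbitrary choices for illustration:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    const int iters = 1000;
    const MPI_Aint nbytes = 1 << 20;   /* 1 MiB per allocation */

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        void *buf;
        MPI_Alloc_mem(nbytes, MPI_INFO_NULL, &buf);  /* may register */
        MPI_Free_mem(buf);                           /* may deregister */
    }
    double t1 = MPI_Wtime();

    printf("%.2f us per alloc/free pair\n", (t1 - t0) / iters * 1e6);
    MPI_Finalize();
    return 0;
}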

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-15 Thread Dave Love
Paul Kapinos writes: > Nathan, unfortunately '--mca memory_linux_disable 1' does not help on this issue - it does not change the behaviour at all. Note that the pathological behaviour is present in Open MPI 2.0.2 as well as in 1.10.x, and only Intel OmniPath (OPA) network-capable nodes are affected...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-13 Thread Paul Kapinos
Nathan, unfortunately '--mca memory_linux_disable 1' does not help on this issue - it does not change the behaviour at all. Note that the pathological behaviour is present in Open MPI 2.0.2 as well as in 1.10.x, and only Intel OmniPath (OPA) network-capable nodes are affected. The known workaround...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-09 Thread Dave Love
Nathan Hjelm writes: > If this is with 1.10.x or older, run with --mca memory_linux_disable 1. There is a bad interaction between ptmalloc2 and psm2 support. This problem is not present in v2.0.x and newer. Is that applicable to openib too?

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-09 Thread Dave Love
Paul Kapinos writes: > Hi Dave, > On 03/06/17 18:09, Dave Love wrote: >> I've been looking at a new version of an application (cp2k, for what it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't... > Welcome to the club! :o) > In our measurements we see some 70% of time in 'mpi_free_mem'...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-07 Thread Nathan Hjelm
If this is with 1.10.x or older, run with --mca memory_linux_disable 1. There is a bad interaction between ptmalloc2 and psm2 support. This problem is not present in v2.0.x and newer. -Nathan. On Mar 7, 2017, at 10:30 AM, Paul Kapinos wrote: > Hi Dave, > On 03/06/17 18:09, Dave Love...

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-07 Thread Paul Kapinos
Hi Dave, On 03/06/17 18:09, Dave Love wrote: > I've been looking at a new version of an application (cp2k, for what it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't... Welcome to the club! :o) In our measurements we see some 70% of time in 'mpi_free_mem'... and 15x performance...