Re: [OMPI users] ucx configuration

2023-01-11 Thread Dave Love via users
Gilles Gouaillardet via users writes: > Dave, > > If there is a bug you would like to report, please open an issue at > https://github.com/open-mpi/ompi/issues and provide all the required > information > (in this case, it should also include the UCX library you are using and how > it was

[OMPI users] ucx configuration

2023-01-05 Thread Dave Love via users
I see assorted problems with OMPI 4.1 on IB, including failing many of the mpich tests (non-mpich-specific ones) particularly with RMA. Now I wonder if UCX build options could have anything to do with it, but I haven't found any relevant information. What configure options would be recommended

Re: [OMPI users] vectorized reductions

2021-07-20 Thread Dave Love via users
Gilles Gouaillardet via users writes: > One motivation is packaging: a single Open MPI implementation has to be > built, that can run on older x86 processors (supporting only SSE) and the > latest ones (supporting AVX512). I take dispatch on micro-architecture for granted, but it doesn't

[OMPI users] vectorized reductions

2021-07-19 Thread Dave Love via users
I meant to ask a while ago about vectorized reductions after I saw a paper that I can't now find. I didn't understand what was behind it. Can someone explain why you need to hand-code the avx implementations of the reduction operations now used on x86_64? As far as I remember, the paper didn't

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-18 Thread Dave Love via users
"Gabriel, Edgar via users" writes: >> How should we know that's expected to fail? It at least shouldn't fail like >> that; set_atomicity doesn't return an error (which the test is prepared for >> on a filesystem like pvfs2). >> I assume doing nothing, but appearing to, can lead to corrupt

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-15 Thread Dave Love via users
"Gabriel, Edgar via users" writes: > I will have a look at those tests. The recent fixes were not > correctness, but performance fixes. > Nevertheless, we used to pass the mpich tests, but I admit that it is > not a testsuite that we run regularly, I will have a look at them. The > atomicity

Re: [OMPI users] bad defaults with ucx

2021-01-14 Thread Dave Love via users
"Jeff Squyres (jsquyres)" writes: > Good question. I've filed > https://github.com/open-mpi/ompi/issues/8379 so that we can track > this. For the benefit of the list: I mis-remembered that osc=ucx was general advice. The UCX docs just say you need to avoid the uct btl, which can cause memory

[OMPI users] bad defaults with ucx

2021-01-14 Thread Dave Love via users
Why does 4.1 still not use the right defaults with UCX? Without specifying osc=ucx, IMB-RMA crashes like 4.0.5. I haven't checked what else it is that UCX says you must set for openmpi to avoid memory corruption, at least, but I guess that won't be right either. Users surely shouldn't have to explore

[OMPI users] 4.1 mpi-io test failures on lustre

2021-01-14 Thread Dave Love via users
I tried mpi-io tests from mpich 4.3 with openmpi 4.1 on the ac922 system that I understand was used to fix ompio problems on lustre. I'm puzzled that I still see failures. I don't know why there are disjoint sets in mpich's test/mpi/io and src/mpi/romio/test, but I ran all the non-Fortran ones

Re: [OMPI users] [EXTERNAL] RMA breakage

2020-12-11 Thread Dave Love via users
"Pritchard Jr., Howard" writes: > Hello Dave, > > There's an issue opened about this - > > https://github.com/open-mpi/ompi/issues/8252 Thanks. I don't know why I didn't find that, unless I searched before it appeared. Obviously I was wrong to think it didn't look system-specific without time

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-12-07 Thread Dave Love via users
Ralph Castain via users writes: > Just a point to consider. OMPI does _not_ want to get in the mode of > modifying imported software packages. That is a blackhole of effort we > simply cannot afford. It's already done that, even in flatten.c. Otherwise updating to the current version would be

[OMPI users] RMA breakage

2020-12-07 Thread Dave Love via users
After seeing several failures with RMA with the change needed to get 4.0.5 through IMB, I looked for simple tests. So, I built the mpich 3.4b1 tests -- or the ones that would build, and I haven't checked why some fail -- and ran the rma set. Three out of 180 passed. Many (most?) aborted in ucx,

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-12-02 Thread Dave Love via users
Mark Allen via users writes: > At least for the topic of why romio fails with HDF5, I believe this is the > fix we need (has to do with how romio processes the MPI datatypes in its > flatten routine). I made a different fix a long time ago in SMPI for that, > then somewhat more recently it was

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-30 Thread Dave Love via users
As a check of mpiP, I ran HDF5 testpar/t_bigio under it. This was on one node with four ranks (interactively) on lustre with its default of one 1MB stripe, ompi-4.0.5 + ucx-1.9, hdf5-1.10.7, MCA defaults. I don't know how useful it is, but here's the summary: romio: @--- Aggregate Time (top

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-27 Thread Dave Love via users
Mark Dixon via users writes: > But remember that IMB-IO doesn't cover everything. I don't know what useful operations it omits, but it was the obvious thing to run, that should show up pathology, with simple things first. It does at least run, which was the first concern. > For example, hdf5's

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-25 Thread Dave Love via users
I wrote: > The perf test says romio performs a bit better. Also -- from overall > time -- it's faster on IMB-IO (which I haven't looked at in detail, and > ran with suboptimal striping). I take that back. I can't reproduce a significant difference for total IMB-IO runtime, with both run in

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-23 Thread Dave Love via users
Mark Dixon via users writes: > Surely I cannot be the only one who cares about using a recent openmpi > with hdf5 on lustre? I generally have similar concerns. I dug out the romio tests, assuming something more basic is useful. I ran them with ompi 4.0.5+ucx on Mark's lustre system (similar

[OMPI users] experience on POWER?

2020-10-24 Thread Dave Love via users
Can anyone report experience with recent OMPI on POWER (ppc64le) hardware, e.g. Summit? When I tried on similar nodes to Summit's (but fewer!), the IMB-RMA benchmark SEGVs early on. Before I try to debug it, I'd be interested to know if anyone else has investigated that or had better luck and,

Re: [OMPI users] relocating an installation

2019-04-10 Thread Dave Love
In fact, setting OPAL_PREFIX doesn't work for a relocated tree (with OMPI 1.10 or 3.0). You also need $OPAL_PREFIX/lib and $OPAL_PREFIX/lib/openmpi on LD_LIBRARY_PATH (assuming $MPI_LIB=$MPI_HOME/lib): $ OPAL_PREFIX=$(pwd)/usr/lib64/openmpi3 ./usr/lib64/openmpi3/bin/mpirun mpirun true
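
As a concrete sketch of that workaround (paths illustrative, for a tree unpacked under the current directory):

  export OPAL_PREFIX=$(pwd)/usr/lib64/openmpi3
  export LD_LIBRARY_PATH=$OPAL_PREFIX/lib:$OPAL_PREFIX/lib/openmpi:$LD_LIBRARY_PATH
  $OPAL_PREFIX/bin/mpirun -np 2 true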

Re: [OMPI users] relocating an installation

2019-04-10 Thread Dave Love
"Jeff Squyres (jsquyres) via users" writes: > Reuti's right. > > Sorry about the potentially misleading use of "--prefix" -- we > basically inherited that CLI option from a different MPI > implementation (i.e., people asked for it). So we were locked into > that meaning for the "--prefix" CLI

Re: [OMPI users] relocating an installation

2019-04-10 Thread Dave Love
Reuti writes: >> It should be documented. > > There is this FAQ entry: > > https://www.open-mpi.org/faq/?category=building#installdirs For what it's worth, I looked under "running" in the FAQ, as I was after a runtime switch. I expect FAQs to point to the actual documentation, though, and an

Re: [OMPI users] relocating an installation

2019-04-09 Thread Dave Love
Reuti writes: > export OPAL_PREFIX= > > to point it to the new location of installation before you start `mpiexec`. Thanks; that's now familiar, and I don't know how I missed it with strings. It should be documented. I'd have expected --prefix to have the same effect, and for there to be an

[OMPI users] relocating an installation

2019-04-09 Thread Dave Love
Is it possible to use the environment or mpirun flags to run an OMPI that's been relocated from where it was configured/installed? (Say you've unpacked a system package that expects to be under /usr and want to run it from home without containers etc.) I thought that was possible, but I haven't

Re: [OMPI users] filesystem-dependent failure building Fortran interfaces

2018-12-11 Thread Dave Love
Jeff Hammond writes: > Preprocessor is fine in Fortran compilers. We’ve used in NWChem for many > years, and NWChem supports “all the compilers”. > > Caveats: > - Cray dislikes recursive preprocessing logic that other compilers handle. > You won’t use this so please ignore. > - IBM XLF requires

Re: [OMPI users] filesystem-dependent failure building Fortran interfaces

2018-12-05 Thread Dave Love
"Jeff Squyres (jsquyres) via users" writes: > Hi Dave; thanks for reporting. > > Yes, we've fixed this -- it should be included in 4.0.1. > > https://github.com/open-mpi/ompi/pull/6121 Good, but I'm confused; I checked the repo before reporting it. [I wince at processing Fortran with cpp,

[OMPI users] filesystem-dependent failure building Fortran interfaces

2018-12-04 Thread Dave Love
If you try to build somewhere out of tree, not in a subdir of the source, the Fortran build is likely to fail because mpi-ext-module.F90 does include '/openmpi-4.0.0/ompi/mpiext/pcollreq/mpif-h/mpiext_pcollreq_mpifh.h' and can exceed the fixed line length. It either needs to add (the

Re: [OMPI users] ompio on Lustre

2018-10-16 Thread Dave Love
"Gabriel, Edgar" writes: > a) if we detect a Lustre file system without flock support, we can > printout an error message. Completely disabling MPI I/O is on the > ompio architecture not possible at the moment, since the Lustre > component can disqualify itself, but the generic Unix FS component

Re: [OMPI users] ompio on Lustre

2018-10-16 Thread Dave Love
"Latham, Robert J." writes: > it's hard to implement fcntl-lock-free versions of Atomic mode and > Shared file pointer so file systems like PVFS don't support those modes > (and return an error indicating such at open time). Ah. For some reason I thought PVFS had the support to pass the tests

Re: [OMPI users] ompio on Lustre

2018-10-15 Thread Dave Love
For what it's worth, I found the following from running ROMIO's tests with OMPIO on Lustre mounted without flock (or localflock). I used 48 processes on two nodes with Lustre for tests which don't require a specific number. OMPIO fails tests atomicity, misc, and error on ext4; it additionally
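
For anyone reproducing this, flock support is a Lustre client mount option; something like the following (server NID and mount point obviously illustrative) enables cluster-wide flock, while -o localflock gives only node-local locking:

  mount -t lustre -o flock mgs@tcp0:/lustre /mnt/lustre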

Re: [OMPI users] no openmpi over IB on new CentOS 7 system

2018-10-10 Thread Dave Love
RDMA was just broken in the last-but-one(?) RHEL7 kernel release, in case that's the problem. (Fixed in 3.10.0-862.14.4.)

Re: [OMPI users] ompio on Lustre

2018-10-10 Thread Dave Love
"Gabriel, Edgar" writes: > Ok, thanks. I usually run these test with 4 or 8, but the major item > is that atomicity is one of the areas that are not well supported in > ompio (along with data representations), so a failure in those tests > is not entirely surprising . If it's not expected to

Re: [OMPI users] ompio on Lustre

2018-10-09 Thread Dave Love
"Gabriel, Edgar" writes: > Hm, thanks for the report, I will look into this. I did not run the > romio tests, but the hdf5 tests are run regularly and with 3.1.2 you > should not have any problems on a regular unix fs. How many processes > did you use, and which tests did you run specifically?

Re: [OMPI users] ompio on Lustre

2018-10-08 Thread Dave Love
I said I'd report back about trying ompio on lustre mounted without flock. I couldn't immediately figure out how to run MTT. I tried the parallel hdf5 tests from the hdf5 1.10.3, but I got errors with that even with the relevant environment variable to put the files on (local) /tmp. Then it

Re: [OMPI users] ompio on Lustre

2018-10-05 Thread Dave Love
"Gabriel, Edgar" writes: > It was originally for performance reasons, but this should be fixed at > this point. I am not aware of correctness problems. > > However, let me try to clarify your question about: What do you > precisely mean by "MPI I/O on Lustre mounts without flock"? Was the >

[OMPI users] ompio on Lustre

2018-10-05 Thread Dave Love
Is romio preferred over ompio on Lustre for performance or correctness? If it's relevant, the context is MPI-IO on Lustre mounts without flock, which ompio doesn't seem to require. Thanks.
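
For completeness, switching between the two at run time is just an MCA parameter, along these lines (the ROMIO component name varies with the series, e.g. romio314 in 3.x and romio321 in 4.x; ./my_io_app stands in for the test program):

  mpirun --mca io romio321 -np 4 ./my_io_app
  mpirun --mca io ompio -np 4 ./my_io_app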

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-29 Thread Dave Love
Kaiming Ouyang writes: > Hi Jeff, > Thank you for your advice. I will contact the author for some suggestions. > I also notice I may port this old version library to new openmpi 3.0. I > will work on this soon. Thank you. I haven't used them, but at least the profiling part,

Re: [OMPI users] openib/mpi_alloc_mem pathology [#20160912-1315]

2017-10-20 Thread Dave Love
Paul Kapinos writes: > Hi all, > sorry for the long long latency - this message was buried in my mailbox for > months > > > > On 03/16/2017 10:35 AM, Alfio Lazzaro wrote: >> Hello Dave and others, >> we jump in the discussion as CP2K developers. >> We would like

Re: [OMPI users] Question concerning compatibility of languages used with building OpenMPI and languages OpenMPI uses to build MPI binaries.

2017-09-20 Thread Dave Love
Jeff Hammond writes: > Intel compilers support GOMP runtime interoperability, although I don't > believe it is the default. You can use the Intel/LLVM OpenMP runtime with > GCC such that all three OpenMP compilers work together. For what it's worth, it's trivial to make

Re: [OMPI users] Question concerning compatibility of languages used with building OpenMPI and languages OpenMPI uses to build MPI binaries.

2017-09-20 Thread Dave Love
Jeff Hammond writes: > Please separate C and C++ here. C has a standard ABI. C++ doesn't. > > Jeff [For some value of "standard".] I've said the same about C++, but the current GCC manual says its C++ ABI is "industry standard", and at least Intel document

Re: [OMPI users] built-in memchecker support

2017-08-24 Thread Dave Love
Gilles Gouaillardet writes: > Dave, > > the builtin memchecker can detect MPI usage errors such as modifying > the buffer passed to MPI_Isend() before the request completes OK, thanks. The implementation looks rather different, and it's not clear without checking

[OMPI users] built-in memchecker support

2017-08-24 Thread Dave Love
Apropos configuration parameters for packaging: Is there a significant benefit to configuring built-in memchecker support, rather than using the valgrind preload library? I doubt being able to use another PMPI tool directly at the same time counts. Also, are there measurements of the
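
For reference, the preload route referred to is roughly the valgrind manual's recipe, i.e. exporting valgrind's MPI wrapper library to the ranks and running them under memcheck (path and platform suffix illustrative):

  mpirun -np 2 -x LD_PRELOAD=/usr/lib64/valgrind/libmpiwrap-amd64-linux.so valgrind ./a.out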

Re: [OMPI users] Questions about integration with resource distribution systems

2017-08-02 Thread Dave Love
Reuti writes: >> I should qualify that by noting that ENABLE_ADDGRP_KILL has apparently >> never propagated through remote startup, > > Isn't it a setting inside SGE which the sge_execd is aware of? I never > exported any environment variable for this purpose. Yes,

Re: [OMPI users] --enable-builtin-atomics

2017-08-02 Thread Dave Love
"Barrett, Brian via users" writes: > Well, if you’re trying to get Open MPI running on a platform for which > we don’t have atomics support, built-in atomics solves a problem for > you… That's not an issue in this case, I think. (I'd expect it to default to intrinsic

Re: [OMPI users] --enable-builtin-atomics

2017-08-02 Thread Dave Love
Nathan Hjelm writes: > So far only cons. The gcc and sync builtin atomic provide slower > performance on x86-64 (and possible other platforms). I plan to > investigate this as part of the investigation into requiring C11 > atomics from the C compiler. Thanks. Is that a gcc

Re: [OMPI users] Questions about integration with resource distribution systems

2017-08-01 Thread Dave Love
Gilles Gouaillardet writes: > Dave, > > > unless you are doing direct launch (for example, use 'srun' instead of > 'mpirun' under SLURM), > > this is the way Open MPI is working : mpirun will use whatever the > resource manager provides > > in order to spawn the remote orted

[OMPI users] --enable-builtin-atomics

2017-08-01 Thread Dave Love
What are the pros and cons of configuring with --enable-builtin-atomics? I haven't spotted any discussion of the option.

[OMPI users] absolute paths printed by info programs

2017-08-01 Thread Dave Love
ompi_info et al print absolute compiler paths for some reason. What would they ever be used for, and are they intended to refer to the OMPI build or application building? They're an issue for packaging in Guix, at least. Similarly, what's io_romio_complete_configure_params intended to be used

Re: [OMPI users] NUMA interaction with Open MPI

2017-07-27 Thread Dave Love
Gilles Gouaillardet writes: > Adam, > > keep in mind that by default, recent Open MPI bind MPI tasks > - to cores if -np 2 > - to NUMA domain otherwise Not according to ompi_info from the latest release; it says socket. > (which is a socket in most cases, unless

Re: [OMPI users] Questions about integration with resource distribution systems

2017-07-27 Thread Dave Love
"r...@open-mpi.org" writes: > Oh no, that's not right. Mpirun launches daemons using qrsh and those > daemons spawn the app's procs. SGE has no visibility of the app at all Oh no, that's not right. The whole point of tight integration with remote startup using qrsh is to

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-21 Thread Dave Love
I wrote: > But it works OK with libfabric (ofi mtl). Is there a problem with > libfabric? Apparently there is, or at least with ompi 1.10. I've now realized IMB pingpong latency on a QDR IB system with ompi 1.10.6+libfabric is ~2.5μs, which it isn't with ompi 1.6 openib.

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-15 Thread Dave Love
Paul Kapinos writes: > Nathan, > unfortunately '--mca memory_linux_disable 1' does not help on this > issue - it does not change the behaviour at all. > Note that the pathological behaviour is present in Open MPI 2.0.2 as > well as in /1.10.x, and Intel OmniPath

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-09 Thread Dave Love
Nathan Hjelm writes: > If this is with 1.10.x or older run with --mca memory_linux_disable > 1. There is a bad interaction between ptmalloc2 and psm2 support. This > problem is not present in v2.0.x and newer. Is that applicable to openib too?

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-09 Thread Dave Love
Paul Kapinos <kapi...@itc.rwth-aachen.de> writes: > Hi Dave, > > > On 03/06/17 18:09, Dave Love wrote: >> I've been looking at a new version of an application (cp2k, for what >> it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't > > Welcom

[OMPI users] openib/mpi_alloc_mem pathology

2017-03-06 Thread Dave Love
I've been looking at a new version of an application (cp2k, for what it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't think it did so in the previous version I looked at. I found on an IB-based system it's spending about half its time in those allocation routines (according

Re: [OMPI users] How to yield CPU more when not computing (was curious behavior during wait for broadcast: 100% cpu)

2016-12-12 Thread Dave Love
Andreas Schäfer writes: >> Yes, as root, and there are N different systems to at least provide >> unprivileged read access on HPC systems, but that's a bit different, I >> think. > > LIKWID[1] uses a daemon to provide limited RW access to MSRs for > applications. I wouldn't

[OMPI users] MPI+OpenMP core binding redux

2016-12-08 Thread Dave Love
I think there was a suggestion that the SC16 material would explain how to get appropriate core binding for MPI+OpenMP (i.e. OMP_NUM_THREADS cores/process), but it doesn't as far as I can see. Could someone please say how you're supposed to do that in recent versions (without relying on bound DRM
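
For the archive, the sort of incantation in question looks roughly like this with recent option spellings (four threads per rank here; ./hybrid_app stands in for the application):

  export OMP_NUM_THREADS=4
  mpirun --map-by slot:pe=$OMP_NUM_THREADS --bind-to core -np 8 ./hybrid_app

i.e. map by slot but reserve OMP_NUM_THREADS processing elements per rank, and bind each rank to that many cores.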

Re: [OMPI users] How to yield CPU more when not computing (was curious behavior during wait for broadcast: 100% cpu)

2016-12-08 Thread Dave Love
Jeff Hammond writes: >> >> >> > Note that MPI implementations may be interested in taking advantage of >> > https://software.intel.com/en-us/blogs/2016/10/06/intel- >> xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait. >> >> Is that really useful if it's

Re: [OMPI users] An old code compatibility

2016-11-15 Thread Dave Love
Mahmood Naderan writes: > Hi, > The following mpifort command fails with a syntax error. It seems that the > code is compatible with old gfortran, but I am not aware of that. Any idea > about that? > > mpifort -ffree-form -ffree-line-length-0 -ff2c -fno-second-underscore >

Re: [OMPI users] How to yield CPU more when not computing (was curious behavior during wait for broadcast: 100% cpu)

2016-11-09 Thread Dave Love
Jeff Hammond writes: >> I see sleeping for ‘0s’ typically taking ≳50μs on Linux (measured on >> RHEL 6 or 7, without specific tuning, on recent Intel). It doesn't look >> like something you want in paths that should be low latency, but maybe >> there's something you can

Re: [OMPI users] mpi4py+OpenMPI: Qs about submitting bugs and examples

2016-11-07 Thread Dave Love
"r...@open-mpi.org" writes: >> Is this mailing list a good spot to submit bugs for OpenMPI? Or do I >> use github? > > You can use either - I would encourage the use of github “issues” when > you have a specific bug, and the mailing list for general questions I was told not

Re: [OMPI users] Redusing libmpi.so size....

2016-11-07 Thread Dave Love
Mahesh Nanavalla writes: > Hi all, > > I am using openmpi-1.10.3. > > openmpi-1.10.3 compiled for arm(cross compiled on X86_64 for openWRT > linux) libmpi.so.12.0.3 size is 2.4MB,but if i compiled on X86_64 (linux) > libmpi.so.12.0.3 size is 990.2KB. > > can

Re: [OMPI users] what was the rationale behind rank mapping by socket?

2016-11-07 Thread Dave Love
"r...@open-mpi.org" writes: > Yes, I’ve been hearing a growing number of complaints about cgroups for that > reason. Our mapping/ranking/binding options will work with the cgroup > envelope, but it generally winds up with a result that isn’t what the user > wanted or

Re: [OMPI users] How to yield CPU more when not computing (was curious behavior during wait for broadcast: 100% cpu)

2016-11-07 Thread Dave Love
[Some time ago] Jeff Hammond writes: > If you want to keep long-waiting MPI processes from clogging your CPU > pipeline and heating up your machines, you can turn blocking MPI > collectives into nicer ones by implementing them in terms of MPI-3 > nonblocking collectives

Re: [OMPI users] Using Open MPI with multiple versions of GCC and G++

2016-10-11 Thread Dave Love
"Jeff Squyres (jsquyres)" writes: > Especially with C++, the Open MPI team strongly recommends you > building Open MPI with the target versions of the compilers that you > want to use. Unexpected things can happen when you start mixing > versions of compilers (particularly

Re: [OMPI users] Launching hybrid MPI/OpenMP jobs on a cluster: correct OpenMPI flags?

2016-10-11 Thread Dave Love
Wirawan Purwanto writes: > Instead of the scenario above, I was trying to get the MPI processes > side-by-side (more like "fill_up" policy in SGE scheduler), i.e. fill > node 0 first, then fill node 1, and so on. How do I do this properly? > > I tried a few attempts that
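
For what it's worth, a minimal sketch of the fill-up behaviour being asked about, with current option names (./a.out illustrative):

  mpirun --map-by core -np 32 ./a.out

which places ranks on consecutive cores of the first node until it is full before moving on to the next.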

Re: [OMPI users] what was the rationale behind rank mapping by socket?

2016-10-11 Thread Dave Love
Gilles Gouaillardet writes: > Bennet, > > > my guess is mapping/binding to sockets was deemed the best compromise > from an > > "out of the box" performance point of view. > > > iirc, we did fix some bugs that occured when running under asymmetric > cpusets/cgroups. > > if you

Re: [hwloc-users] memory binding on Knights Landing

2016-10-11 Thread Dave Love
Brice Goglin writes: > I ran more benchmarks. What's really slow is the reading of all sysfs > files. About 90% of the topology building time is spent there on KNL. > We're reading more than 7000 files (most of them are 6 files for each > hardware thread and 6 files for

Re: [OMPI users] Compilation without NVML support

2016-09-20 Thread Dave Love
Brice Goglin writes: > Hello > Assuming this NVML detection is actually done by hwloc, I guess there's > nothing in OMPI to disable it. It's not the first time we get such an > issue with OMPI not having all hwloc's --disable-foo options, but I > don't think we actually

Re: [OMPI users] MPI libraries

2016-09-13 Thread Dave Love
I wrote: > Gilles Gouaillardet writes: > >> Mahmood, >> >> mpi_siesta is a siesta library, not an Open MPI library. >> >> fwiw, you might want to try again from scratch with >> MPI_INTERFACE=libmpi_f90.a >> DEFS_MPI=-DMPI >> in your arch.make >> >> i do not think

Re: [OMPI users] MPI libraries

2016-09-12 Thread Dave Love
Gilles Gouaillardet writes: > Mahmood, > > mpi_siesta is a siesta library, not an Open MPI library. > > fwiw, you might want to try again from scratch with > MPI_INTERFACE=libmpi_f90.a > DEFS_MPI=-DMPI > in your arch.make > > i do not think libmpi_f90.a is related

Re: [hwloc-users] memory binding on Knights Landing

2016-09-12 Thread Dave Love
Brice Goglin writes: > So what's really slow is reading sysfs and/or inserting all hwloc > objects in the tree. I need to do some profiling. And I am moving the > item "parallelize the discovery" higher in the TODO list :) It didn't seem to scale between systems with the

Re: [hwloc-users] memory binding on Knights Landing

2016-09-09 Thread Dave Love
Brice Goglin writes: > Is there anything to fix on the RPM side? Nothing significant, I think. The updated Fedora version needed slight adjustment for hwloc-dump-hwdata, at least. > Intel people are carefully > working with RedHat so that hwloc is properly packaged for

Re: [hwloc-users] memory binding on Knights Landing

2016-09-09 Thread Dave Love
Jeff Hammond writes: >> By the way, is it expected that binding will be slow on it? hwloc-bind >> is ~10 times slower (~1s) than on two-socket sandybridge, and ~3 times >> slower than on a 128-core, 16-socket system. >> >> Is this a bottleneck in any application? Are

[OMPI users] mpi4py/fc20 (was: users Digest, Vol 3592, Issue 1)

2016-09-01 Thread Dave Love
"Mahdi, Sam" writes: > To dave, from the installation guide I found, it seemed I couldnt just > directly download it from the package list, but rather Id need to use the > mpicc wrapper to compile and install. That makes no sense to a maintainer of some openmpi Fedora

Re: [OMPI users] Certain files for mpi missing when building mpi4py

2016-08-31 Thread Dave Love
"Mahdi, Sam" writes: > HI everyone, > > I am using a linux fedora. I downloaded/installed > openmpi-1.7.3-1.fc20(64-bit) and openmpi-devel-1.7.3-1.fc20(64-bit). As > well as pypar-openmpi-2.1.5_108-3.fc20(64-bit) and > python3-mpi4py-openmpi-1.3.1-1.fc20(64-bit). The

Re: [OMPI users] An equivalent to btl_openib_include_if when MXM over Infiniband ?

2016-08-18 Thread Dave Love
"Audet, Martin" writes: > Hi Josh, > > Thanks for your reply. I did try setting MXM_RDMA_PORTS=mlx4_0:1 for all my > MPI processes > and it did improve performance but the performance I obtain isn't completely > satisfying. I raised the issue of MXM hurting p2p

Re: [OMPI users] SGE integration broken in 2.0.0

2016-08-18 Thread Dave Love
"Jeff Squyres (jsquyres)" writes: > On Aug 16, 2016, at 3:07 PM, Reuti wrote: >> >> Thx a bunch - that was it. Despite searching for a solution I found >> only hints that didn't solve the issue. > > FWIW, we talk about this in the HACKING file,

Re: [OMPI users] Docker Cluster Queue Manager

2016-06-22 Thread Dave Love
Rob Nagler writes: > Thanks, John. I sometimes wonder if I'm the only one out there with this > particular problem. > > Ralph, thanks for sticking with me. :) Using a pool of uids doesn't really > work due to the way cgroups/containers works. It also would require >

Re: [OMPI users] Big jump from OFED 1.5.4.1 -> recent (stable). Any suggestions?

2016-06-22 Thread Dave Love
"Llolsten Kaonga" writes: > Hello Grigory, > > I am not sure what Redhat does exactly but when you install the OS, there is > always an InfiniBand Support module during the installation process. We > never check/install that module when we do OS installations because it is >

[OMPI users] 2.0 documentation

2016-06-22 Thread Dave Love
I know it's not traditional, but is there any chance of complete documentation of the important changes in v2.0? Currently NEWS mentions things like minor build issues, but there's nothing, for instance, on the addition and removal of whole frameworks, one of which I've been trying to understand.

Re: [OMPI users] users Digest, Vol 3510, Issue 2

2016-05-25 Thread Dave Love
I wrote: > You could wrap one (set of) program(s) in a script to set the > appropriate environment before invoking the real program. I realize I should have said something like "program invocations", i.e. if you have no control over something invoking mpirun for programs using different MPIs,

Re: [OMPI users] users Digest, Vol 3510, Issue 2

2016-05-24 Thread Dave Love
Megdich Islem writes: > Yes, Empire does the fluid structure coupling. It couples OpenFoam (fluid > analysis) and Abaqus (structural analysis). > Does all the software need to have the same MPI architecture in order to > communicate ? I doubt it's doing that, and

Re: [OMPI users] wtime implementation in 1.10

2016-05-24 Thread Dave Love
ition to review backports or even put things in a bug tracker. 1.10 isn't used here, and I just subvert gettimeofday whenever I'm running something that might use it for timing short intervals. > I’ll create the PR and copy you for review > > >> On May 23, 2016, at 9:17 AM, Dave Love

[OMPI users] wtime implementation in 1.10

2016-05-23 Thread Dave Love
I thought the 1.10 branch had been fixed to use clock_gettime for MPI_Wtime where it's available, a la https://www.open-mpi.org/community/lists/users/2016/04/28899.php -- and have been telling people so! However, I realize it hasn't, and it looks as if 1.10 is still being maintained. Is there a

Re: [OMPI users] OpenMPI 1.6.5 on CentOS 7.1, silence ib-locked-pages?

2016-05-20 Thread Dave Love
Ryan Novosielski writes: > I’m pretty sure this is no longer relevant (having read Roland’s > messages about it from a couple of years ago now). Can you please > confirm that for me, and then let me know if there is any way that I > can silence this old copy of OpenMPI that

Re: [OMPI users] Building vs packaging

2016-05-20 Thread Dave Love
dani writes: > I don't know about .deb packages, but at least in the rpms there is a > post install scriptlet that re-runs ldconfig to ensure the new libs > are in the ldconfig cache. MPI packages following the Fedora guidelines don't do that (and rpmlint complains bitterly

Re: [OMPI users] Question about mpirun mca_oob_tcp_recv_handler error.

2016-05-16 Thread Dave Love
Ralph Castain writes: > This usually indicates that the remote process is using a different OMPI > version. You might check to ensure that the paths on the remote nodes are > correct. That seems quite a common problem with non-obvious failure modes. Is it not possible to

Re: [OMPI users] No core dump in some cases

2016-05-16 Thread Dave Love
Gilles Gouaillardet writes: > Are you sure ulimit -c unlimited is *really* applied on all hosts > > > can you please run the simple program below and confirm that ? Nothing specifically wrong with that, but it's worth installing procenv(1) as a general solution to checking
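
A quick way of doing that kind of check across an allocation, for the record (procenv reports much more, but core limits alone only need a one-liner):

  mpirun --map-by node -np 4 sh -c 'echo "$(hostname): $(ulimit -c)"'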

Re: [OMPI users] Building vs packaging

2016-05-16 Thread Dave Love
"Rob Malpass" writes: > Almost in desperation, I cheated: Why is that cheating? Unless you specifically want a different version, it seems sensible to me, especially as you then have access to packaged versions of at least some MPI programs. Likewise with rpm-based

Re: [OMPI users] barrier algorithm 5

2016-05-06 Thread Dave Love
Gilles Gouaillardet writes: > Dave, > > > i made PR #1644 to abort with a user friendly error message > > https://github.com/open-mpi/ompi/pull/1644 Thanks. Could there be similar cases that might be worth a change?

[OMPI users] SLOAVx alltoallv

2016-05-06 Thread Dave Love
At the risk of banging on too much about collectives: I came across a writeup of the "SLOAVx" algorithm for alltoallv. It was implemented in OMPI with apparently good results, but I can't find any code. I wonder if anyone knows the story on

Re: [OMPI users] barrier algorithm 5

2016-05-04 Thread Dave Love
Gilles Gouaillardet writes: > Dave, > > yes, this is for two MPI tasks only. > > the MPI subroutine could/should return with an error if the communicator is > made of more than 3 tasks. > an other option would be to abort at initialization time if no collective >

[OMPI users] barrier algorithm 5

2016-05-04 Thread Dave Love
With OMPI 1.10.2 and earlier on Infiniband, IMB generally spins with no output for the barrier benchmark if you run it with algorithm 5, i.e.
  mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_barrier_algorithm 5 IMB-MPI1 barrier
This is "two proc only". Does that mean it will only

Re: [OMPI users] Ubuntu and LD_LIBRARY_PATH

2016-05-03 Thread Dave Love
John Hearns writes: > May I ask though - what is the purpose of your cluster? > If you are using Ubunutu, have you looked at Qlustar? > https://www.qlustar.com/ > Might save you a whole lot of heartache! Well, proprietary cluster management systems have only given me

[OMPI users] collective tuning (was: MPI_Bcast implementations in OpenMPI)

2016-05-03 Thread Dave Love
George Bosilca <bosi...@icl.utk.edu> writes: >> On Apr 25, 2016, at 11:33 , Dave Love <d.l...@liverpool.ac.uk> wrote: >> >> George Bosilca <bosi...@icl.utk.edu> writes: >>> I have recently reshuffled the tuned module to move all the algorithms >

Re: [OMPI users] Ubuntu and LD_LIBRARY_PATH

2016-04-26 Thread Dave Love
"Rob Malpass" writes: > Hi > > > > Sorry if this isn't 100% relevant to this list but I'm at my wits end. > > > > After a lot of hacking, I've finally configured openmpi on my Ubuntu > cluster. I had been having awful problems with not being able to find the >

Re: [OMPI users] Porting MPI-3 C-program to Fortran

2016-04-25 Thread Dave Love
Tom Rosmond writes: > Thanks for replying, but the difference between what can be done in C > vs fortran is still my problem. I apologize for my rudimentary > understanding of C, but here is a brief summary: I'm not an expert on this stuff, just cautioning about Fortran

Re: [OMPI users] Porting MPI-3 C-program to Fortran

2016-04-22 Thread Dave Love
Jeff Hammond writes: > MPI uses void** arguments to pass pointer by reference so it can be > updated. In Fortran, you always pass by reference so you don't need > this. I don't know if it's relevant in this case, but that's not generally true (even for Fortran 77, for

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-22 Thread Dave Love
George Bosilca writes: > Matthieu, > > If you are talking about how Open MPI selects between different broadcast > algorithms you might want to read [1]. We have implemented a dozen > different broadcast algorithms and have run a set of tests to measure their > performance.

Re: [OMPI users] resolution of MPI_Wtime

2016-04-11 Thread Dave Love
George Bosilca writes: > MPI_Wtick is not about the precision but about the resolution of the > underlying timer (aka. the best you can hope to get). What's the distinction here? (clock_getres(2) says "resolution (precision)".) My point (like JH's?) is that it doesn't
