Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-12-02 Thread Mark Dixon via users
Hi Mark, Thanks so much for this - yes, applying that pull request against ompi 4.0.5 allows hdf5 1.10.7's parallel tests to pass on our Lustre filesystem. I'll certainly be applying it on our local clusters! Best wishes, Mark On Tue, 1 Dec 2020, Mark Allen via users wrote: At least for

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-30 Thread Mark Dixon via users
On Fri, 27 Nov 2020, Dave Love wrote: ... It's less dramatic in the case I ran, but there's clearly something badly wrong which needs profiling. It's probably useful to know how many ranks that's with, and whether it's the default striping. (I assume with default ompio fs parameters.) Hi
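For anyone reproducing this, the striping in play can be checked and changed with the standard Lustre client tools; a minimal sketch (the directory path and stripe count below are illustrative, not values from the thread):

    # show the default layout the test directory hands out to new files
    lfs getstripe -d /lustre/scratch/iotest

    # widen striping for files subsequently created under it (e.g. 8 OSTs)
    lfs setstripe -c 8 /lustre/scratch/iotest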

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-26 Thread Mark Dixon via users

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-26 Thread Mark Dixon via users
On Wed, 25 Nov 2020, Dave Love via users wrote: The perf test says romio performs a bit better. Also -- from overall time -- it's faster on IMB-IO (which I haven't looked at in detail, and ran with suboptimal striping). I take that back. I can't reproduce a significant difference for total

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-17 Thread Mark Dixon via users
Hi Edgar, Pity, that would have been nice! But thanks for looking. Checking through the ompi github issues, I now realise I logged exactly the same issue over a year ago (completely forgot - I've moved jobs since then), including a script to reproduce the issue on a Lustre system.

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-16 Thread Mark Dixon via users
fix in the Open MPI to ROMIO integration layer sometime in the 4.0 series that fixed a datatype problem, which caused some problems in the HDF5 tests. You might be hitting that problem. Thanks Edgar

[OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-16 Thread Mark Dixon via users
Hi all, I'm confused about how openmpi supports mpi-io on Lustre these days, and am hoping that someone can help. Back in the openmpi 2.0.0 release notes, it said that OMPIO is the default MPI-IO implementation on everything apart from Lustre, where ROMIO is used. Those release notes are
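For comparing the two stacks on a 4.x build where both are present, the io component can be forced per run; the ROMIO component name varies by release, so the sketch below (with a placeholder benchmark binary) checks first:

    # list which MPI-IO components this install actually has
    ompi_info | grep "MCA io"

    # force ROMIO (named romio321 in the 4.0.x series, romio314 in older ones)
    mpirun --mca io romio321 -np 4 ./io_benchmark

    # force OMPIO
    mpirun --mca io ompio -np 4 ./io_benchmark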

[OMPI users] OMPI 4.0.1 + PHDF5 1.8.21 tests fail on Lustre

2019-08-05 Thread Mark Dixon via users
Hi, I’ve built parallel HDF5 1.8.21 against OpenMPI 4.0.1 on CentOS 7 and a Lustre 2.12 filesystem using the OS-provided GCC 4.8.5 and am trying to run the testsuite. I’m failing the testphdf5 test: could anyone help, please? I’ve successfully used the same method to pass tests when building
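For reference, a sketch of re-running just the parallel tests by hand, assuming the usual HDF5 autotools build tree (the rank count is illustrative):

    # from the hdf5 build directory; check-p drives the parallel tests
    make -C testpar check-p

    # or invoke the failing test directly
    cd testpar && mpirun -np 6 ./testphdf5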

Re: [OMPI users] Failed to register memory (openmpi 2.0.2)

2017-11-13 Thread Mark Dixon
endpoint with port: 1, lid: 69, msg_type: 100 On Thu, 19 Oct 2017, Mark Dixon wrote: Thanks Ralph, will do.

Re: [OMPI users] Failed to register memory (openmpi 2.0.2)

2017-10-19 Thread Mark Dixon
Thanks Ralph, will do. Cheers, Mark On Wed, 18 Oct 2017, r...@open-mpi.org wrote: Put “oob=tcp” in your default MCA param file
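The default MCA param file mentioned here is plain "key = value" text; a minimal sketch (the location depends on your install prefix):

    # system-wide: <prefix>/etc/openmpi-mca-params.conf
    # per-user:    ~/.openmpi/mca-params.conf
    oob = tcp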

[OMPI users] Failed to register memory (openmpi 2.0.2)

2017-10-18 Thread Mark Dixon
Hi, We're intermittently seeing messages (below) about failing to register memory with openmpi 2.0.2 on centos7 / Mellanox FDR Connect-X 3 and the vanilla IB stack as shipped by centos. We're not using any mlx4_core module tweaks at the moment. On earlier machines we used to set registered
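On mlx4 hardware the amount of registerable memory is bounded by the log_num_mtt and log_mtts_per_seg module parameters, which is the sort of tweak alluded to above; a hedged sketch, with purely illustrative values that should be sized to roughly twice the node's RAM:

    # /etc/modprobe.d/mlx4_core.conf  (illustrative values only)
    options mlx4_core log_num_mtt=24 log_mtts_per_seg=3
    # registerable memory ~= 2^log_num_mtt * 2^log_mtts_per_seg * page_size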

Re: [OMPI users] Is building with "--enable-mpi-thread-multiple" recommended?

2017-03-03 Thread Mark Dixon
On Fri, 3 Mar 2017, Paul Kapinos wrote: ... Note that on the 1.10.x series (even 1.10.6), enabling MPI_THREAD_MULTIPLE led to a (silent) shutdown of the InfiniBand fabric for that application => SLOW! 2.x versions (tested: 2.0.1) handle MPI_THREAD_MULTIPLE on InfiniBand correctly,

Re: [OMPI users] More confusion about --map-by!

2017-02-23 Thread Mark Dixon

[OMPI users] More confusion about --map-by!

2017-02-23 Thread Mark Dixon
Hi, I'm still trying to figure out how to express the core binding I want to openmpi 2.x via the --map-by option. Can anyone help, please? I bet I'm being dumb, but it's proving tricky to achieve the following aims (most important first): 1) Maximise memory bandwidth usage (e.g. load

Re: [OMPI users] Is building with "--enable-mpi-thread-multiple" recommended?

2017-02-18 Thread Mark Dixon
On Fri, 17 Feb 2017, r...@open-mpi.org wrote: Depends on the version, but if you are using something in the v2.x range, you should be okay with just one installed version Thanks Ralph. How good is MPI_THREAD_MULTIPLE support these days and how far up the wishlist is it, please? We don't

Re: [OMPI users] "-map-by socket:PE=1" doesn't do what I expect

2017-02-17 Thread Mark Dixon
On Fri, 17 Feb 2017, r...@open-mpi.org wrote: Mark - this is now available in master. Will look at what might be required to bring it to 2.0 Thanks Ralph, To be honest, since you've given me an alternative, there's no rush from my point of view. The logic's embedded in a script and it's

[OMPI users] Is building with "--enable-mpi-thread-multiple" recommended?

2017-02-17 Thread Mark Dixon
Hi, We have some users who would like to try out openmpi MPI_THREAD_MULTIPLE support on our InfiniBand cluster. I am wondering if we should enable it on our production cluster-wide version, or install it as a separate "here be dragons" copy. I seem to recall openmpi folk cautioning that
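If a separate "here be dragons" build is the route taken, a minimal sketch of configuring it and checking what the resulting library advertises (the prefix is illustrative):

    ./configure --prefix=/opt/openmpi-mt --enable-mpi-thread-multiple ...
    make -j && make install

    # confirm the reported thread support level
    /opt/openmpi-mt/bin/ompi_info | grep -i thread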

Re: [OMPI users] "-map-by socket:PE=1" doesn't do what I expect

2017-02-15 Thread Mark Dixon
On Wed, 15 Feb 2017, r...@open-mpi.org wrote: Ah, yes - I know what the problem is. We weren’t expecting a PE value of 1 - the logic is looking expressly for values > 1 as we hadn’t anticipated this use-case. Is it a sensible use-case, or am I crazy? I can make that change. I’m off to a

[OMPI users] "-map-by socket:PE=1" doesn't do what I expect

2017-02-15 Thread Mark Dixon
Hi, When combining OpenMPI 2.0.2 with OpenMP, I'm interested in launching a number of ranks and allocating a number of cores to each rank. Using "-map-by socket:PE=", switching to "-map-by node:PE=" if I want to allocate more than a single socket to a rank, seems to do what I want. Except
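For concreteness, a sketch of the two patterns being described, with illustrative rank and core counts; --report-bindings is useful for checking what actually happened:

    # each rank gets 6 cores, ranks placed round-robin across sockets
    mpirun -np 4 --map-by socket:PE=6 --bind-to core --report-bindings ./a.out

    # each rank gets 12 cores, spanning both sockets of a node
    mpirun -np 2 --map-by node:PE=12 --bind-to core --report-bindings ./a.out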

Re: [OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-06 Thread Mark Dixon
On Mon, 6 Feb 2017, Mark Dixon wrote: ... Ah-ha! "-mca plm_rsh_agent foo" fixes it! Thanks very much - presumably I can stick that in the system-wide openmpi-mca-params.conf for now. ... Except if I do that, it means running ompi outside of the SGE environment no longer works :(
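One way round the system-wide-file problem is to set the parameter through the environment only inside SGE job scripts, leaving interactive and non-SGE runs untouched; a sketch, with the agent value left as a placeholder as in the thread:

    # in the SGE job script only
    export OMPI_MCA_plm_rsh_agent=<agent>
    mpirun -np $NSLOTS ./a.out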

Re: [OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-06 Thread Mark Dixon
On Fri, 3 Feb 2017, r...@open-mpi.org wrote: I do see a diff between 2.0.1 and 2.0.2 that might have a related impact. The way we handled the MCA param that specifies the launch agent (ssh, rsh, or whatever) was modified, and I don’t think the change is correct. It basically says that we

Re: [OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-03 Thread Mark Dixon
On Fri, 3 Feb 2017, Reuti wrote: ... SGE on its own is not configured to use SSH? (I mean the entries in `qconf -sconf` for rsh_command resp. daemon). ... Nope, everything left as the default: $ qconf -sconf | grep _command qlogin_command builtin rlogin_command

[OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-03 Thread Mark Dixon
Hi, Just tried upgrading from 2.0.1 to 2.0.2 and I'm getting error messages that look like openmpi is using ssh to login to remote nodes instead of qrsh (see below). Has anyone else noticed gridengine integration being broken, or am I being dumb? I built with "./configure
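A quick sanity check is whether the build picked up SGE support at all; if nothing is listed, the configure line is the first suspect (component names can vary slightly by version):

    # should report gridengine components, e.g. "MCA ras: gridengine ..."
    ompi_info | grep -i gridengine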

Re: [OMPI users] Startup limited to 128 remote hosts in some situations?

2017-01-24 Thread Mark Dixon
qrsh from one node or another. That of course assumes that qrsh is in the same location on all nodes. I've tested that it is possible to qrsh from the head node of a job to a slave node and then on to another slave node by this method. William On Jan 17, 2017, at 9:37 AM, Mark Dixon <m

[OMPI users] Startup limited to 128 remote hosts in some situations?

2017-01-17 Thread Mark Dixon
Hi, While commissioning a new cluster, I wanted to run HPL across the whole thing using openmpi 2.0.1. I couldn't get it to start on more than 129 hosts under Son of Gridengine (128 remote plus the localhost running the mpirun command). openmpi would sit there, waiting for all the orted's
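The 128 figure matches the rsh/qrsh launcher's default cap on concurrent daemon launches, so these are the obvious knobs to experiment with (a sketch; whether they help depends on whether qrsh can hop from one compute node to another):

    # raise the number of orteds launched in parallel (the default has been 128)
    mpirun --mca plm_rsh_num_concurrent 256 ...

    # or disable the tree-based launch so mpirun contacts every node directly
    mpirun --mca plm_rsh_no_tree_spawn 1 ...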

[OMPI users] change in behaviour 1.6 -> 1.8 under sge

2014-11-03 Thread Mark Dixon
Hi there, We've started looking at moving to the openmpi 1.8 branch from 1.6 on our CentOS6/Son of Grid Engine cluster and noticed an unexpected difference when binding multiple cores to each rank. Has openmpi's definition of 'slot' changed between 1.6 and 1.8? It used to mean ranks, but now

Re: [OMPI users] knem/openmpi performance?

2013-07-29 Thread Mark Dixon

Re: [OMPI users] knem/openmpi performance?

2013-07-12 Thread Mark Dixon

[OMPI users] knem/openmpi performance?

2013-07-12 Thread Mark Dixon

Re: [OMPI users] configure: mpi-threads disabled by default

2011-05-05 Thread Mark Dixon
trunk, but seem to require you to at least ask for "--enable-opal-multi-threads". Are we supposed to be able to use MPI_THREAD_FUNNELED by default or not? Best wishes, Mark

[OMPI users] configure: mpi-threads disabled by default

2011-05-04 Thread Mark Dixon
by default, please? I'm a bit puzzled, as this default seems in conflict with the whole "Law of Least Astonishment" thing. Have I missed some disaster that's going to happen? Thanks, Mark

[OMPI users] MPI-IO Lustre driver update?

2010-11-29 Thread Mark Dixon
it into a release, please? Thanks, Mark

Re: [OMPI users] Open MPI unable to find threading support for PGI or Sun Studio

2008-08-01 Thread Mark Dixon
On Tue, 29 Jul 2008, Jeff Squyres wrote: On Jul 29, 2008, at 6:52 AM, Mark Dixon wrote: FWIW: I compile with PGI 7.1.4 regularly on RHEL4U4 and don't see this problem. It would be interesting to see the config.log's from these builds to see the actual details of what went wrong

Re: [OMPI users] Open MPI unable to find threading support for PGI or Sun Studio

2008-07-29 Thread Mark Dixon

Re: [OMPI users] Open MPI unable to find threading support for PGI or Sun Studio

2008-07-29 Thread Mark Dixon
nstudio12/bin/f90 -f77 -ftrap=%none conftestf.f conftest.o -o conftest -lnsl -lutil -lm -lpthread conftestf.f: MAIN fpthread: conftest.o: In function `pthreadtest_': conftest.c:(.text+0x41): undefined reference to `__builtin_expect' Any ideas? Cheers, Mark

[OMPI users] Open MPI unable to find threading support for PGI or Sun Studio

2008-07-28 Thread Mark Dixon
answer to this in the FAQ or list archives. I've attached files showing the output of configure and my environment to this message. Is this expected? Thanks, Mark