I have been struggling to get a usable build of openmpi on Mac OS X
Mavericks (10.9.1). I can get openmpi to configure and build without
error, but have problems after that which depend on the openmpi version.
With 1.6.5, make check fails the opal_datatype_test, ddt_test, and ddt_raw
tests.
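For reference, the build steps are just the usual ones (exact configure flags
elided here):
./configure --prefix=...
make
make check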
I neglected in my earlier post to attach the small C code that the hdf5
folks supplied; it is attached here.
On Wed, Jan 15, 2014 at 10:04 AM, Ronald Cohen wrote:
> I have been struggling trying to get a usable build of openmpi on Mac OSX
> Mavericks (10.9.1). I can get openmpi to con
Can you send me your sample test code?
>
>
> On Wed, Jan 15, 2014 at 10:34 AM, Ralph Castain wrote:
>
>> I regularly build on Mavericks and run without problem, though I haven't
>> tried a parallel IO app. I'll give yours a try later, when I get back to my
>> Mac.
>
a fairly regular basis.
>>
>> As for the opal_bitmap test: it wouldn't surprise me if that one was
>> stale. I can check on it later tonight, but I'd suspect that the test is
>> bad as we use that class in the code base and haven't seen an issue.
>>
Can someone explain that?
On Wed, Jan 15, 2014 at 1:16 PM, Ronald Cohen wrote:
> Aha. I guess I didn't know what the io-romio option does. If you look
> at my config.log you will see my configure line included
> --disable-io-romio. Guess I should change --disable to --enable.
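> i.e. something like:
> ./configure --prefix=... --enable-io-romio
> (or, if I understand right, simply dropping --disable-io-romio should do it,
> since ROMIO is built by default).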
> I can't think of any reason why you would get that behavior other than a race condition. Afraid that
> code path is foreign to me, but perhaps one of the folks in the MPI-IO area
> can respond
>
>
> On Jan 15, 2014, at 4:26 PM, Ronald Cohen wrote:
>
> Update: I reconfigured with enable_io_romio=yes,
> I have no idea, but hopefully someone else here with experience
> with HDF5 can chime in?
>
>
> On Jan 17, 2014, at 9:03 AM, Ronald Cohen wrote:
>
> Still a timely response, thank you. The particular problem I noted
> hasn't recurred; for reasons I will explain shortly I had to rebuild
> If that guarantee is not provided, then perhaps a barrier between the
> close+delete and the next file_open should be sufficient to avoid the
> race...?
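> Roughly this pattern -- just a sketch, not your actual test (the filename
> and access mode are placeholders):
>
> #include <mpi.h>
>
> int main(int argc, char **argv)
> {
>     char fname[] = "testfile.dat";          /* placeholder name */
>     MPI_File fh;
>     int rank;
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>     /* collectively create and close the file */
>     MPI_File_open(MPI_COMM_WORLD, fname,
>                   MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
>     MPI_File_close(&fh);
>     if (rank == 0)
>         MPI_File_delete(fname, MPI_INFO_NULL);
>
>     /* barrier: no rank re-opens the name until the delete has happened */
>     MPI_Barrier(MPI_COMM_WORLD);
>
>     MPI_File_open(MPI_COMM_WORLD, fname,
>                   MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
>     MPI_File_close(&fh);
>
>     MPI_Finalize();
>     return 0;
> }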
>
>
> On Jan 15, 2014, at 7:26 PM, Ronald Cohen wrote:
>
> > Update: I reconfigured with enable_io_romio=yes, and this time -- mostly
ve
> 4b. totally disable the dlopen code in OMPI, meaning that OMPI won't even
> try to open DSOs
>
> I doubt any of this really matters to the issues you're seeing; I just
> wanted to explain these options because I saw several of them mentioned in
> your mails.
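> For reference, IIRC the spelling for 4b is:
> ./configure --disable-dlopen
> i.e. all the components get built into the libraries instead of being
> opened as DSOs at runtime.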
>
>
I figured that.
On Fri, Jan 17, 2014 at 10:26 AM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:
> On Jan 17, 2014, at 1:17 PM, Jeff Squyres (jsquyres)
> wrote:
>
> > 3. --enable-shared is *not* implied by --enable-static. So if you
> > --enable-static without --disable-shared, you're building both shared and static libraries.
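> > In other words, a static-only build is something like:
> > ./configure --enable-static --disable-shared ...
> > (the other flags being whatever you already use).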
4 rc1.
---
-Barbara
On Fri, Jan 17, 2014 at 9:39 AM, Ronald Cohen wrote:
> Thanks, I've just gotten an email with some suggestions (and promise of
> more help) from the HDF5 support team. I will report back here, as it may
> be of interest to others trying to build hdf5 on mavericks.
ROMIO disabled
> - test (sometimes) failing when you had ROMIO disabled
> - compiling / linking issues
>
> ?
>
>
> On Jan 17, 2014, at 1:50 PM, Ronald Cohen wrote:
>
> > Hello Ralph and others, I just got the following back from the HDF-5
> > support group, suggesting
> It is possible this is a
> ROMIO bug that we have picked up. I've asked someone to check upstream
> about it.
>
>
> On Jan 17, 2014, at 12:02 PM, Ronald Cohen wrote:
>
> Sorry, too many entries in this thread, I guess. My general goal is to
> get a working parall
iBand
Ron
--
Professor Dr. Ronald Cohen
Ludwig Maximilians Universität
Theresienstrasse 41 Room 207
Department für Geo- und Umweltwissenschaften
München
80333
Deutschland
ronald.co...@min.uni-muenchen.de
skype: ronaldcohen
+49 (0) 89 74567980
---
Ronald Cohen
Geophysical Laboratory
Carnegie Institution
underlying Open MPI's C functionality is not correct anymore -- gfortran 6.0.0
now includes array subsections. Not sure about direct passthru.
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Wed, Mar 23, 2016 at 7:54 AM, Ronald Cohen wrote:
> I get 100 GFLOP
about gcc 6.0.0
> now that this is supported by a free compiler
> (cray and intel already support that, but they are commercial compilers),
> I will resume my work on supporting this
>
> Cheers,
>
> Gilles
>
> On Wednesday, March 23, 2016, Ronald Cohen wrote:
>>
>>
es?
>
> Josh
>
> On Wed, Mar 23, 2016 at 8:47 AM, Ronald Cohen wrote:
>>
>> Thank you! Here are the answers:
>>
>> I did not try a previous release of gcc.
>> I built from a tarball.
>> What should I do about the iirc issue--how should I check?
> and run this test in the same environment as your app (e.g. via a batch
> manager if applicable)
>
> if you do not get the performance you expect, then I suggest you try the
> stock gcc compiler shipped with your distro and see if it helps.
>
> Cheers,
>
> Gilles
>
>
eak performance, then
> you can run on two nodes.
>
> Cheers,
>
> Gilles
>
> On Wednesday, March 23, 2016, Ronald Cohen wrote:
>>
>> The configure line was simply:
>>
>> ./configure --prefix=/home/rcohen
>>
>> when I run:
>>
>>
>> hpl is known to scale, assuming the data is big enough, you use an
>> optimized blas, and the right number of openmp threads
>> (e.g. if you run 8 tasks per node, then you can have up to 2 openmp
>> threads, but if you use 8 or 16 threads, then performance will be worse)
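>> for example, on a 16 core node that would be something like
>> mpirun -x OMP_NUM_THREADS=2 --map-by ppr:8:node -n 16 ./xhpl
>> (xhpl, the node size and the task count are just an example: 8 tasks per
>> node, 2 blas/openmp threads each)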
>
So I want to thank you so much! My benchmark for my actual application
went from 5052 seconds to 266 seconds with this simple fix!
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Wed, Mar 23, 2016 at 11:00 AM, Ronald Cohen wrote:
> Dear Gilles,
>
>
...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
---
Ronald Cohen
Geophysical Laboratory
Carnegie Institution
5251 Broad Branch Rd., N.W.
Washington, D.C. 20015
will be ranked by node instead of consecutively within a node.
>
>
>
>> On Mar 25, 2016, at 9:30 AM, Ronald Cohen wrote:
>>
>> I am using
>>
>> mpirun --map-by ppr:4:node -n 16
>>
>> and this loads the processes in round robin fashion. This seems to be
nodes=4:ppn=16,pmem=1gb
mpirun --map-by ppr:4:node -n 16
it is 368 seconds.
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Fri, Mar 25, 2016 at 12:43 PM, Ralph Castain wrote:
>
>> On Mar 25, 2016, at 9:40 AM, Ronald Cohen wrote:
>>
>> Tha
:
>
>> On Mar 25, 2016, at 9:59 AM, Ronald Cohen wrote:
>>
>> It is very strange but my program runs slower with any of these
>> choices than if I simply use:
>>
>> mpirun -n 16
>> with
>> #PBS -l
>> nodes=n013.cluster.com:ppn=4+n014.cluste
itter: @recohen3
On Fri, Mar 25, 2016 at 1:17 PM, Ronald Cohen wrote:
> Actually there was the same number of procs per node in each case. I
> verified this by logging into the nodes while they were running--in
> both cases 4 per node.
>
> Ron
>
> ---
> Ron Cohen
>
--report-bindings didn't report anything
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Fri, Mar 25, 2016 at 1:24 PM, Ronald Cohen wrote:
> --display-allocation an
> didn't seem to give useful information:
>
> ==
twitter: @recohen3
On Fri, Mar 25, 2016 at 1:27 PM, Ronald Cohen wrote:
> --report-bindings didn't report anything
> ---
> Ron Cohen
> recoh...@gmail.com
> skypename: ronaldcohen
> twitter: @recohen3
>
>
> On Fri, Mar 25, 2016 at 1:24 PM, Ronald Cohen wrote:
>>
ocket 1[core
> 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket
> 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]:
> [./././././././.][B/B/B/B/B/B/B/B]
> [n002
>
> etc?
>
> ---
> Ron Cohen
> recoh...@gmail.com
> skypename: ronaldcohen
> twitter: @recohen3
it thinks we have 16 slots/node, so if you just use “mpirun
> -np 16”, you should wind up with all the procs on one node
>
>
> On Mar 25, 2016, at 10:24 AM, Ronald Cohen wrote:
>
> --display-allocation an
> didn't seem to give useful information:
>
> =
1.10.2
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Fri, Mar 25, 2016 at 1:30 PM, Ralph Castain wrote:
> Hmmm…what version of OMPI are you using?
>
>
> On Mar 25, 2016, at 10:27 AM, Ronald Cohen wrote:
>
> --report-bindings didn't report anything
k the procs and
> assign them to the correct number of cores. See if that helps
>
>> On Mar 25, 2016, at 10:38 AM, Ronald Cohen wrote:
>>
>> 1.10.2
>>
>> Ron
>>
>> ---
>> Ron Cohen
>> recoh...@gmail.com
>> skypename: ronaldcohen
>
or is it mpirun -map-by core:pe=8 -n 16 ?
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Fri, Mar 25, 2016 at 2:10 PM, Ronald Cohen wrote:
> Thank you--I looked at the man page and it is not clear to me what
> pe=2 does. Is that the number of threads? S
line 579
H2O-64_REC.log (END)
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Fri, Mar 25, 2016 at 2:11 PM, Ronald Cohen wrote:
> or is it mpirun -map-by core:pe=8 -n 16 ?
>
> ---
> Ron Cohen
> recoh...@gmail.com
> skypename: ronaldcohen
> twitt
at you asked for a non-integer
> multiple of cores - i.e., if you have 32 cores on a node, and you ask for
> pe=6, we will wind up leaving two cores idle.
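> (Concretely: with pe=6 only floor(32/6) = 5 procs fit on the node, using 30
> of the 32 cores.)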
>
> HTH
> Ralph
>
> On Mar 25, 2016, at 11:11 AM, Ronald Cohen wrote:
>
> or is it mpirun -map-by core:pe=8 -n 16 ?
>
[core 11[hwt 0]]: [./././././././.][././B/B/./././.]
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Fri, Mar 25, 2016 at 2:32 PM, Ronald Cohen wrote:
> So it seems my
> -map-by core:pe=2 -n 32
> should have worked. I would have 32 procs with 2 on each,
]]: [./././././././.][././././././B/B]
[n003.cluster.com:29842] MCW rank 16 bound to socket 0[core 0[hwt 0]],
socket 0[core 1[hwt 0]]: [B/B/./././././.][./././././././.]
[n002.cluster.com:32210] MCW ra
...
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Fri, Mar 25, 2016 at 3:13 PM, Ronald Cohen wrote:
> you can use the “rank-by” option to maintain the location and
> binding, but change the assigned MCW ranks to align with your communication
> pattern.
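> For example, something like
> mpirun --map-by ppr:4:node --rank-by slot -n 16 ./a.out
> (a.out is just a placeholder) keeps the same placement, 4 procs per node,
> but numbers the ranks consecutively within each node instead of round-robin
> across the nodes.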
>
> HTH
> Ralph
>
>
>
> On Mar 25, 2016, at 12:28 PM, Ronald Cohen wrote:
>
> So I have been experimenting with diff