[OMPI devel] Replacing poll()

2012-03-02 Thread Alex Margolin
Hi, I'm trying to replace the poll() function with mine (say poll2() in poll2.c), and I got some building errors. This is after I copied poll2.c into opal/events/ and added it in the sources list in Makefile.am in that folder. ... Making all in tools/wrappers make[2]: Entering directory `/hom

Re: [OMPI devel] Replacing poll()

2012-03-02 Thread Alex Margolin
I added my poll2.c to config/ompi_setup_libevent.m4 next to poll.c and was able to build successfully, but even if poll2() just (prints and) calls poll() - I get the following error (btl is specified to avoid loading my module at this time): alex@singularity:~/huji/benchmarks/simple$ ~/huji/om
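For reference, the pass-through wrapper described above (one that just prints and calls poll()) can be sketched as follows. This is a minimal illustration, not the poll2.c from the message: the log format is an assumption, and only the POSIX poll() signature is taken as given.

```c
#include <poll.h>
#include <stdio.h>

/*
 * Hypothetical pass-through wrapper: logs the call on stderr, then defers
 * to the system poll(). The name poll2 comes from the message; the body
 * and the log format are illustrative assumptions.
 */
int poll2(struct pollfd *fds, nfds_t nfds, int timeout)
{
    fprintf(stderr, "poll2: nfds=%lu timeout=%d\n",
            (unsigned long)nfds, timeout);
    return poll(fds, nfds, timeout);
}
```

Since the wrapper keeps poll()'s signature and return value, any behavioural difference seen after swapping it in would have to come from how it is built or linked rather than from the call itself.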

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread Matthias Jurenz
SORRY, it was obviously a big mistake by me. :-( Open MPI 1.5.5 was built with LSF support, so when starting an LSF job it's necessary to request at least the number of tasks/cores as used for the subsequent mpirun command. That was not the case - I forgot the bsub's '-n' option to specify the

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread Matthias Jurenz
To exclude a possible bug within the LSF component, I rebuilt Open MPI without support for LSF (--without-lsf). -> It makes no difference - the latency is still bad: ~1.1us. Matthias On Friday 02 March 2012 13:50:13 Matthias Jurenz wrote: > SORRY, it was obviously a big mistake by me. :-( > >

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread Jeffrey Squyres
Ok. Good that there's no oversubscription bug, at least. :-) Did you see my off-list mail to you yesterday about building with an external copy of hwloc 1.4 to see if that helps? On Mar 2, 2012, at 8:26 AM, Matthias Jurenz wrote: > To exclude a possible bug within the LSF component, I rebuil

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread Jeffrey Squyres
Hah! I just saw your ticket about how --with-hwloc=/path/to/install is broken in 1.5.5. So -- let me go look into that... On Mar 2, 2012, at 8:58 AM, Jeffrey Squyres wrote: > Ok. Good that there's no oversubscription bug, at least. :-) > > Did you see my off-list mail to you yesterday abo

Re: [OMPI devel] Replacing poll()

2012-03-02 Thread Jeffrey Squyres
Note that the OMPI 1.4.x series is about to be retired. If you're doing new stuff, I'd advise you to be working with the Open MPI SVN trunk. In the trunk, we've changed how we build libevent, so if you're adding to it, you probably want to be working there for max forward-compatibility. That

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread Matthias Jurenz
On Friday 02 March 2012 14:58:45 Jeffrey Squyres wrote: > Ok. Good that there's no oversubscription bug, at least. :-) > > Did you see my off-list mail to you yesterday about building with an > external copy of hwloc 1.4 to see if that helps? Yes, I did - I answered as well. Our mail server seem

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread Matthias Jurenz
Thanks to the OTPO tool, I figured out that setting the MCA parameter btl_sm_fifo_lazy_free to 1 (default is 120) improves the latency significantly: 0.88µs. But somehow I get the feeling that this doesn't eliminate the actual problem... Matthias On Friday 02 March 2012 15:37:03 Matthias Ju

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread George Bosilca
Please run "ompi_info --param btl sm" in your environment. The lazy_free parameter directs the internals of the SM BTL not to release the memory fragments used to communicate until the lazy limit is reached. The default value was deemed reasonable a while back when the number of default fragments was l
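The lazy-release idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the SM BTL's actual code: the type lazy_fifo_t, its fields, and lazy_fifo_consume() are hypothetical names invented for the example. Consumed slots are counted and handed back in one batch once the count reaches the limit.

```c
#include <stddef.h>

/* Hypothetical model of a lazily freed FIFO (not the real SM BTL structures). */
typedef struct {
    size_t lazy_limit; /* plays the role of btl_sm_fifo_lazy_free */
    size_t pending;    /* slots consumed but not yet released */
    size_t released;   /* total slots handed back so far */
} lazy_fifo_t;

/*
 * Mark one slot as consumed. Nothing is released until `pending` reaches
 * `lazy_limit`; at that point all pending slots are released at once.
 * Returns the number of slots released by this call.
 */
size_t lazy_fifo_consume(lazy_fifo_t *f)
{
    f->pending++;
    if (f->pending < f->lazy_limit) {
        return 0;               /* still below the limit: keep holding */
    }
    size_t n = f->pending;      /* limit reached: release the whole batch */
    f->released += n;
    f->pending = 0;
    return n;
}
```

In this model a limit of 1 releases every slot immediately, while a larger limit lets consumed slots accumulate before they are returned in one batch.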

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-02 Thread Hjelm, Nathan T
The symptom is that the process hangs forever. It's difficult to differentiate between this bug and simply running out of registered memory. The bug is hit if the pml is using the mpi_leave_pinned protocol and the btl returns an error from its send function. -Nathan

Re: [OMPI devel] Replacing poll()

2012-03-02 Thread Alex Margolin
On 03/02/2012 04:33 PM, Jeffrey Squyres wrote: Note that the OMPI 1.4.x series is about to be retired. If you're doing new stuff, I'd advise you to be working with the Open MPI SVN trunk. In the trunk, we've changed how we build libevent, so if you're adding to it, you probably want to be w

Re: [OMPI devel] Replacing poll()

2012-03-02 Thread Jeffrey Squyres
Give your btl a progress function. It'll get called quite frequently. Look at the "progress" section in btl.h. Progress threads don't work yet, but the btl_progress function will get called by the PML quite frequently. It's how BTLs like openib progress their outstanding message passing. On
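A progress function that is called frequently must never block, so one way to drive descriptor polling from it is a zero-timeout poll(). The sketch below is an assumption-laden illustration: my_btl_progress and my_fds are hypothetical names, and the no-argument signature returning a count of progressed events is assumed here and should be checked against the "progress" section in btl.h.

```c
#include <poll.h>

/* Hypothetical descriptor set owned by the module; fd -1 means "unused",
 * which poll() ignores. */
static struct pollfd my_fds[1] = { { .fd = -1, .events = POLLIN, .revents = 0 } };

/*
 * Hypothetical progress callback: checks the descriptors once without
 * blocking and reports how many are ready. Errors are reported as zero
 * progress in this sketch.
 */
int my_btl_progress(void)
{
    int ready = poll(my_fds, 1, 0); /* timeout 0: return immediately */
    return ready > 0 ? ready : 0;
}
```

A real module would handle the ready descriptors before returning; the point here is only that the frequent caller is never stalled.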

Re: [OMPI devel] [PATCH]Incorrect algorithm choice using coll_tuned_dynamic_rules_filename (over 2GiB message)

2012-03-02 Thread George Bosilca
Yuki, I applied your patch and added your copyright in the corresponding files (r26080). I will make a CMR for the 1.4 and 1.5. However, as you might have noticed we're trying to close the 1.4 and move forward. Thanks, george. On Mar 1, 2012, at 02:33 , Y.MATSUMOTO wrote: > Dear All, >

Re: [OMPI devel] [PATCH]Incorrect algorithm choice using coll_tuned_dynamic_rules_filename (over 2GiB message)

2012-03-02 Thread George Bosilca
Yuki, r26084 should fix the issue with the dynamic rules file in the trunk. Thanks for reporting it. george. On Mar 1, 2012, at 02:33 , Y.MATSUMOTO wrote: > But, we found problem when over 2GiB message is written in rulefile as > "message size". > (over 2GiB message cannot read correctly.