[OMPI devel] [PATCH] Incorrect algorithm choice using coll_tuned_dynamic_rules_filename (over 2GiB message)

2012-03-01 Thread Y.MATSUMOTO
Dear All, The next feedback is about "coll_tuned_dynamic_rules_filename". An incorrect algorithm is selected under the following conditions: 1: "--mca coll_tuned_use_dynamic_rules 1" is set. 2: "--mca coll_tuned_dynamic_rules_filename" is set. 3: Collective communication which is written in 2, called >= 2GiB com

[OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Nathan Hjelm
Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below). If this fix addresses some hangs we are seeing on infiniband LANL might want a 1.4.6 rolled (or a faster rollout for 1.6.0). -Nathan -- Forwarded message -- List-Post: devel@lists.open-mpi.org Date: Thu

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Jeffrey Squyres
...or in 1.5.5. How soon will you be able to tell if it fixes some hangs? On Mar 1, 2012, at 10:56 AM, Nathan Hjelm wrote: > Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below). > If this fix addresses some hangs we are seeing on infiniband LANL might want > a 1.4.6 r

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Gutierrez, Samuel K
Hopefully by the end of the day - Nathan is testing now. Sam On Mar 1, 2012, at 11:36 AM, Jeffrey Squyres wrote: > ...or in 1.5.5. > > How soon will you be able to tell if it fixes some hangs? > > > On Mar 1, 2012, at 10:56 AM, Nathan Hjelm wrote: > >> Found a pretty nasty frag leak (and a

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Nathan Hjelm
On Thu, 1 Mar 2012, Jeffrey Squyres wrote: > ...or in 1.5.5. Well, we want a "stable" release to deploy on the affected cluster. > How soon will you be able to tell if it fixes some hangs? I will know in a couple of hours. Tested the fix in 1.4.5 and it appears to eliminate my IMB hang! I stil

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread George Bosilca
Good catch!!! That's indeed a quite nasty bug. If it fixes the IB issues it justifies a 1.4.6 release. Thanks, george. On Mar 1, 2012, at 10:56 , Nathan Hjelm wrote: > Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below). > If this fix addresses some hangs we are se

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Nathan Hjelm
I can confirm that neither leak is causing my imb hang. Unless there is another frag leak somewhere (haven't found one) the lockup was simply due to running out of registered memory. So, I see no need to push for a 1.4.6 unless a btl other than ugni hits the bug. Setting an rcache limit doesn'

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Christopher Samuel
On 02/03/12 02:56, Nathan Hjelm wrote: > Found a pretty nasty frag leak (and a minor one) in ob1 (see > commit below). If this fix addresses some hangs we are seeing on > infiniband LANL might want a 1.4.6 rolled (or a faster rollout for > 1.6.0). Wh