Dear All,
My next feedback concerns "coll_tuned_dynamic_rules_filename".
An incorrect algorithm is selected under the following conditions:
1:"--mca coll_tuned_use_dynamic_rules 1" is set.
2:"--mca coll_tuned_dynamic_rules_filename" is set.
3: Collective communication which is written in 2, called >= 2GiB com
Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below). If
this fix addresses some hangs we are seeing on InfiniBand, LANL might want a
1.4.6 rolled (or a faster rollout for 1.6.0).
-Nathan
-- Forwarded message --
List-Post: devel@lists.open-mpi.org
Date: Thu
...or in 1.5.5.
How soon will you be able to tell if it fixes some hangs?
On Mar 1, 2012, at 10:56 AM, Nathan Hjelm wrote:
> Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below).
> If this fix addresses some hangs we are seeing on infiniband LANL might want
> a 1.4.6 r
Hopefully by the end of the day - Nathan is testing now.
Sam
On Mar 1, 2012, at 11:36 AM, Jeffrey Squyres wrote:
> ...or in 1.5.5.
>
> How soon will you be able to tell if it fixes some hangs?
>
>
> On Mar 1, 2012, at 10:56 AM, Nathan Hjelm wrote:
>
>> Found a pretty nasty frag leak (and a
On Thu, 1 Mar 2012, Jeffrey Squyres wrote:
> ...or in 1.5.5.
Well, we want a "stable" release to deploy on the affected cluster.
> How soon will you be able to tell if it fixes some hangs?
I will know in a couple of hours. Tested the fix in 1.4.5 and it appears to
eliminate my IMB hang! I stil
Good catch!!! That's indeed a quite nasty bug.
If it fixes the IB issues it justifies a 1.4.6 release.
Thanks,
george.
On Mar 1, 2012, at 10:56, Nathan Hjelm wrote:
> Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below).
> If this fix addresses some hangs we are se
I can confirm that neither leak is causing my IMB hang. Unless there is another
frag leak somewhere (I haven't found one), the lockup was simply due to running
out of registered memory. So, I see no need to push for a 1.4.6 unless a BTL
other than ugni hits the bug.
Setting an rcache limit doesn'
On 02/03/12 02:56, Nathan Hjelm wrote:
> Found a pretty nasty frag leak (and a minor one) in ob1 (see
> commit below). If this fix addresses some hangs we are seeing on
> infiniband LANL might want a 1.4.6 rolled (or a faster rollout for
> 1.6.0).
Wh