Hi folks
I have bad news. The 1.8.7 tarball is incorrect - I grabbed the wrong one, and
it is missing several commits. As if that isn’t enough, I’ve been informed that
we also missed moving some critical fixes over to 1.8 in the MPI_Finalize area,
and so users of PSM are getting segfaults. This
I admit to having lost track of the discussion split among the various PRs
and this email thread.
I have the following three systems to test on:
#1) ofi is the only mtl component which can build.
#2) Both the ofi and portals4 mtl components build.
#3) Both the psm and mxm mtl components build.
I hav
Howard,
Not sure if the "--mca mtl_base_verbose 10" output is still needed, but
I've attached it in case it is.
-Paul
On Fri, Jul 24, 2015 at 7:26 AM, Howard Pritchard
wrote:
> Paul
>
> Could you rerun with --mca mtl_base_verbose 10 added to cmd line and send
> output?
>
> Howard
>
> -
Hmmm...the most likely cause is that I generated the tag late - not immediately
upon release. I tried to get the sha correct, but probably missed it.
It’s possible that other changes came in afterwards, but I can take a look and
see. The oob connection patch sounds strange, and I thought we had
Why do the contents of the 1.8.7 release tarball and the v1.8.7 tag in the
ompi-release repo differ? Is there any chance this was a mistake and the
release tarball was generated from the wrong tree?
Of course I do not care about VERSION, but there are two files related
to RMA that are different. The release ta
Hi Folks,
Should we do something better than what is done currently in the
mca_pml_cm_component_init method around lines 158-162?
That's what's causing a bunch of problems right now in 1.10.
I'd like to see a better approach taken in the v2.x
Howard
Hi Jeff,
Nathan and I think this is generic to all the mtl's and masked by the stuff
in the cm select method for upping the priority
of the mtl. We'd see this behavior for all mtl's if this priority upping
code wasn't there and we fell back to ob1.
Howard
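To make the selection argument above concrete, here is a minimal sketch of priority-based selection in which every losing candidate is finalized. It is not the actual Open MPI selection code: `struct component`, `select_component`, and the `finalized` flag are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical component record; names and fields are illustrative only. */
struct component {
    const char *name;
    int priority;
    int finalized;   /* set to 1 when the component loses selection */
};

/* Pick the highest-priority component and mark all the others finalized.
 * Returns NULL if the list is empty. Ties go to the earliest entry. */
static struct component *select_component(struct component *list, size_t n)
{
    struct component *best = NULL;

    assert(list != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (best == NULL || list[i].priority > best->priority) {
            best = &list[i];
        }
    }
    /* Losers are torn down; this is the step where any cleanup they
     * skipped would start to matter. */
    for (size_t i = 0; i < n; i++) {
        if (&list[i] != best) {
            list[i].finalized = 1;
        }
    }
    return best;
}
```

With made-up priorities of 20 for "ob1" and 10 for "cm", the sketch selects ob1 and finalizes cm; raising cm to 30 flips the outcome, so cm's finalize path never runs and any problem there stays hidden.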
2015-07-24 9:12 GMT-06:00 Jeff Squyres
Using a debug build of 1.8.7, I'm still getting this malloc(0) warning:
malloc debug: Request for 0 bytes (coll_libnbc_ireduce_scatter_block.c, 67)
The simple code below should reproduce it:
$ cat ireduce_scatter_block.c
#include <mpi.h>
int main(int argc, char *argv[])
{
MPI_Request request;
MPI_I
I think Ralph answered this question: if you register a progress function but
then get your component unloaded without un-registering the progress
function... kaboom.
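A minimal sketch of that failure mode, assuming a toy callback table rather than the real opal progress engine. All names here (`progress_register`, `progress_unregister`, `progress_poll`, `component_init`, `component_close`) are hypothetical stand-ins. The point is only that whatever init registers, close has to remove before the component's code goes away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a progress engine: a fixed-size callback table. */
#define MAX_CALLBACKS 8

typedef int (*progress_cb_t)(void);

static progress_cb_t callbacks[MAX_CALLBACKS];
static size_t num_callbacks = 0;

/* Add a callback; returns 0 on success, -1 if the table is full. */
static int progress_register(progress_cb_t cb)
{
    assert(cb != NULL);
    if (num_callbacks == MAX_CALLBACKS) {
        return -1;
    }
    callbacks[num_callbacks++] = cb;
    return 0;
}

/* Remove a callback; returns 0 on success, -1 if it was not registered. */
static int progress_unregister(progress_cb_t cb)
{
    for (size_t i = 0; i < num_callbacks; i++) {
        if (callbacks[i] == cb) {
            /* Fill the hole with the last entry and shrink the table. */
            callbacks[i] = callbacks[--num_callbacks];
            callbacks[num_callbacks] = NULL;
            return 0;
        }
    }
    return -1;
}

/* Call every registered callback once; returns the sum of their results. */
static int progress_poll(void)
{
    int events = 0;
    for (size_t i = 0; i < num_callbacks; i++) {
        events += callbacks[i]();
    }
    return events;
}

/* Hypothetical component: it registers in init, so close must undo that.
 * If close skipped the unregister and the component were then unloaded,
 * progress_poll() would jump through a dangling function pointer. */
static int component_progress(void) { return 1; }
static int component_init(void)  { return progress_register(component_progress); }
static int component_close(void) { return progress_unregister(component_progress); }
```

In this sketch, calling `component_init()`, then `component_close()`, leaves the table empty, so a later `progress_poll()` touches nothing that belongs to the component.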
> On Jul 24, 2015, at 10:37 AM, Howard Pritchard wrote:
>
> Jeff
>
> I was wrong about this. all the mtls except for portals
Jeff
I was wrong about this. All the mtls except for portals4 register with
opal progress in their comp init.
I don't see how this is a problem though, as base select only invokes comp
init on the selected mtl.
Howard
--
sent from my smart phonr so no good type.
Howard
On Jul 24, 2015
Glancing at the code, I believe I see the problem. The OFI MTL component
registers an opal progress function during init, but the CM PML is not the one
ultimately selected. Thus, the CM PML has its finalize called and is unloaded.
During finalize, CM closes the MTL framework. This in turn calls
Paul
Could you rerun with --mca mtl_base_verbose 10 added to cmd line and send
output?
Howard
--
sent from my smart phonr so no good type.
Howard
On Jul 23, 2015 6:06 PM, "Paul Hargrove" wrote:
> Yohann,
>
> With PR409 as it stands right now (commit 6daef310) I see no change to the
>
Yohann --
Can you have a look?
> On Jul 24, 2015, at 10:15 AM, Howard Pritchard wrote:
>
> Looks like the ofi mtl is being naughty. It's the only mtl which registers with
> opal progress in its component init method.
>
> --
>
> sent from my smart phonr so no good type.
>
> Howard
>
> On J
Looks like the ofi mtl is being naughty. It's the only mtl which registers with
opal progress in its component init method.
--
sent from my smart phonr so no good type.
Howard
On Jul 23, 2015 7:03 PM, "Ralph Castain" wrote:
> It looks like one of the MTL components is registering a progress ca