You are correct - the Sun errors are in a version prior to the
insertion of the SM changes. We didn't relabel the version to 1.3.2
until -after- those changes went in, so you have to look for anything
with an r number >= 20839.
The sif errors are all in that group - I would suggest starting
Ralph Castain wrote:
It looks like the SM revisions we inserted into 1.3.2 are a great
detector for shared memory init failures - it segfaulted 143 times
last night on IU's sif computer, 34 times on Sun/Linux, and 3 times
on Sun/SunOS...almost every single time due to "Address not mapped"
Ralph Castain wrote:
Hi folks
Er, perhaps pronounced "Eugene". :^(
It looks like the SM revisions we inserted into 1.3.2 are a great
detector for shared memory init failures
How delicately put! I appreciate the gentleness.
- it segfaulted 143 times last night on IU's sif computer, 34
It was just a few kinks actually. I think the bitmap type moved from
orte to opal, then I think the opal_hash_table functions changed slightly
and also I think the modex stuff was called something like pml_modex where
it's now ompi_modex. There were a few extra functions in the module
descri
What is the error that you are getting from compilation failure?
Lenny.
On 3/23/09, Timothy Hayes wrote:
>
> That's a relief to know, although I'm still a bit concerned. I'm looking at
> the code for the OpenMPI 1.3 trunk and in the ob1 component I can see the
> following sequence:
>
> mca_pml_o
Hi folks
It looks like the SM revisions we inserted into 1.3.2 are a great
detector for shared memory init failures - it segfaulted 143 times
last night on IU's sif computer, 34 times on Sun/Linux, and 3 times on
Sun/SunOS...almost every single time due to "Address not mapped"
errors in t