Jeff,

OK ... I rebuilt without --with-tm= and as predicted my test case
runs (I left the IB flags in).  I then ran a job with just:

pbsdsh hostname

on 16 nodes and that also worked.  I know that 1.4.1 works, although
it was built pointing into the old PBS Pro version tree explicitly.  I
have checked and rechecked the environment variables and everything
else that could lead to some mixed-up version cross-referencing.
I am tempted to build 1.4.2 with the explicit --with-tm= version path
instead of using the symlink to default, but I cannot think of a logical
reason why that should do anything.
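
For reference, here is a rough sketch of how the explicit-path build could be set up. It only resolves the "default" symlink to the versioned directory and prints the configure line; it does not build anything. The function name tm_configure_cmd and the example paths are hypothetical, the IB flags are left out, and it assumes a readlink that supports -f (GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI or PBS Pro): print the
# configure command with the PBS "default" symlink resolved to the
# real, versioned install directory. Nothing is configured or built.
tm_configure_cmd() {
    tm_link=$1   # symlinked PBS tree, e.g. /opt/pbs/default (assumed path)
    prefix=$2    # install prefix for this Open MPI build (assumed path)

    # readlink -f follows the symlink chain to the versioned directory
    tm_real=$(readlink -f "$tm_link") || return 1

    # Without the TM header, configure cannot find the TM API
    if [ ! -f "$tm_real/include/tm.h" ]; then
        echo "no include/tm.h under $tm_real" >&2
        return 1
    fi

    echo "./configure --prefix=$prefix --with-tm=$tm_real"
}

# Example call (paths are assumptions):
#   tm_configure_cmd /opt/pbs/default /opt/openmpi-1.4.2
```

Comparing the printed path with the one used for the 1.4.1 build would at least show whether the two builds point at different PBS trees.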

I have also reported this to the PBS Pro support folks.

Thanks for the suggestions,

rbw


   Richard Walsh
   Parallel Applications and Systems Manager
   CUNY HPC Center, Staten Island, NY
   718-982-3319
   612-382-4620

   Mighty the Wizard
   Who found me at sunrise
   Sleeping, and woke me
   And learn'd me Magic!
________________________________________
From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of Jeff Squyres [jsquy...@cisco.com]
Sent: Thursday, June 10, 2010 6:34 PM
To: Open MPI Users
Subject: Re: [OMPI users] Address not mapped segmentation fault with 1.4.2 ...

On Jun 10, 2010, at 5:49 PM, Richard Walsh wrote:

> OK ... so if I follow your lead and build a version without PBS --tm=
> integration and it works, I should be able to report this as an
> incompatibility bug between the latest version of PBS Pro (10.2.0.93147)
> and the latest version of OpenMPI (1.4.2), right?  Do I report that to my
> friends at OpenMPI or my friends at PBS Pro (Altair), or both?

I'd say both.

But it would be quite surprising if tm_init() is wholly broken -- it's the very
first function that has to be invoked.

I'm not a PBS user, so I don't know/remember the PBS commands offhand, but I 
have a dim recollection of a few PBS-provided TM-using tools (pbsdsh or 
somesuch?).  You might want to try those, too, and see if they work/fail.

If it really is a problem, I'm guessing it'll be a compiler/linker issue 
somehow... (e.g., how we're compiling/linking is not matching the 
compilation/linker style of the TM library)  That's a SWAG.  :-)

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
