Is this worth an FAQ item? I.e., if specific versions of PBS Pro are broken, should we at least make that google-able on our FAQ?
On Jul 27, 2011, at 2:21 PM, Ralph Castain wrote:

> Great - thanks!
>
> On Jul 27, 2011, at 12:16 PM, Justin Wood wrote:
>
>> I heard back from my Altair contact this morning. He told me that they did
>> in fact make a change in some version of 10.x that broke this. They don't
>> have a workaround for v10, but he said it was fixed in v11.x.
>>
>> I built OpenMPI 1.5.3 this morning with PBSPro v11.0, and it works fine. I
>> don't get any segfaults.
>>
>> -Justin.
>>
>> On 07/26/2011 05:49 PM, Ralph Castain wrote:
>>> I don't believe we ever got anywhere with this due to lack of response. If
>>> you get some info on what happened to tm_init, please pass it along.
>>>
>>> Best guess: something changed in a recent PBS Pro release. Since none of us
>>> have access to it, we don't know what's going on. :-(
>>>
>>>
>>> On Jul 26, 2011, at 10:10 AM, Wood, Justin Contractor, SAIC wrote:
>>>
>>>> I'm having a problem using OpenMPI under PBS Pro 10.4. I tried both 1.4.3
>>>> and 1.5.3, both behave the same. I'm able to run just fine if I don't use
>>>> PBS and go direct to the nodes. Also, if I run under PBS and use only 1
>>>> node, it works fine, but as soon as I span nodes, I get the following:
>>>>
>>>> [a4ou-n501:07366] *** Process received signal ***
>>>> [a4ou-n501:07366] Signal: Segmentation fault (11)
>>>> [a4ou-n501:07366] Signal code: Address not mapped (1)
>>>> [a4ou-n501:07366] Failing at address: 0x3f
>>>> [a4ou-n501:07366] [ 0] /lib64/libpthread.so.0 [0x3f2b20eb10]
>>>> [a4ou-n501:07366] [ 1]
>>>> /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0(discui_+0x84) [0x2affa453765c]
>>>> [a4ou-n501:07366] [ 2]
>>>> /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0(diswsi+0xc3) [0x2affa4534c6f]
>>>> [a4ou-n501:07366] [ 3] /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0
>>>> [0x2affa453290c]
>>>> [a4ou-n501:07366] [ 4]
>>>> /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0(tm_init+0x1fe) [0x2affa4532bf8]
>>>> [a4ou-n501:07366] [ 5] /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0
>>>> [0x2affa452691c]
>>>> [a4ou-n501:07366] [ 6] mpirun [0x404c17]
>>>> [a4ou-n501:07366] [ 7] mpirun [0x403e28]
>>>> [a4ou-n501:07366] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4)
>>>> [0x3f2a61d994]
>>>> [a4ou-n501:07366] [ 9] mpirun [0x403d59]
>>>> [a4ou-n501:07366] *** End of error message ***
>>>> Segmentation fault
>>>>
>>>> I searched the archives and found a similar issue from last year:
>>>>
>>>> http://www.open-mpi.org/community/lists/users/2010/02/12084.php
>>>>
>>>> The last update I saw was that someone was going to contact Altair and
>>>> have them look at why it was failing to do the tm_init. Does anyone have
>>>> an update to this, and has anyone been able to run successfully using
>>>> recent versions of PBSPro? I've also contacted our rep at Altair, but he
>>>> hasn't responded yet.
>>>>
>>>> Thanks, Justin.
>>>>
>>>> Justin Wood
>>>> Systems Engineer
>>>> FNMOC | SAIC
>>>> 7 Grace Hopper, Stop 1
>>>> Monterey, CA
>>>> justin.g.wood....@navy.mil
>>>> justin.g.w...@saic.com
>>>> office: 831.656.4671
>>>> mobile: 831.869.1576
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/