On Wed, Apr 1, 2009 at 1:13 AM, Ralph Castain wrote:
> So I gather that by "direct" you mean that you don't get an allocation from
> Maui before running the job, but for the other you do? Otherwise, OMPI
> should detect that it is running under Torque and automatically use the
The difference you are seeing here indicates that the "direct" run is
using the rsh launcher, while the other run is using the Torque
launcher.
So I gather that by "direct" you mean that you don't get an allocation
from Maui before running the job, but for the other you do? Otherwise,
2009/3/31 Ralph Castain :
> I have no idea why your processes are crashing when run via Torque - are you
> sure that the processes themselves crash? Are they segfaulting - if so, can
> you use gdb to find out where?
I have to admit I'm a newbie with gdb. I am trying to recompile
2009/3/31 Ralph Castain :
> It is very hard to debug the problem with so little information. We
> regularly run OMPI jobs on Torque without issue.
Another small thing that I noticed. Not sure if it is relevant.
When the job starts running there is an orte process. The args to this
2009/3/31 Ralph Castain :
>
> Information would be most helpful - the information we really need is
> specified here: http://www.open-mpi.org/community/help/
Output of "ompi_info --all" is attached in a file.
echo $LD_LIBRARY_PATH
2009/3/31 Ralph Castain :
> It is very hard to debug the problem with so little information. We
Thanks Ralph! I'm sorry my first post lacked enough specifics. I'll
try my best to fill you guys in on as much debug info as I can.
> regularly run OMPI jobs on Torque without issue.
It is very hard to debug the problem with so little information. We
regularly run OMPI jobs on Torque without issue.
Are you getting an allocation from somewhere for the nodes? If so, are
you using Moab to get it? Do you have a $PBS_NODEFILE in your
environment?
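
One way to answer the $PBS_NODEFILE question from inside the job script is a
small check like the sketch below. It is only an illustration: the function
name check_pbs_nodefile is made up for this example, and it assumes the usual
Torque convention that $PBS_NODEFILE points to a text file with one line per
allocated slot.

```shell
# Hypothetical helper (not part of Torque or Open MPI): report whether a
# PBS node file is visible in the current environment.
check_pbs_nodefile() {
    if [ -z "${PBS_NODEFILE:-}" ]; then
        echo "PBS_NODEFILE is not set"
        return 1
    fi
    if [ ! -r "$PBS_NODEFILE" ]; then
        echo "PBS_NODEFILE is set but not readable: $PBS_NODEFILE"
        return 1
    fi
    echo "PBS_NODEFILE=$PBS_NODEFILE"
    # Assumption: one non-empty line per allocated slot.
    echo "slots: $(grep -c . "$PBS_NODEFILE")"
    # Distinct host names among those slots.
    echo "hosts: $(sort -u "$PBS_NODEFILE" | grep -c .)"
}
```

Calling check_pbs_nodefile once in the Torque job script and once in the
"direct" shell should show whether the two runs see the same allocation.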
I have no idea why your