Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-04-01 Thread Rahul Nabar
On Wed, Apr 1, 2009 at 1:13 AM, Ralph Castain wrote: > So I gather that by "direct" you mean that you don't get an allocation from > Maui before running the job, but for the other you do? Otherwise, OMPI > should detect that it is running under Torque and automatically use the > Torque launche

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-04-01 Thread Ralph Castain
The difference you are seeing here indicates that the "direct" run is using the rsh launcher, while the other run is using the Torque launcher. So I gather that by "direct" you mean that you don't get an allocation from Maui before running the job, but for the other you do? Otherwise, OMP

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Rahul Nabar
2009/3/31 Ralph Castain : > I have no idea why your processes are crashing when run via Torque - are you > sure that the processes themselves crash? Are they segfaulting - if so, can > you use gdb to find out where? I have to admit I'm a newbie with gdb. I am trying to recompile my code as "ifort

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Rahul Nabar
2009/3/31 Ralph Castain : > It is very hard to debug the problem with so little information. We > regularly run OMPI jobs on Torque without issue. Another small thing that I noticed. Not sure if it is relevant. When the job starts running there is an orte process. The args to this process are sli

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Rahul Nabar
2009/3/31 Ralph Castain : > > Information would be most helpful - the information we really need is > specified here: http://www.open-mpi.org/community/help/ Output of "ompi_info --all" is attached in a file. echo $LD_LIBRARY_PATH /usr/local/ompi-ifort/lib:/opt/intel/fce/10.1.018/lib:/opt/intel

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Rahul Nabar
2009/3/31 Ralph Castain : > It is very hard to debug the problem with so little information. We Thanks Ralph! I'm sorry my first post lacked enough specifics. I'll try my best to fill you guys in on as much debug info as I can. > regularly run OMPI jobs on Torque without issue. So do we. In fac

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Ralph Castain
It is very hard to debug the problem with so little information. We regularly run OMPI jobs on Torque without issue. Are you getting an allocation from somewhere for the nodes? If so, are you using Moab to get it? Do you have a $PBS_NODEFILE in your environment? I have no idea why your pr

[OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Rahul Nabar
I've a strange OpenMPI/Torque problem while trying to run a job on our Opteron-SC-1435 based cluster: Each node has 8 CPUs. If I go to a node and run like so then the job works: mpirun -np 6 ${EXE_PATH}/${DACAPOEXE_PAR} ${ARGS} Same job if I submit through PBS/Torque then it starts running but