Dear Pak Lui,

I can delete the (SGE) job with qdel -f, so that it disappears from the
job list, but the application processes keep running, including the
shepherds. I have to kill them with kill -15.

For some reason the kill -15 does not reach mpirun. (We do pass an extra
parameter of this kind to mpirun on our Myrinet MX nodes with MPICH,
which is why I asked.)
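
A minimal sketch of one possible workaround, assuming the job script is a
POSIX shell script and that it is the script, not mpirun, that receives the
TERM signal. The helper name run_and_forward is hypothetical; it is not part
of SGE or Open MPI. It starts a command in the background, forwards TERM or
INT to it, and returns the command's exit status.

```shell
#!/bin/sh
# Hypothetical helper (not from SGE or Open MPI): run a command as a
# background child and pass TERM/INT received by this script on to it.
run_and_forward() {
    "$@" &
    child=$!
    # Forward the signal to the child instead of letting it stop here.
    trap 'kill -TERM "$child" 2>/dev/null' TERM INT
    status=0
    wait "$child" || status=$?
    # wait returns early when a trapped signal arrives, so keep waiting
    # until the child has really gone.
    while kill -0 "$child" 2>/dev/null; do
        status=0
        wait "$child" || status=$?
    done
    trap - TERM INT
    return "$status"
}

# Assumed usage inside the job script:
#   run_and_forward mpirun -np $NSLOTS ./exe
```

This only covers signals that can be trapped; a KILL sent to the job script
cannot be forwarded this way.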

Just to confirm: is there no configure directive specific to Grid Engine
when building Open MPI?

Thanks

henk

> -----Original Message-----
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Pak Lui
> Sent: 23 July 2007 15:16
> To: Open MPI Users
> Subject: Re: [OMPI users] sge qdel fails
> 
> Hi Henk,
> 
> The sge script should not require any extra parameter. The 
> qdel command should send the kill signal to mpirun and also 
> remove the SGE allocated tmp directory (in something like 
> /tmp/174.1.all.q/) which contains the OMPI session dir for 
> the running job, and in turn would cause orted and the user 
> processes to exit.
> 
> Maybe you could try qdel -f <jid> to force delete from the 
> sge_qmaster, in case sge_execd does not respond to the 
> delete request by the sge_qmaster?
> 
> SLIM H.A. wrote:
> > I am using OpenMPI 1.2.3 with SGE 6.0u7 over InfiniBand (OFED 1.2), 
> > following the recommendation in the OpenMPI FAQ
> > 
> > http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
> > 
> > The job runs but when the user wants to delete the job with the qdel 
> > command, this fails. Does the mpirun command
> > 
> > mpirun -np $NSLOTS ./exe
> > 
> > in the sge script require extra parameters?
> > 
> > Thanks for any advice
> > 
> > Henk
> > 
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> -- 
> 
> - Pak Lui
> pak....@sun.com
