Hi

We are running SGE 6.2u5. I am trying to restart jobs via checkpointing. 
On one of our clusters this works fine: a job is suspended via the 
suspend command, stopped, rescheduled in the queue, and restarted when 
resources are available.

With apparently the same SGE setup on a second cluster, my jobs 
are rescheduled but do not get started. qstat -sj shows
"cannot run on host XXX until clean up of an previous run has finished"

If the job is deleted from the queue and restarted manually, it works perfectly.

Is there a way to get a more elaborate error message and to find out 
what exactly goes wrong with the cleanup?

Juryk




_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
