On Thu, Feb 25, 2016 at 05:09:41PM -0500, [email protected] wrote:
> So, the lockfile contains the jobID, or something similar?
>
> Some jobs use a file as a flag (ie. checking the existence or content), but
> we've largely avoided POSIX file locking (the mixture of NFS, GPFS, & CIFS
> here should work....but it gets complicated quickly).

The folks who have used this technique most successfully track job status in
a Postgres database, but having the job ID in a file would be OK too. Note
that if you run lots of jobs, older Grid Engine versions will wrap job IDs
around at 10 million. Our newer UGE clusters don't have this limitation, but
I'm unsure when this changed.

> Our better 'chained' jobs use the SGE "hold" feature, some are launched as
> array jobs, and some (the really ugly ones) loop within a shell script,
> checking for files that indicate that a prerequisite job finished,
> checking for errors in the prereq, or loop over 'qstat' checking if a
> specific jobid has completed.

Yep, we recommend folks use -hold_jid whenever possible. Some of our labs
also use DRMAA but obviously that introduces even more complexity.

--
-- Skylar Thompson ([email protected])
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine
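
For anyone who wants to script the -hold_jid chaining recommended above, here
is a rough sketch in Python. It is not taken from any of the setups discussed
in this thread: the function names and the example script names are made up
for illustration, and it assumes a qsub that supports -terse (which prints
only the job ID) and is on the PATH.

```python
import subprocess


def build_qsub_argv(script, hold_jids=(), name=None):
    """Build a qsub command line for `script`.

    -hold_jid keeps the new job on hold until the listed jobs have finished.
    """
    argv = ["qsub", "-terse"]  # -terse: print only the job ID on submission
    if name:
        argv += ["-N", name]
    if hold_jids:
        argv += ["-hold_jid", ",".join(str(j) for j in hold_jids)]
    argv.append(script)
    return argv


def parse_terse_job_id(output):
    """Extract the numeric job ID from `qsub -terse` output.

    Array jobs are reported like "1234.1-10:1", so keep the part before the dot.
    """
    return int(output.strip().split(".")[0])


def submit(script, hold_jids=(), name=None):
    """Submit one job and return its ID (requires a working qsub)."""
    result = subprocess.run(
        build_qsub_argv(script, hold_jids, name),
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_terse_job_id(result.stdout)


def submit_chain(scripts):
    """Submit scripts in order, each one held on the job before it."""
    job_ids = []
    for script in scripts:
        previous = job_ids[-1:]  # empty for the first job, so no hold
        job_ids.append(submit(script, hold_jids=previous))
    return job_ids
```

With hypothetical scripts, submit_chain(["step1.sh", "step2.sh"]) would submit
step2.sh with -hold_jid set to step1.sh's job ID. As far as I know the hold is
released when the earlier job finishes rather than when it succeeds, so
checking the prerequisite for errors is still left to the later job.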
