We have pipelines that are driven by a qsub at the end of a batch script.
Error tracking is harder that way, but sometimes it's easier than
engineering a raft of job dependencies. As you note, concurrency can be an
issue, but there are a number of ways to deal with it:

* Lock file in a POSIX-compliant filesystem
* Semaphore in a network-accessible database

You can prevent dead jobs from stalling other jobs by tying the
lock/semaphore back to a job ID and checking that the job is still running
before honoring the lock.
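
To make the lock-file idea concrete, here is a minimal sketch, not a
description of our actual pipelines. It assumes that the lock file sits on a
shared filesystem where exclusive creation is atomic, and that `qstat -j <id>`
exits non-zero once a job has left the system. The function names
(`job_is_running`, `try_acquire`, `release`) are hypothetical.

```python
import os
import subprocess


def job_is_running(job_id):
    """Assumption: `qstat -j <id>` exits non-zero once the job has left the system."""
    result = subprocess.run(
        ["qstat", "-j", str(job_id)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def try_acquire(lock_path, job_id, is_running=job_is_running):
    """Return True if this job now holds the lock, False if a live job holds it."""
    for _ in range(2):  # allow one retry after a stale or just-released lock
        try:
            # O_CREAT | O_EXCL makes creation fail if the file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                with open(lock_path) as f:
                    owner = f.read().strip()
            except FileNotFoundError:
                continue  # the holder released the lock just now; try again
            if not owner or is_running(owner):
                # An empty file may mean the holder has not written its ID yet,
                # so treat it as held rather than stale.
                return False
            try:
                os.unlink(lock_path)  # the holder is gone: clear the stale lock
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(str(job_id))  # record the owner so others can check on it
        return True
    return False


def release(lock_path, job_id):
    """Remove the lock only if this job is the recorded owner."""
    try:
        with open(lock_path) as f:
            if f.read().strip() != str(job_id):
                return False
        os.unlink(lock_path)
        return True
    except FileNotFoundError:
        return False
```

Inside a job, the ID would come from the `JOB_ID` environment variable that
Grid Engine sets, e.g. `try_acquire("/shared/pipeline.lock", os.environ["JOB_ID"])`
(the path is made up). Note that two jobs clearing the same stale lock at the
same moment can still race, so treat this as a starting point rather than a
hardened implementation.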

On Thu, Feb 25, 2016 at 11:16:49PM +0200, Ben Daniel Pere wrote:
> Where I work, we have jobs that submit jobs that submit jobs. This could
> potentially cause a deadlock, but we somehow (probably by luck) manage to
> live with it. I'm wondering whether that's a reasonable practice and, if
> not, whether you can suggest a better way to do what we do.
> 
> Example:
> 
> we have these 3 tasks:
> 
> - "analyze.day" job analyzes a day of data and returns some output
> - "analyze.month" job submits "analyze.day" jobs for a whole month and
> outputs a summary
> - "analyze.year" job submits "analyze.month" jobs for a whole year and
> outputs a summary
> 
> Usually people run analyze.day every day on the previous day's data, but
> sometimes they test a new algorithm on a whole year, so they dispatch
> analyze.year, which dispatches analyze.month, which dispatches analyze.day.
> We created a "dispatching" queue, which is the only queue we allow jobs to
> be submitted from. Since both analyze.year and analyze.month need to run
> there (both dispatch tasks), we could end up with a deadlock: in theory,
> many analyze.year jobs running together could take all the dispatching
> queue slots, leaving no room for the analyze.month jobs they will wait
> for forever. Besides dispatching, these jobs also do some logic, so this
> "dispatching" queue is a strange animal.
> 
> What's the "correct" practice here?

> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users


-- 
-- Skylar Thompson ([email protected])
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine