Hi William.
It is a large job array, with each task ending up on either the bio or the
free64 queue. I think the issue is that a task that *starts* on the free64
queue picks up the h_rt limit, and when it is restarted and lands on the bio
queue, which has no time limit, it gets killed because the original h_rt sticks.
Not sure if this answers your question?
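
For anyone wanting to check this against accounting data, here is a minimal
sketch (not a tested diagnostic) that converts the queue limits to seconds and
compares them with a task's wall-clock time. The helper names hms_to_seconds
and exceeds_limit are made up for illustration, and the wall-clock value is
assumed to be copied by hand from the ru_wallclock field of "qacct -j" output.

```shell
#!/bin/sh
# Convert a Grid Engine time limit (HH:MM:SS) to seconds.
# INFINITY is passed through unchanged.
hms_to_seconds() {
    if [ "$1" = "INFINITY" ]; then
        echo "INFINITY"
    else
        # awk treats "05" as decimal 5, so leading zeros are safe here.
        echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
    fi
}

# Succeed (exit status 0) if a wall-clock time in seconds ($1)
# has reached the given limit ($2, in HH:MM:SS or INFINITY).
exceeds_limit() {
    limit_seconds=$(hms_to_seconds "$2")
    [ "$limit_seconds" != "INFINITY" ] && [ "$1" -ge "$limit_seconds" ]
}

# Limits from "qconf -sq free64":
hms_to_seconds 72:00:00   # s_rt
hms_to_seconds 72:05:00   # h_rt

# Hypothetical wall-clock value for a task that ended on the bio queue:
wallclock=259501
if exceeds_limit "$wallclock" 72:05:00; then
    echo "task ran past free64's h_rt; consistent with the limit sticking"
fi
```

A task whose accounting record shows the bio queue but whose wall-clock time
sits right at the free64 h_rt value would support the theory above.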
Joseph
On 05/29/2015 05:12 AM, William Hay wrote:
On Thu, 28 May 2015 19:27:07 +0000
Joseph Farran <[email protected]> wrote:
Hi all.
I am not sure if this is a bug or the way Grid Engine works.
We have several queues our users submit jobs to. One of the queues,
"free64", has a 3-day wall-clock limit:
$ qconf -sq free64 | grep "_rt"
s_rt 72:00:00
h_rt 72:05:00
While the other queue, "bio", does not:
$ qconf -sq bio | grep "_rt"
s_rt INFINITY
h_rt INFINITY
When a user submits a job to both queues with "-q free64,bio", jobs that
run longer than 3 days are killed whether they land on the "free64" or
the "bio" queue. Why are jobs that land on the "bio" queue being
killed after 3 days?
Are you sure the whole job is in the bio queue? Might a slave task be
in the free64 queue?