On 2013-10-11 17:21, Sytchev, Ilya wrote:
On 9/12/13 10:35 AM, "Peter Cock" <p.j.a.c...@googlemail.com> wrote:
On Thu, Sep 12, 2013 at 2:01 PM, Mathieu Bahin <mathieu.ba...@irisa.fr> wrote:
Hi all,
We have been developing our own Galaxy instance for a while now. Jobs
are sent for execution to a cluster managed by SGE.
Usually, communication between SGE and DRMAA is fine and we don't have
any problems with it.
When a user deletes a job, most of the time the job disappears, but
sometimes, for reasons we don't understand, the job remains with the
status 'dr' in SGE. If we don't kill it manually, it stays forever. It
is not always the same tool that produces this error.
Do you have any idea why this happens or how to manage it?
I have noticed a problem with our DRMAA/SGE setup where a user can
cancel a large job (using the job splitter in at least some cases), but
Galaxy does not seem to cancel the jobs on the cluster. I've not tried
to diagnose this yet, but it could be a similar issue.
Also, in our DRMAA/LSF setup (using a fork of the latest galaxy-dist),
jobs generated by the current workflow step continue running on the
cluster after the history is deleted.
Ilya
Hi Ilya,
I also see this behaviour with DRMAA/GridEngine.
I think this has already been reported:
https://trello.com/c/1whC9did/245-currently-running-jobs-in-deleted-histories-should-be-killed
Please upvote it!
Best,
Nicola
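For reference, the manual cleanup mentioned earlier in the thread (killing jobs stuck in the 'dr' state) could be scripted roughly as below. This is only a sketch under several assumptions: that the default `qstat` listing is used, with the job ID in the first column and the state in the fifth; that only the exact state 'dr' matters; and that the account running it is allowed to use `qdel -f` on the site. The function names `find_stuck_jobs` and `force_delete` are made up for illustration and are not part of Galaxy or SGE.

```python
import subprocess


def find_stuck_jobs(qstat_output):
    """Return the IDs of jobs whose state column is exactly 'dr'.

    Assumes the default qstat layout:
    job-ID  prior  name  user  state  submit/start at  queue  slots
    """
    stuck = []
    for line in qstat_output.splitlines():
        fields = line.split()
        # Job lines start with a numeric job ID; this skips the header
        # line and the dashed separator below it.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        if fields[4] == "dr":
            stuck.append(fields[0])
    return stuck


def force_delete(job_ids, dry_run=True):
    """Print (dry run) or execute 'qdel -f <id>' for each job ID."""
    for job_id in job_ids:
        cmd = ["qdel", "-f", job_id]
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    # List jobs of all users; requires qstat on the PATH.
    listing = subprocess.run(
        ["qstat", "-u", "*"], capture_output=True, text=True
    ).stdout
    force_delete(find_stuck_jobs(listing))  # dry run by default
```

By default the script only prints the `qdel -f` commands it would run, so the list can be checked before anything is deleted. It does not address why the jobs get stuck in the first place.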