Re: [galaxy-dev] Stalled upload jobs under "Admin", "Manage jobs"
On Apr 26, 2012, at 11:03 AM, Hans-Rudolf Hotz wrote:

> On 04/26/2012 04:47 PM, Peter Cock wrote:
>> Sorry to nag - my zombie jobs are still there and I'd like a little
>> guidance about how to delete them (e.g. which tables and what
>> status should I change them to).
>
> Hi Peter
>
> We had one such job which kept showing up in the "Admin/Manage jobs" page.
>
> Have a look at the 'job' table. The 'state' is probably: "upload" for yours.
>
> change the state to "error" - this was the solution for us.
>
> As always, be very careful when you directly access the MySQL or PostgreSQL
> database.

Thanks Hans,

Indeed, it'd be:

    update job set state='error' where state='upload';

But be aware that this would take out any current uploads (if there are any), so it may be useful to limit it to jobs older than a certain date or id, e.g.:

    update job set state='error' where state='upload' and id<123456;

--nate

> Hope this helps
> Regards, Hans

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:

http://lists.bx.psu.edu/
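As a rough illustration of the id-limited update Nate describes, here is a minimal Python sketch that runs against an in-memory SQLite table rather than a real Galaxy database. Only the table name `job`, the `state` column and the values 'upload' and 'error' come from the thread. The two-column table, the sample ids, the helper name `mark_stalled_uploads` and the cutoff value are assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for Galaxy's 'job' table: only the two columns the
# thread mentions (id, state). The real table has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO job (id, state) VALUES (?, ?)",
    [(2338, "ok"), (2339, "upload"), (2352, "upload"), (2400, "upload")],
)
conn.commit()


def mark_stalled_uploads(conn, max_id):
    """Set state='error' for jobs stuck in 'upload' with id below max_id.

    Returns the ids that were changed so they can be reviewed afterwards.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        # Preview exactly which rows the UPDATE below will touch.
        stalled = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM job WHERE state = 'upload' AND id < ? ORDER BY id",
                (max_id,),
            )
        ]
        conn.execute(
            "UPDATE job SET state = 'error' WHERE state = 'upload' AND id < ?",
            (max_id,),
        )
    return stalled


# Job 2400 stands for an upload that is still in progress; the id cutoff
# (an assumed value for this example) leaves it untouched.
changed = mark_stalled_uploads(conn, 2353)
print(changed)
```

On a real MySQL or PostgreSQL server, the same SELECT can be run on its own first to review the affected ids before issuing the UPDATE. Taking a database backup beforehand is prudent.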
Re: [galaxy-dev] Stalled upload jobs under "Admin", "Manage jobs"
On 04/26/2012 04:47 PM, Peter Cock wrote:
> Sorry to nag - my zombie jobs are still there and I'd like a little
> guidance about how to delete them (e.g. which tables and what
> status should I change them to).

Hi Peter

We had one such job which kept showing up in the "Admin/Manage jobs" page.

Have a look at the 'job' table. The 'state' is probably: "upload" for yours.

change the state to "error" - this was the solution for us.

As always, be very careful when you directly access the MySQL or PostgreSQL database.

Hope this helps

Regards, Hans
Re: [galaxy-dev] Stalled upload jobs under "Admin", "Manage jobs"
On Fri, Mar 16, 2012 at 11:00 AM, Peter Cock wrote:
> Hi Nate,
>
> Sorry for the delay - I must have missed your reply.
>
> No, we're not using nginx here.
>
> What should I edit in the database? Presumably rather than deleting
> these jobs I should set the state to finished with error?
>
> (Is there any documentation about the Galaxy database schema,
> and the values of fields in it - or is that all considered to be an
> internal detail?)

Sorry to nag - my zombie jobs are still there and I'd like a little guidance about how to delete them (e.g. which tables and what status should I change them to).

Thanks,

Peter
Re: [galaxy-dev] Stalled upload jobs under "Admin", "Manage jobs"
On Mon, Feb 13, 2012 at 5:02 PM, Nate Coraor wrote:
> Hi Peter,
>
> Are you using the nginx upload module?
>
> There's no way to fix these from within Galaxy, unfortunately.
> You'll have to update them in the database.
>
> --nate

Hi Nate,

Sorry for the delay - I must have missed your reply.

No, we're not using nginx here.

What should I edit in the database? Presumably rather than deleting these jobs I should set the state to finished with error?

(Is there any documentation about the Galaxy database schema, and the values of fields in it - or is that all considered to be an internal detail?)

Thanks,

Peter
Re: [galaxy-dev] Stalled upload jobs under "Admin", "Manage jobs"
On Feb 10, 2012, at 6:47 AM, Peter Cock wrote:
> Using the "Stop jobs" option does not clear these dead upload jobs.
>
> Restarting the Galaxy server does not clear them either.
>
> Does anyone have any insight into what might be wrong, and how to get
> rid of these zombie tasks?

Hi Peter,

Are you using the nginx upload module?

There's no way to fix these from within Galaxy, unfortunately. You'll have to update them in the database.

--nate
[galaxy-dev] Stalled upload jobs under "Admin", "Manage jobs"
Hello all,

I've noticed we have about a dozen stalled upload jobs on our server from several users. e.g.

Job ID  User  Last Update   Tool     State   Command Line  Job Runner  PID/Cluster ID
2352          21 hours ago  upload1  upload  None          None        None
...
2339          19 hours ago  upload1  upload  None          None        None

The job numbers are consecutive (2339 to 2352) and reflect a problem for a couple of hours yesterday morning. I believe this was due to the underlying file system being unmounted (without restarting Galaxy), and at the time restarting Galaxy fixed uploading files. Test jobs since then have completed normally - but these zombie jobs remain.

Using the "Stop jobs" option does not clear these dead upload jobs.

Restarting the Galaxy server does not clear them either.

This is our production server and was running galaxy-dist, changeset 5743:720455407d1c - which I have now updated to the current release, 6621:26920e20157f - which makes no difference to these stalled jobs.

Does anyone have any insight into what might be wrong, and how to get rid of these zombie tasks?

Thanks,

Peter