Hi John,
Thanks for the reply. I will add the logging as soon as I can and let you know
what I find. I have tried updating Pulsar to the latest release, 0.8.0, and get
the same error, so let's see what the logging suggests.
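For reference, this is roughly the change I have in mind for job_status in
pulsar/managers/util/drmaa/__init__.py. It is only a sketch: the function name
job_status_with_logging and the FakeSession class are my own placeholders (not
Pulsar code), and the only real call I am relying on is
session.jobStatus(str(external_job_id)) from the traceback.

```python
import logging

log = logging.getLogger(__name__)


def job_status_with_logging(session, external_job_id):
    """Log the ID handed to DRMAA, then query the job status as before."""
    job_id = str(external_job_id)
    # %r quotes the value, so stray whitespace or an extra prefix shows up in the log.
    log.info("Fetching job status for %r", job_id)
    return session.jobStatus(job_id)


class FakeSession(object):
    """Hypothetical stand-in for the DRMAA session, only for trying this offline."""

    def jobStatus(self, job_id):
        return "status-for-" + job_id
```

The logged ID can then be compared with what the queuing software reports for
the same job.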
It does strike me as potentially some sort of Python issue, so you're right, an
upgrade of Galaxy itself to v16 may be in order. I am just hesitating as my
Galaxy server is quite ‘personalised’ and updating to v16 may require a lot of
other changes.
Cheers, and I will get back to you on this one asap,
Rich
> On 31 Oct 2016, at 20:24, John Chilton wrote:
>
> Glad this almost worked - I'm not sure what the problem is. I'd open
> the file /cluster/galaxy/pulsar/pulsar/managers/util/drmaa/__init__.py
> - and add some logging right before this line - (return
> self.session.jobStatus(str(external_job_id))).
>
> log.info("Fetching job status for %s" % external_job_id)
>
> or something like that. See if the ID matches something that was in
> your queuing software. It might have some extra prefix or something
> that we can strip off.
>
> It would also be interesting to try Pulsar 0.7.3 against Galaxy 16.07
> - this may be caused by a problem that has been fixed.
>
> -John
>
>
> On Thu, Oct 13, 2016 at 2:06 PM, Poole, Richard wrote:
>> Hey John,
>>
>> So I’ve been happily using Pulsar to send all my Galaxy server jobs to our
>> cluster here at UCL for several months now (I love it!). I am now exploring
>> the ‘run-as-real-user’ option for DRMAA submissions and have run into a
>> problem. The files are correctly staged, correctly chowned, successfully
>> submitted to the queue and the job runs. However, at job end something
>> (collection?) fails with the following error message in Pulsar:
>>
>> Exception happened during processing of request from ('*.*.*.*', 54321)
>> Traceback (most recent call last):
>>   File "/opt/rocks/lib/python2.6/site-packages/Paste-2.0.1-py2.6.egg/paste/httpserver.py", line 1072, in process_request_in_thread
>>     self.finish_request(request, client_address)
>>   File "/opt/rocks/lib/python2.6/SocketServer.py", line 322, in finish_request
>>     self.RequestHandlerClass(request, client_address, self)
>>   File "/opt/rocks/lib/python2.6/SocketServer.py", line 617, in __init__
>>     self.handle()
>>   File "/opt/rocks/lib/python2.6/site-packages/Paste-2.0.1-py2.6.egg/paste/httpserver.py", line 446, in handle
>>     BaseHTTPRequestHandler.handle(self)
>>   File "/opt/rocks/lib/python2.6/BaseHTTPServer.py", line 329, in handle
>>     self.handle_one_request()
>>   File "/opt/rocks/lib/python2.6/site-packages/Paste-2.0.1-py2.6.egg/paste/httpserver.py", line 441, in handle_one_request
>>     self.wsgi_execute()
>>   File "/opt/rocks/lib/python2.6/site-packages/Paste-2.0.1-py2.6.egg/paste/httpserver.py", line 291, in wsgi_execute
>>     self.wsgi_start_response)
>>   File "/cluster/galaxy/pulsar/pulsar/web/framework.py", line 39, in __call__
>>     return controller(environ, start_response, **request_args)
>>   File "/cluster/galaxy/pulsar/pulsar/web/framework.py", line 144, in controller_replacement
>>     result = self.__execute_request(func, args, req, environ)
>>   File "/cluster/galaxy/pulsar/pulsar/web/framework.py", line 124, in __execute_request
>>     result = func(**args)
>>   File "/cluster/galaxy/pulsar/pulsar/web/routes.py", line 82, in status
>>     return status_dict(manager, job_id)
>>   File "/cluster/galaxy/pulsar/pulsar/manager_endpoint_util.py", line 12, in status_dict
>>     job_status = manager.get_status(job_id)
>>   File "/cluster/galaxy/pulsar/pulsar/managers/stateful.py", line 95, in get_status
>>     proxy_status, state_change = self.__proxy_status(job_directory, job_id)
>>   File "/cluster/galaxy/pulsar/pulsar/managers/stateful.py", line 115, in __proxy_status
>>     proxy_status = self._proxied_manager.get_status(job_id)
>>   File "/cluster/galaxy/pulsar/pulsar/managers/queued_external_drmaa_original.py", line 62, in get_status
>>     external_status = super(ExternalDrmaaQueueManager, self)._get_status_external(external_id)
>>   File "/cluster/galaxy/pulsar/pulsar/managers/base/base_drmaa.py", line 31, in _get_status_external
>>     drmaa_state = self.drmaa_session.job_status(external_id)
>>   File "/cluster/galaxy/pulsar/pulsar/managers/util/drmaa/__init__.py", line 50, in job_status
>>     return self.session.jobStatus(str(external_job_id))
>>   File "build/bdist.linux-x86_64/egg/drmaa/session.py", line 518, in jobStatus
>>     c(drmaa_job_ps, jobId, byref(status))
>>   File "build/bdist.linux-x86_64/egg/drmaa/helpers.py", line 299, in c
>>     return f(*(args + (error_buffer, sizeof(error_buffer
>>   File "build/bdist.linux-x86_64/egg/drmaa/errors.py", line 151, in error_check
>>     raise _ERRORS[code - 1](error_string)
>> InvalidJobException: code 18: The job specified by the 'jobid' does not exist.
>>
>> With this corresponding error from my Galaxy server:
>>
>> galaxy.tools.actions INFO 2016-10-13 18:47:51