Hello all,

Our local Galaxy server had been running happily under SGE, using
one of the last free releases (not sure exactly which - I could ask).
Due to concerns about long term maintenance, the SysAdmin has
moved us to an SGE compatible setup - Univa Grid Engine (UGE).

However, in at least one respect this is not a drop-in replacement.
While other cluster usage appears to be working fine, our Galaxy
installation is not, e.g.

galaxy.jobs.runners.drmaa DEBUG 2013-01-15 17:14:33,660 (331:842) submitting file /mnt/galaxy/galaxy-central/database/pbs/galaxy_331:842.sh
galaxy.jobs.runners.drmaa DEBUG 2013-01-15 17:14:33,661 (331:842) command is: /mnt/galaxy/galaxy-central/extract_dataset_parts.sh /mnt/galaxy/galaxy-central/database/job_working_directory/000/331/task_0; blastp -query "/mnt/galaxy/galaxy-central/database/job_working_directory/000/331/task_0/dataset_344.dat" -db "/var/local/blast/ncbi/nr" -task blastp -evalue 0.001 -out /mnt/galaxy/galaxy-central/database/job_working_directory/000/331/task_0/dataset_373.dat -outfmt 5 -num_threads 8
galaxy.jobs.runners.drmaa ERROR 2013-01-15 17:14:33,666 Uncaught exception queueing job
Traceback (most recent call last):
  File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 146, in run_next
    self.queue_job( obj )
  File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 234, in queue_job
    job_id = self.ds.runJob(jt)
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py", line 331, in runJob
    _h.c(_w.drmaa_run_job, jid, _ct.sizeof(jid), jobTemplate)
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py", line 213, in c
    return f(*(args + (error_buffer, sizeof(error_buffer))))
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py", line 90, in error_check
    raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))
DeniedByDrmException: code 17: error: no suitable queues

I tried debugging this by attempting a manual submission:

$ qsub /mnt/galaxy/galaxy-central/database/pbs/galaxy_331:842.sh
Unable to run job: Colon (':') not allowed in objectname.
Exiting.

Renaming the file to replace the colon with (say) an underscore allows
a manual qsub to work fine with UGE. I've edited Galaxy to avoid the
colons (patch below) but the submission still fails.

Additionally removing the SGE specific settings in universe_wsgi.ini did
allow the job to be submitted, but I am still having problems. Perhaps I
need to fix all the other filenames too (e.g. stdout, stderr, error code),
or do that in one go by removing the colon in the job name?
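
To illustrate the "one go" idea: sanitise the id tag once and build every
filename from the cleaned value. This is only a rough sketch, not tested
against Galaxy or UGE. The names safe_id_tag and cluster_file_paths are
hypothetical helpers (they do not exist in Galaxy), and the directory
arguments are placeholders; only the filename patterns are taken from
drmaa.py.

```python
import os
import re


def safe_id_tag(id_tag):
    """Replace every character other than letters, digits and '_' with '_'.

    Hypothetical helper, not part of Galaxy.
    """
    return re.sub(r"[^A-Za-z0-9_]", "_", str(id_tag))


def cluster_file_paths(cluster_files_directory, working_directory, id_tag):
    """Build the job script and DRM output paths from one sanitised tag.

    Hypothetical helper: it mirrors the filename patterns seen in
    drmaa.py (galaxy_<tag>.sh, <tag>.drmout, <tag>.drmerr, <tag>.drmec)
    so that no generated filename contains a colon.
    """
    tag = safe_id_tag(id_tag)
    base = os.path.join(working_directory, tag)
    return {
        "job_file": "%s/galaxy_%s.sh" % (cluster_files_directory, tag),
        "ofile": "%s.drmout" % base,
        "efile": "%s.drmerr" % base,
        "ecfile": "%s.drmec" % base,
    }


if __name__ == "__main__":
    # Placeholder directories, chosen only for the demonstration.
    paths = cluster_file_paths("/tmp/pbs", "/tmp/work", "331:842")
    for name in sorted(paths):
        print("%s: %s" % (name, paths[name]))
```

Building all four paths from the same cleaned tag would keep the script
name and the stdout, stderr and exit code files consistent with each other,
rather than patching each string separately.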

Has anyone else tried Galaxy under UGE, and do you have any advice?

Thanks,

Peter

-- 

Quick filename hack to avoid colons in job script filenames - might
be better to avoid this in the job name itself?

$ hg diff
diff -r 1bfe2768026a lib/galaxy/jobs/runners/drmaa.py
--- a/lib/galaxy/jobs/runners/drmaa.py  Mon Jan 14 17:21:25 2013 +0000
+++ b/lib/galaxy/jobs/runners/drmaa.py  Tue Jan 15 18:44:31 2013 +0000
@@ -191,7 +191,7 @@
         job_name = ''.join( map( lambda x: x if x in ( string.letters + string.digits + '_' ) else '_', job_name ) )

         jt = self.ds.createJobTemplate()
-        jt.remoteCommand = "%s/galaxy_%s.sh" % (self.app.config.cluster_files_directory, job_wrapper.get_id_tag())
+        jt.remoteCommand = ("%s/galaxy_%s.sh" % (self.app.config.cluster_files_directory, job_wrapper.get_id_tag())).replace(":", "_")
         jt.jobName = job_name
         jt.outputPath = ":%s" % ofile
         jt.errorPath = ":%s" % efile
@@ -229,6 +229,7 @@

         log.debug("(%s) submitting file %s" % ( galaxy_id_tag, jt.remoteCommand ) )
         log.debug("(%s) command is: %s" % ( galaxy_id_tag, command_line ) )
+        log.debug("(%s) spec: %s" % ( galaxy_id_tag, native_spec))
         # runJob will raise if there's a submit problem
         if self.external_runJob_script is None:
             job_id = self.ds.runJob(jt)
@@ -423,7 +424,7 @@
         drm_job_state.ofile = "%s.drmout" % os.path.join(os.getcwd(), job_wrapper.working_directory, job_wrapper.get_id_tag())
         drm_job_state.efile = "%s.drmerr" % os.path.join(os.getcwd(), job_wrapper.working_directory, job_wrapper.get_id_tag())
         drm_job_state.ecfile = "%s.drmec" % os.path.join(os.getcwd(), job_wrapper.working_directory, job_wrapper.get_id_tag())
-        drm_job_state.job_file = "%s/galaxy_%s.sh" % (self.app.config.cluster_files_directory, job.get_id())
+        drm_job_state.job_file = ("%s/galaxy_%s.sh" % (self.app.config.cluster_files_directory, job.get_id())).replace(":", "_")
         drm_job_state.job_id = str( job_id )
         drm_job_state.runner_url = job_wrapper.get_job_runner_url()
         job_wrapper.command_line = job.get_command_line()