---------- Forwarded message ----------
From: Wenkai Wang (Kevin) <wenkai_w...@shbiochip.com>
Date: Thu, May 11, 2017 at 8:01 PM
Subject: Galaxy Error while Using PBS


I am an engineer at a National Research and Engineering Center in Shanghai.
I have recently been trying out the outstanding Galaxy system developed by
your team. Galaxy is great; however, job submission fails when Galaxy is
connected to Torque PBS, and I would like to ask for your help and advice.

When I ran a simple tool that outputs "Hello Galaxy" from the web
interface, the page showed "Unable to run this job due to a cluster error,
please retry it later" (see line 321 of
galaxy/lib/galaxy/jobs/runners/pbs.py). Meanwhile, the Galaxy process
printed the following to STDOUT in the Linux shell:

galaxy.jobs.runners.pbs DEBUG 2017-05-12 11:41:47,459 (25) submitting file /data3/wangwk/app/galaxy/database/pbs/25.sh
galaxy.jobs.runners.pbs WARNING 2017-05-12 11:41:47,460 (25) pbs_submit failed (try 1/5), PBS error 15033: No free connections
......
galaxy.jobs.runners.pbs WARNING 2017-05-12 11:41:55,470 (25) pbs_submit failed (try 5/5), PBS error 15033: No free connections
galaxy.jobs.runners.pbs ERROR 2017-05-12 11:41:57,473 (25) All attempts to submit job failed
galaxy.model.metadata DEBUG 2017-05-12 11:41:57,574 Cleaning up external metadata files
galaxy.model.metadata DEBUG 2017-05-12 11:41:57,604 Failed to cleanup MetadataTempFile temp files from /data3/wangwk/app/galaxy/database/jobs_directory/000/25/metadata_out_HistoryDatasetAssociation_26_wrVwaf: No JSON object could be decoded
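
As a triage aid, the submission warnings above can be summarized with a
short script. This is only a sketch: the regular expression and the names
SUBMIT_FAILURE and parse_submit_failures are illustrative assumptions based
on the log lines quoted in this message, not part of Galaxy.

```python
import re

# Matches the "pbs_submit failed" warnings quoted above, after any
# mail-wrapped lines have been re-joined into single lines.
SUBMIT_FAILURE = re.compile(
    r"\((?P<job_id>\d+)\) pbs_submit failed "
    r"\(try (?P<attempt>\d+)/(?P<max_tries>\d+)\), "
    r"PBS error (?P<code>\d+): (?P<message>.+)$"
)


def parse_submit_failures(lines):
    """Return one dict per pbs_submit failure line; other lines are ignored."""
    failures = []
    for line in lines:
        match = SUBMIT_FAILURE.search(line.strip())
        if match:
            record = match.groupdict()
            # Convert the numeric fields so they can be compared and counted.
            for key in ("job_id", "attempt", "max_tries", "code"):
                record[key] = int(record[key])
            failures.append(record)
    return failures


sample = [
    "galaxy.jobs.runners.pbs WARNING 2017-05-12 11:41:47,460 (25) pbs_submit "
    "failed (try 1/5), PBS error 15033: No free connections",
    "galaxy.jobs.runners.pbs WARNING 2017-05-12 11:41:55,470 (25) pbs_submit "
    "failed (try 5/5), PBS error 15033: No free connections",
    "galaxy.jobs.runners.pbs ERROR 2017-05-12 11:41:57,473 (25) All attempts "
    "to submit job failed",
]

failures = parse_submit_failures(sample)
codes = sorted({f["code"] for f in failures})
print(len(failures), codes)
```

Every retry here reports the same PBS error code, so the failure does not
look intermittent.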

Further information: (1) Linux version: RHEL 6.2, 2.6.32-220.el6.x86_64;
(2) Torque PBS 4.2; (3) pbs-python: 4.4.1.2 (pbs-python 4.4.2.1 requires
CentOS 7 with PBS Torque 5, and thus was not chosen); (4) glibc 2.12; (5)
the munge service has been installed, configured, and started on the master
node and the compute nodes of the cluster; (6) the job_conf.xml file is
attached to this email.

I also tried sara_nodes.py from the pbs_python wiki
(https://oss.trac.surfsara.nl/pbs_python/wiki/TorqueExamples). It fails with
the following traceback:

Traceback (most recent call last):
  File "./sara_nodes.py", line 644, in <module>
    print_overview_normal(args.nodes)
  File "./sara_nodes.py", line 298, in print_overview_normal
    matched, rest = print_get_nodes(hosts)
  File "./sara_nodes.py", line 199, in print_get_nodes
    pbsq         = PBSQuery.PBSQuery()
  File "/home/wangwk/app/anaconda2/lib/python2.7/site-packages/pbs/PBSQuery.py", line 137, in __init__
    self.job_server_id = list(self.get_serverinfo())[0]
IndexError: list index out of range

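The IndexError means that self.get_serverinfo() returned an empty result,
so the list built from it has no elements and indexing it with [0] fails.
In other words, pbs_python apparently could not retrieve any information
from the PBS server. This might be useful for the analysis.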
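
The failure mode can be reproduced without a cluster. The sketch below is
an illustration only: first_server_id is a hypothetical helper, and the
dict passed to it stands in for whatever get_serverinfo() returns (its
exact type in pbs_python is an assumption here).

```python
def first_server_id(server_info):
    """Return the first key of server_info, or raise a descriptive error.

    server_info is a stand-in for the result of get_serverinfo();
    treating it as a mapping is an assumption made for this sketch.
    """
    server_ids = list(server_info)
    if not server_ids:
        # Fail with a clearer message than a bare IndexError.
        raise RuntimeError(
            "PBS server returned no information; "
            "check the connection to pbs_server"
        )
    return server_ids[0]


# Indexing the list built from an empty result raises the same error
# as in the traceback above.
try:
    list({})[0]
except IndexError as exc:
    reproduced = str(exc)

print(reproduced)
```

A guard like this would not fix the underlying problem, but it would report
the missing server information directly instead of an IndexError.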

I would deeply appreciate any guidance on how to further track down or
resolve this problem.

Thanks a lot!


Best Wishes,
Wenkai (Kevin)

------------------------------
Wenkai Wang, National Research and Engineering Center for Biochip
Email: wenkai_w...@shbiochip.com
Email: ww...@cantab.net
<?xml version="1.0"?>
<!-- Kevin: trying to use PBS. For long_jobs, ppn=8 has been changed to ppn=1 while debugging. -->
<!-- Kevin: (a) added Resource_List.walltime for pbs_default; -->
<job_conf>
    <plugins>
        <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner"/>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <!--previously: destinations default="pbs_default"-->
    <destinations default="blast">
        <destination id="blast" runner="pbs">
            <param id="-q">blast</param>
            <param id="Resource_List">walltime=72:00:00,nodes=1:ppn=1</param>
        </destination>
        <destination id="pbs_default" runner="pbs">
            <param id="-q">blast</param>
            <param id="Resource_List">walltime=72:00:00,nodes=1:ppn=1</param>
        </destination>
        <destination id="other_cluster" runner="pbs">
            <param id="-q">blast</param>
            <param id="destination">@other.cluster</param>
        </destination>
        <destination id="long_jobs" runner="pbs">
            <param id="-q">blast</param>
            <param id="Resource_List">walltime=72:00:00,nodes=1:ppn=1</param>
            <param id="-p">128</param>
        </destination>
    </destinations>
</job_conf>
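
To rule out a simple cross-reference mistake in the attached file, a check
along the following lines can be run. It is a minimal sketch using only the
Python standard library: check_job_conf is a hypothetical helper, JOB_CONF
is a shortened copy of the attachment, and the check covers only the
internal consistency of the XML, not the connection to PBS.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the attached job_conf.xml (one destination only).
JOB_CONF = """
<job_conf>
    <plugins>
        <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner"/>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <destinations default="blast">
        <destination id="blast" runner="pbs">
            <param id="-q">blast</param>
            <param id="Resource_List">walltime=72:00:00,nodes=1:ppn=1</param>
        </destination>
    </destinations>
</job_conf>
"""


def check_job_conf(xml_text):
    """Return a list of cross-reference problems found in a job_conf document."""
    root = ET.fromstring(xml_text)
    runner_ids = {plugin.get("id") for plugin in root.findall("./plugins/plugin")}
    destinations = root.find("destinations")
    dest_elements = destinations.findall("destination")
    dest_ids = {dest.get("id") for dest in dest_elements}

    problems = []
    default = destinations.get("default")
    if default not in dest_ids:
        problems.append("default destination %r is not defined" % default)
    for dest in dest_elements:
        if dest.get("runner") not in runner_ids:
            problems.append(
                "destination %r uses unknown runner %r"
                % (dest.get("id"), dest.get("runner"))
            )
    return problems


problems = check_job_conf(JOB_CONF)
print(problems)
```

An empty list means the default destination exists and every destination
refers to a declared runner plugin.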