Re: [galaxy-dev] set_environment_for_install problem, seeking for ideas
Hi John,

Perl complicates things, TPP complicates things greatly.

So true, so true ...

Bjoern, can I ask you if this hypothetical exhibits the same problem and can be used to reason about these things more easily and drive a test implementation?

Yes to both questions :)

So right now, Galaxy has setup_virtualenv, which will build and install Python packages in a virtual environment. However, some Python packages have C library dependencies that can prevent them from being installed. As a specific example, take PyTables (installed via pip install tables), which is a package for managing hierarchical datasets. If you try to install this with pip the way Galaxy will, it will fail if you do not have libhdf5 installed. So at a high level, it would be nice if the tool shed had a libhdf5 definition and the dependencies file had some mechanism for declaring that libhdf5 should be installed before a setup_virtualenv containing tables, with its environment configured properly so the pip install succeeds (maybe just LD_LIBRARY_PATH needs to be set).

Indeed, same problem. I think we have this problem in every high-level install method, because set_environment_for_install is not allowed as the first action tag. Can you think of any case where ENV vars can conflict with each other, besides set_to, assuming that we source every env.sh file by default for every specified package?

Cheers,
Bjoern

-John

On Tue, Nov 5, 2013 at 3:35 PM, Björn Grüning bjoern.gruen...@pharmazie.uni-freiburg.de wrote:

Hi Greg,

Hello Bjoern,

On Nov 5, 2013, at 12:13 PM, Bjoern Gruening bjoern.gruen...@gmail.com wrote:

Hi Greg,

I'm currently implementing a setup_perl_environment action and stumbled over a tricky problem (one that is not only related to Perl, but also affects Ruby, Python and R).

The problem: let's assume a Perl package (A) requires an XML parser written in C/C++ (Z).
(Z) is a dependency that I can import, but as far as I can see there is no way to call set_environment_for_install before setup_perl_environment, because setup_perl_environment defines an installation type.

The above is fairly difficult to understand - can you provide an actual xml recipe that provides some context?

Attached, please see a detailed explanation at the bottom. I would like to discuss that issue to get a few ideas. I can think of these solutions:

- hackish solution: I can call install_environment.add_env_shell_file_paths( action_dict[ 'env_shell_file_paths' ] ) inside of the setup_*_environment path and remove it from the action type afterwards

Again, it's difficult to provide good feedback on the above approach without an example recipe. However, your "hackish solution" term probably means it is not ideal. ;)

:)

- import all env.sh variables from every (package) definition, regardless of whether set_environment_for_install is set or not.

I don't think the above approach would be ideal. It seems that it could fairly easily create conflicting environment variables within a single installation, so the latest value for an environment variable may not be what is expected.

What do you mean by conflicting ENV vars? I can only imagine multiple set_to actions that overwrite each other; append_to and prepend_to should be safe, or?

I must admit, I do not understand why set_environment_for_install is actually needed. I think we can assume that if I specify

<package name="R_3_0_1" version="3.0.1">
    <repository name="package_r_3_0_1" owner="iuc" prior_installation_required="True" />
</package>

I want the ENV vars sourced.

Hmmm…so you are saying that you want to be able to define the above package tag set inside of an actions tag set and have everything work?

Oh no, I mean just have it as a package tag like it is, but source the env.sh file of every other package set automatically. So you do not need set_environment_for_install.
In the attached example:

<package name="expat" version="2.1.0">
    <repository changeset_revision="8fc96166cddd" name="package_expat_2_1" owner="iuc" prior_installation_required="True" toolshed="http://testtoolshed.g2.bx.psu.edu" />
</package>

is not imported with set_environment_for_install, so it is actually useless (now). But the env.sh needs to be sourced during the setup_perl_environment part.

I think this may cause problems, because I believe the set_environment_for_install tag set restricts activity to only the time when a dependent repository will be using the defined environment from the required repository in order to compile one or more of its dependencies. Eliminating this restriction may cause problems after compilation, although I cannot state this as a definite fact.

Furthermore, that can solve another issue: namely, the need for ENV vars from a package definition in the same file. Let's imagine package P has dependency D and you want to download
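To make the libhdf5/PyTables example from this thread concrete, here is a minimal sketch of the environment change a libhdf5 package's env.sh would have to provide before pip install tables can succeed. The helper name and the HDF5_DIR variable are illustrative assumptions; only the LD_LIBRARY_PATH part is mentioned in the thread.

```python
import os

def build_install_env(lib_dir, base_env):
    """Return a copy of base_env with lib_dir (e.g. a tool shed libhdf5
    install) prepended to LD_LIBRARY_PATH, so a following
    `pip install tables` inside the virtualenv can find the C library.
    Also setting HDF5_DIR is an assumption about the PyTables build."""
    env = dict(base_env)
    env['LD_LIBRARY_PATH'] = lib_dir + os.pathsep + env.get('LD_LIBRARY_PATH', '')
    env['HDF5_DIR'] = lib_dir
    return env
```

The dict returned here would be passed as the env of the subprocess that runs pip inside the virtualenv.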
Re: [galaxy-dev] Supporting file sets for running a tool with multiple input files
Hi Dannon,

Thanks for your reply. I've found a workaround using the method Binary.register_sniffable_binary_format(). I discovered this workaround in a previous thread by John Chilton. Attached is the complete solution, just for the record.

Regards,
Pieter.

From: Dannon Baker [mailto:dannon.ba...@gmail.com]
Sent: Monday 4 November 2013 13:42
To: Lukasse, Pieter
Cc: galaxy-...@bx.psu.edu
Subject: Re: [galaxy-dev] Supporting file sets for running a tool with multiple input files

Hi Pieter,

We've worked out what we think is the right way to solve this for Galaxy and expect work to start soon. See the trello card (https://trello.com/c/325AXIEr/613-tools-dataset-collections) for more details. For your particular tool, the first workaround that comes to mind would be adding a new datatype, say ZippedInputFiles, in your toolshed repository that gets included and used by users, though I haven't actually tried that. That said, I'd probably wait; this feature is high on our list of things to do next.

-Dannon

On Mon, Nov 4, 2013 at 5:44 AM, Lukasse, Pieter pieter.luka...@wur.nl wrote:

Hi,

Is there any news regarding support for the following scenario in Galaxy:

- User has N files which he would like to process with a Galaxy tool using the same parameters
- User uploads a (.tar or .zip?) file to Galaxy and selects this as the input file for the tool
- Tool produces an output .zip file with the N result files

I know Galaxy-P had a workaround for this some time ago. But has this been solved in the main Galaxy code base? Or are there any feasible workarounds that I can add to my Toolshed package to ensure my .zip file does not get unzipped at upload (the default Galaxy behaviour)?
Thanks and regards,

Pieter Lukasse
Wageningen UR, Plant Research International
Departments of Bioscience and Bioinformatics
Wageningen Campus, Building 107, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
+31-317480891; skype: pieter.lukasse.wur
http://www.pri.wur.nl/

Attachment: prims_masscomb_datatypes.py

<?xml version="1.0"?>
<datatypes>
    <datatype_files>
        <datatype_file name="prims_masscomb_datatypes.py"/>
    </datatype_files>
    <registration display_path="display_applications">
        <datatype extension="prims.fileset.zip" type="galaxy.datatypes.prims_masscomb_datatypes:FileSet" display_in_upload="true"/>
    </registration>
</datatypes>

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
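For the record, the core of the register_sniffable_binary_format() workaround is a sniffer that recognizes the zip magic bytes so upload leaves the archive alone. A Galaxy-independent sketch of just that check (function name hypothetical):

```python
def sniff_zip(filename):
    """Return True when the file begins with the zip local-file-header
    magic bytes (PK\\x03\\x04) -- the check a FileSet datatype's sniffer
    would perform to claim a .zip upload before Galaxy unpacks it."""
    with open(filename, 'rb') as handle:
        return handle.read(4) == b'PK\x03\x04'
```

In the real datatype this check would live in the class registered for the prims.fileset.zip extension.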
Re: [galaxy-dev] Galaxy dropping jobs?
Hi again,

The loop (as explained below) did the job :)

Nikolay

Thank you very much, Nate,

1. I have put in a fix: a loop running the JobStatus check 5 times, every 60 secs, and only then throwing an exception like the one below. It turned out that all connect failures happen at the same time - at slurm log rotation time at 3 am. Hopefully it helps :)

2. Our slurm conf keeps the info about each job for 5 min. But looking at the code, it seems that in the case you describe below, there will be an "Invalid job" exception leading to a "Job finished" state. Am I wrong? Anyway, I'll let you know if the loop does the job.

Thanks again,
Nikolay

On 2013-11-04 15:57, Nate Coraor wrote:

Hi Nikolay,

With slurm, the following change that I backed out should fix the problem: https://bitbucket.org/galaxy/galaxy-central/diff/lib/galaxy/jobs/runners/drmaa.py?diff2=d46b64f12c52&at=default

Although I do believe that if Galaxy doesn't read the completion state before slurm forgets about the job (MinJobAge in slurm.conf), this change could result in the job becoming permanently stuck in the running state. I should have some enhancements to the DRMAA runner for slurm coming soon that would prevent this.
--nate

On Oct 31, 2013, at 5:27 AM, Nikolai Vazov wrote:

Hi,

I discovered a weird issue in the job behaviour: Galaxy is running a long job on a cluster (more than 24h); about 15 hours later it misses the connection with SLURM on the cluster and throws the following message:

[root@galaxy-prod01 galaxy-dist]# grep 3715200 paster.log
galaxy.jobs.runners.drmaa INFO 2013-10-30 10:51:54,149 (555) queued as 3715200
galaxy.jobs.runners.drmaa DEBUG 2013-10-30 10:51:55,149 (555/3715200) state change: job is queued and active
galaxy.jobs.runners.drmaa DEBUG 2013-10-30 10:52:13,516 (555/3715200) state change: job is running
galaxy.jobs.runners.drmaa INFO 2013-10-31 03:29:33,090 (555/3715200) job left DRM queue with following message: code 1: slurm_load_jobs error: Unable to contact slurm controller (connect failure), job_id: 3715200

Is there a timeout in Galaxy for contacting slurm? Yet the job is still running properly on the cluster ...

Thanks for help, it's really urgent :)

Nikolay

--
Nikolay Vazov, PhD
Research Computing Centre - http://hpc.uio.no
USIT, University of Oslo
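The retry loop Nikolay describes can be sketched as follows (names hypothetical; the actual fix lives inside Galaxy's DRMAA runner): retry the flaky status call a few times with a pause before finally raising, so a transient "connect failure" during slurm log rotation doesn't kill the job watcher.

```python
import time

def check_status_with_retries(get_status, retries=5, delay=60):
    """Call get_status() up to `retries` times, sleeping `delay` seconds
    between attempts, and only re-raise the error once every attempt has
    failed -- riding out transient slurm controller connect failures."""
    last_exc = None
    for attempt in range(retries):
        try:
            return get_status()
        except Exception as exc:  # e.g. a DRMAA communication error
            last_exc = exc
            if attempt < retries - 1:
                time.sleep(delay)
    raise last_exc
```

With retries=5 and delay=60 this tolerates outages of a few minutes, such as the 3 am log-rotation window described above.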
[galaxy-dev] Dynamic tool configuration
Hi all,

We are working on a Galaxy tool suite for data analysis. We use a sqlite db to keep result data centralised between the different tools. At one point the configuration options of a tool should depend on the rows within a table of the sqlite db that is the output of the previous step. In other words, we would like to be able to set selectable parameters based on an underlying sql statement. If sql is not possible, an alternative would be to output the table content into a txt file and subsequently parse the txt file, instead of the sqlite db, within the xml configuration file.

When looking through the galaxy wiki and mailing lists I came across the <code> tag, which would be ideal - we could run a python script in the background to fetch data from the sqlite table - however that function is deprecated. Does anybody know of other ways to achieve this?

Thanks!
Jeroen

Ir. Jeroen Crappé, PhD Student
Lab of Bioinformatics and Computational Genomics (Biobix)
FBW - Ghent University
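A sketch of the data-fetching half of the second alternative described above (names hypothetical): a small helper the previous tool could run to pull rows out of the sqlite results table and emit them in the (name, value, selected) shape Galaxy's dynamic select lists use. Whether a select parameter can consume this without the deprecated code tag is exactly the open question here, so treat this as the query side only.

```python
import sqlite3

def fetch_select_options(db_path, query):
    """Run a two-column (label, value) SELECT against the results db and
    return (label, value, selected) tuples for a select parameter."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(query).fetchall()
    finally:
        conn.close()
    return [(str(label), str(value), False) for label, value in rows]
```

Writing these tuples out tab-separated gives the txt file mentioned above as the fallback.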
Re: [galaxy-dev] latest galaxy-central version
Carl,

Thanks for building a great tool. Can you fix the library performance problem?

Steps to reproduce:
1. http://su2c-dev.ucsc.edu/library
2. open library TCGA Pancan. It takes more than 20 seconds!

-Robert

On Nov 5, 2013, at 1:29 PM, Carl Eberhard carlfeberh...@gmail.com wrote:

Thanks for the thorough report, Robert. I've added some sanity checking to the panel in default:11240:c271cdb443c6. My only guess as to why the panel javascript is trying to load at all in the library section is a link to the controller/datasets methods (unhide, delete, etc.) that still assume histories are in their own frames. Still investigating.

On Tue, Nov 5, 2013 at 12:06 PM, Robert Baertsch baert...@soe.ucsc.edu wrote:

Carl,

Just to make sure, I deleted my tree, did a fresh checkout and tried a third time. It happened all three times. I'm using a postgres database, and this time I started with fresh config files and just changed the settings below. I didn't see the javascript error in Chrome; perhaps it is some strangeness in Firebug. I've seen strange errors before in Firebug. Attached are two screen grabs of the latest try, before and after a browser refresh.

changeset: 11227:151b7d3b2f1b
branch: stable
tag: tip
parent: 11219:5c789ab4144a
user: Carl Eberhard carlfeberh...@gmail.com
date: Tue Nov 05 11:33:06 2013 -0500
summary: History panel: fix naming bug in purge async error handling

-Robert

29c29
< port = 8585
---
> #port = 8080
34c34
< host = 0.0.0.0
---
> #host = 127.0.0.1
96d95
< database_connection = postgresql://localhost/medbookgalaxycentral3
542c541
< allow_library_path_paste = True
---
> #allow_library_path_paste = False
599c598
< admin_users = baert...@soe.ucsc.edu
---
> #admin_users = None

Screen Shot 2013-11-05 at 8.35.09 AM.png
Screen Shot 2013-11-05 at 8.37.04 AM.png

On Nov 5, 2013, at 8:21 AM, Carl Eberhard carlfeberh...@gmail.com wrote:

Hello, Robert. I'm having difficulty reproducing this with a fresh install of galaxy_central:default:11226:c67b9518c1e0.
Is this an intermittent error, or does it happen reliably each time with the steps above? Is it still the same javascript error you mentioned above? I'll investigate further.

On Mon, Nov 4, 2013 at 7:29 PM, Robert Baertsch baert...@soe.ucsc.edu wrote:

I updated to the stable release and reproduced the issue.

Steps to reproduce:
1. go to admin
2. Manage data libraries
3. add dataset
4. select "Upload files from filesystem paths"
5. paste the full path to any bam file
6. leave the defaults: auto-detect and copy files into galaxy
7. select a role to restrict access
8. click upload to start

The screen shows strange formatting and "Job is running" for the bam file.

python /data/galaxy-central/tools/data_source/upload.py /data/galaxy-central /data/galaxy-central/database/tmp/tmpJoasJl /data/galaxy-central/database/tmp/tmpzZO8t1 8877:/data/galaxy-central/database/job_working_directory/004/4548/dataset_8877_files:/data/galaxy-central/database/files/008/dataset_8877.dat

If I do a firefox refresh and go back to the library, the formatting is normal. I'm assuming the fix is to just render the page before waiting for the job to complete.

-Robert

On Nov 4, 2013, at 12:45 PM, Martin Čech mar...@bx.psu.edu wrote:

Hello,

I have also seen some of these errors while developing libraries. The library code is not in central; however, it might be related to recent changes to the history panel. Carl Eberhard might know more; adding him to the conversation.

--Marten

On Mon, Nov 4, 2013 at 2:45 PM, Robert Baertsch baert...@soe.ucsc.edu wrote:

It keeps doing posts, and I'm not seeing any new errors.

POST http://su2c-dev.ucsc.edu:8383/library_common/library_item_updates 200 OK 121ms

When I did a browser refresh, I got the following javascript error (I am logged in):

Galaxy.currUser is undefined on line 631 in history-panel.js

When I opened the data library where the bam file was copying, everything rendered OK. It seems the browser refresh fixed things.
-Robert

On Nov 4, 2013, at 11:14 AM, James Taylor ja...@jamestaylor.org wrote:

Robert, I'm not sure what is going on here, other than that the javascript that converts buttons into dropdown menus has not fired. Are there any javascript errors? Marten is working on rewriting libraries, and we will be eliminating the progressive loading popupmenus for something much more efficient, but this also might indicate a bug, so let us know if there is anything odd in the console.

--
James Taylor, Associate Professor, Biology/CS, Emory University

On Mon, Nov 4, 2013 at 1:58 PM, Robert Baertsch baert...@soe.ucsc.edu wrote:

Hi James,

I just pulled in the latest code to see how you changed from iframe to divs. Very exciting update. I tried importing a bam file into the library using
Re: [galaxy-dev] set_environment_for_install problem, seeking for ideas
Björn,

We're thinking that the following approach makes the most sense:

<action type="setup_perl_environment">  <!-- or setup_r_environment, setup_ruby_environment, setup_virtualenv -->
    <repository changeset_revision="978287122b91" name="package_perl_5_18" owner="iuc" toolshed="http://testtoolshed.g2.bx.psu.edu">
        <package name="perl" version="5.18.1" />
    </repository>
    <repository changeset_revision="8fc96166cddd" name="package_expat_2_1" owner="iuc" toolshed="http://testtoolshed.g2.bx.psu.edu">
        <package name="expat" version="2.1" />
    </repository>
</action>

For all repository tag sets contained within these setup_* tags, the repository's env.sh would be pulled in for the setup of the specified environment without requiring a set_environment_for_install action type. Would this work for your use cases? If so, can you confirm that this should be done for all four currently supported setup_* action types? Based on your response, Greg or I will implement this as soon as possible.

--Dave B.

On 11/06/2013 03:05 AM, Björn Grüning wrote:

Hi John,

Perl complicates things, TPP complicates things greatly.

So true, so true ...

Bjoern, can I ask you if this hypothetical exhibits the same problem and can be used to reason about these things more easily and drive a test implementation?

Yes to both questions :)

So right now, Galaxy has setup_virtualenv, which will build and install Python packages in a virtual environment. However, some Python packages have C library dependencies that can prevent them from being installed. As a specific example, take PyTables (installed via pip install tables), which is a package for managing hierarchical datasets. If you try to install this with pip the way Galaxy will, it will fail if you do not have libhdf5 installed.
So at a high level, it would be nice if the tool shed had a libhdf5 definition and the dependencies file had some mechanism for declaring that libhdf5 should be installed before a setup_virtualenv containing tables, with its environment configured properly so the pip install succeeds (maybe just LD_LIBRARY_PATH needs to be set).

Indeed, same problem. I think we have this problem in every high-level install method, because set_environment_for_install is not allowed as the first action tag. Can you think of any case where ENV vars can conflict with each other, besides set_to, assuming that we source every env.sh file by default for every specified package?

Cheers,
Bjoern

-John

On Tue, Nov 5, 2013 at 3:35 PM, Björn Grüning bjoern.gruen...@pharmazie.uni-freiburg.de wrote:

Hi Greg,

Hello Bjoern,

On Nov 5, 2013, at 12:13 PM, Bjoern Gruening bjoern.gruen...@gmail.com wrote:

Hi Greg,

I'm currently implementing a setup_perl_environment action and stumbled over a tricky problem (one that is not only related to Perl, but also affects Ruby, Python and R).

The problem: let's assume a Perl package (A) requires an XML parser written in C/C++ (Z). (Z) is a dependency that I can import, but as far as I can see there is no way to call set_environment_for_install before setup_perl_environment, because setup_perl_environment defines an installation type.

The above is fairly difficult to understand - can you provide an actual xml recipe that provides some context?

Attached, please see a detailed explanation at the bottom. I would like to discuss that issue to get a few ideas. I can think of these solutions:

- hackish solution: I can call install_environment.add_env_shell_file_paths( action_dict[ 'env_shell_file_paths' ] ) inside of the setup_*_environment path and remove it from the action type afterwards

Again, it's difficult to provide good feedback on the above approach without an example recipe. However, your "hackish solution" term probably means it is not ideal.
;)

:)

- import all env.sh variables from every (package) definition, regardless of whether set_environment_for_install is set or not.

I don't think the above approach would be ideal. It seems that it could fairly easily create conflicting environment variables within a single installation, so the latest value for an environment variable may not be what is expected.

What do you mean by conflicting ENV vars? I can only imagine multiple set_to actions that overwrite each other; append_to and prepend_to should be safe, or?

I must admit, I do not understand why set_environment_for_install is actually needed. I think we can assume that if I specify

<package name="R_3_0_1" version="3.0.1">
    <repository name="package_r_3_0_1" owner="iuc" prior_installation_required="True" />
</package>

I want the ENV vars sourced.

Hmmm…so you are saying that you want to be able to define the above package tag set inside of an actions tag set and have everything work?

Oh no, I mean just have it as a package tag like it is, but source the env.sh file of every other package set automatically. So you do not need set_environment_for_install.

In the attached example: package name=expat
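The behaviour Dave proposes amounts to prefixing each build step with a POSIX "." (source) of every listed repository's env.sh. A toy sketch of that composition (helper name hypothetical, not the actual tool shed implementation):

```python
def with_env_shell_files(env_shell_paths, command):
    """Prefix `command` with a POSIX `.` (source) of each dependency's
    env.sh, so e.g. expat's variables are visible while a
    setup_perl_environment step compiles an XML parser module."""
    parts = ['. %s' % path for path in env_shell_paths] + [command]
    return ' ; '.join(parts)
```

The resulting string would be handed to the shell that runs the build step, with one env.sh per repository tag inside the setup_* action.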
[galaxy-dev] Tool Access Control
Howdy devs,

I've implemented some rather basic tool access control and am looking for feedback on my implementation.

# Why

Our organisation wanted the ability to restrict tools to different users/roles. As such, I've implemented an execute attribute which can be applied to either a section or a tool in the tool configuration file.

# Example galaxy-admin changes

For example:

<section execute="a...@b.co,b...@b.co" id="EncodeTools" name="ENCODE Tools">
    <tool file="encode/gencode_partition.xml" />
    <tool execute="b...@b.co" file="encode/random_intervals.xml" />
</section>

which would allow A and B to access gencode_partition, but only B would be able to access random_intervals. To put it explicitly:

- by default, everyone can access all tools
- if section level permissions are set, then those are set as defaults for all tools in that section
- if tool permissions are set, they will override the defaults.

# Pros and Cons

There are some good features:

- non-accessible tools won't show up in the left hand panel, based on user
- non-accessible tools cannot be run or accessed.

There are some caveats, however:

- existence of tools is not completely hidden.
- Labels are not hidden at all.
- workflows break completely if a tool is unavailable to a shared user and the user copies+edits. They can be copied and viewed (says "tool not found"), but cannot be edited. Tool names/id/version info can be found in the javascript object due to the call to app.toolbox.tool_panel.items() in templates/webapps/galaxy/workflow/editor.mako, as that returns the raw tool list, rather than one that's filtered on whether or not the user has access. I'm yet to figure out a clean fix for this.

Additionally, empty sections are still shown even if there aren't tools listed in them. For a brief overview of my changes, please see the attached diff.
(It's missing one change because I wasn't being careful and started work on multiple different features.)

# Changeset overview

In brief, most of the changes consist of:

- a new method in model.User to check if an array of roles overlaps at all with a user's roles
- modifications to the appropriate files for reading in the new tool_config.xml options
- a modification to get_tool to pass user information, as whether or not a tool exists is now dependent on who is asking.

Please let me know if you have input on this before I create a pull request for this feature.

# Fixes

I believe this will fix a number of previously brought up issues (at least to my understanding of the issues listed)

+ https://trello.com/c/Zo7FAXlM/286-24-add-ability-to-password-secure-tools
+ (I saw some solution where they were adding _beta to tool names which gave permissions to developers somewhere, but cannot find that now)

Cheers,
Eric Rasche

--
Eric Rasche
Programmer II
Center for Phage Technology
Texas A&M University
College Station, TX 77843
404-692-2048
e...@tamu.edu
rasche.e...@yandex.ru

diff -r c458a0fe1ba8 lib/galaxy/model/__init__.py
--- a/lib/galaxy/model/__init__.py	Mon Nov 04 14:56:57 2013 -0500
+++ b/lib/galaxy/model/__init__.py	Wed Nov 06 11:18:10 2013 -0600
@@ -114,6 +114,19 @@
             roles.append( role )
         return roles
 
+    def can_execute( self, permissions=None ):
+        """Check if any of a user's roles overlap with set permissions"""
+        # If permissions variable is NOT set, then allow access (be friendly mode)
+        if permissions is None:
+            return True
+        # Otherwise, we want to check and deny if they're not in the set
+        for role in self.all_roles():
+            if role.name in permissions:
+                return True
+        return False
+
     def get_disk_usage( self, nice_size=False ):
         """Return byte count of disk space used by user or a human-readable

diff -r c458a0fe1ba8 lib/galaxy/tools/__init__.py
--- a/lib/galaxy/tools/__init__.py	Mon Nov 04 14:56:57 2013 -0500
+++ b/lib/galaxy/tools/__init__.py	Wed Nov 06 11:18:10 2013 -0600
@@ -195,12 +195,19 @@
             self.index += 1
             if parsing_shed_tool_conf:
                 config_elems.append( elem )
+            permissions = None
+            try:
+                permissions = elem.get('execute').split(',')
+                log.debug('Execute section found: %s' % (':'.join(permissions)))
+            except:
+                log.debug('No execute section found')
+                pass
             if elem.tag == 'tool':
-                self.load_tool_tag_set( elem, self.tool_panel, self.integrated_tool_panel, tool_path, load_panel_dict, guid=elem.get( 'guid' ), index=index )
+                self.load_tool_tag_set( elem, self.tool_panel, self.integrated_tool_panel, tool_path, load_panel_dict, guid=elem.get( 'guid' ), index=index, permissions=permissions )
             elif elem.tag == 'workflow':
                 self.load_workflow_tag_set( elem, self.tool_panel, self.integrated_tool_panel, load_panel_dict, index=index )
             elif elem.tag ==
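Stripped of the Galaxy plumbing, the semantics Eric describes (open by default, section execute sets a default, tool-level execute overrides it, access requires a role overlap) can be reduced to two pure functions for reasoning about the design. This is a standalone sketch, not the actual diff code:

```python
def can_execute(user_roles, permissions=None):
    """No permissions set means everyone may run the tool; otherwise at
    least one of the user's roles must appear in the permission list."""
    if permissions is None:
        return True
    return any(role in permissions for role in user_roles)

def effective_permissions(section_perms, tool_perms):
    """Tool-level `execute` overrides the section default, which in
    turn overrides the open-to-everyone default (None)."""
    return tool_perms if tool_perms is not None else section_perms
```

For the example section above: gencode_partition inherits the section list, while random_intervals replaces it with its own single-user list.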
Re: [galaxy-dev] Tool Access Control
Hi Eric,

please also take a look at this mailing list thread: http://dev.list.galaxyproject.org/pass-user-groups-to-dynamic-job-runner-td4661753.html

If you are interested in the is_user_in_group solution, I have a slightly updated version which also uses roles instead of groups.

Nicola

On Wed, 06/11/2013 at 11.38 -0600, Eric Rasche wrote:

Howdy devs,

I've implemented some rather basic tool access control and am looking for feedback on my implementation.

# Why

Our organisation wanted the ability to restrict tools to different users/roles. As such, I've implemented an execute attribute which can be applied to either a section or a tool in the tool configuration file.

# Example galaxy-admin changes

For example:

<section execute="a...@b.co,b...@b.co" id="EncodeTools" name="ENCODE Tools">
    <tool file="encode/gencode_partition.xml" />
    <tool execute="b...@b.co" file="encode/random_intervals.xml" />
</section>

which would allow A and B to access gencode_partition, but only B would be able to access random_intervals. To put it explicitly:

- by default, everyone can access all tools
- if section level permissions are set, then those are set as defaults for all tools in that section
- if tool permissions are set, they will override the defaults.

# Pros and Cons

There are some good features:

- non-accessible tools won't show up in the left hand panel, based on user
- non-accessible tools cannot be run or accessed.

There are some caveats, however:

- existence of tools is not completely hidden.
- Labels are not hidden at all.
- workflows break completely if a tool is unavailable to a shared user and the user copies+edits. They can be copied and viewed (says "tool not found"), but cannot be edited. Tool names/id/version info can be found in the javascript object due to the call to app.toolbox.tool_panel.items() in templates/webapps/galaxy/workflow/editor.mako, as that returns the raw tool list, rather than one that's filtered on whether or not the user has access.
I'm yet to figure out a clean fix for this. Additionally, empty sections are still shown even if there aren't tools listed in them. For a brief overview of my changes, please see the attached diff. (It's missing one change because I wasn't being careful and started work on multiple different features.)

# Changeset overview

In brief, most of the changes consist of:

- a new method in model.User to check if an array of roles overlaps at all with a user's roles
- modifications to the appropriate files for reading in the new tool_config.xml options
- a modification to get_tool to pass user information, as whether or not a tool exists is now dependent on who is asking.

Please let me know if you have input on this before I create a pull request for this feature.

# Fixes

I believe this will fix a number of previously brought up issues (at least to my understanding of the issues listed)

+ https://trello.com/c/Zo7FAXlM/286-24-add-ability-to-password-secure-tools
+ (I saw some solution where they were adding _beta to tool names which gave permissions to developers somewhere, but cannot find that now)

Cheers,
Eric Rasche
Re: [galaxy-dev] Tool Access Control
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Nicola,

Oh, excellent. I must've skipped over that, given the strange title of the thread. Your solution at the end of that thread is very promising, and certainly handles failure MUCH better than mine does (i.e. raising exceptions and not breaking a workflow if the user isn't permitted access). (Did you put it on the galaxy wiki anywhere? If it weren't for you linking that, I never would've known about it, and that's very useful info!)

In my organisation's case, if a user isn't allowed access to a given tool, we believe that:

- - galaxy has no reason to admit it exists
- - galaxy should not share default information about a tool

Which is a bit different from the case of having a license to use a tool. For licensing issues, naturally it would be fine to say "yes, this exists" and "if you can't run it, obtain a license". For my org's case, we might want to store administrative tools (for other services) in galaxy. It's a very convenient platform for more than just bioinformatics, and we have some non-technical people on staff who occasionally need to pull various data sets from various services/make database backups/etc. Students and clients who use our galaxy instance don't need to know that these tools are available.

Thoughts?

On 11/06/2013 12:12 PM, Nicola Soranzo wrote:

Hi Eric,

please also take a look at this mailing list thread: http://dev.list.galaxyproject.org/pass-user-groups-to-dynamic-job-runner-td4661753.html

If you are interested in the is_user_in_group solution, I have a slightly updated version which also uses roles instead of groups.

Nicola

On Wed, 06/11/2013 at 11.38 -0600, Eric Rasche wrote:

Howdy devs,

I've implemented some rather basic tool access control and am looking for feedback on my implementation.

# Why

Our organisation wanted the ability to restrict tools to different users/roles.
As such, I've implemented it as an execute tag which can be applied to either sections or tools in the tool configuration file.

# Example galaxy-admin changes

For example:

<section execute="a...@b.co,b...@b.co" id="EncodeTools" name="ENCODE Tools">
    <tool file="encode/gencode_partition.xml" />
    <tool execute="b...@b.co" file="encode/random_intervals.xml" />
</section>

which would allow A and B to access gencode_partition, but only B would be able to access random_intervals. To put it explicitly:

- by default, everyone can access all tools
- if section-level permissions are set, then those are set as defaults for all tools in that section
- if tool permissions are set, they will override the defaults.

# Pros and Cons

There are some good features:

- non-accessible tools won't show up in the left hand panel, based on user
- non-accessible tools cannot be run or accessed.

There are some caveats, however:

- existence of tools is not completely hidden.
- Labels are not hidden at all.
- workflows break completely if a tool is unavailable to a shared user and the user copies+edits. They can be copied and viewed (it says tool not found), but cannot be edited. Tool names/id/version info can be found in the javascript object due to the call to app.toolbox.tool_panel.items() in templates/webapps/galaxy/workflow/editor.mako, as that returns the raw tool list, rather than one that's filtered on whether or not the user has access. I'm yet to figure out a clean fix for this.
- Additionally, empty sections are still shown even if there aren't tools listed in them.

For a brief overview of my changes, please see the attached diff.
(It's missing one change because I wasn't being careful and started work on multiple different features.)

# Changeset overview

In brief, most of the changes consist of:

- a new method in model.User to check if an array of roles overlaps at all with a user's roles
- modifications to the appropriate files for reading in the new tool_config.xml options
- a modification to get_tool to pass user information, as whether or not a tool exists is now dependent on who is asking.

Please let me know if you have input on this before I create a pull request for this feature.

# Fixes

I believe this will fix a number of previously brought up issues (at least to my understanding of the issues listed):

+ https://trello.com/c/Zo7FAXlM/286-24-add-ability-to-password-secure-tools
+ (I saw some solution where they were adding _beta to tool names which gave permissions to developers somewhere, but cannot find that now)

Cheers,
Eric Rasche

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/

--
Eric Rasche
Programmer II
Center for Phage Technology
Texas A&M University
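Taken together, the precedence rules and the role-overlap check described above could look roughly like the following sketch. All names here (allowed, can_access, has_any_role) are assumptions for illustration, not Galaxy's actual API:

```python
def allowed(section_execute, tool_execute):
    """Resolve the effective access list for a tool from the hypothetical
    ``execute`` attributes; None means "everyone may access the tool"."""
    # Tool-level permissions, when present, override the section-level default.
    effective = tool_execute if tool_execute is not None else section_execute
    if effective is None:
        return None  # default: everyone can access all tools
    return {email.strip() for email in effective.split(",")}

def can_access(user_email, section_execute=None, tool_execute=None):
    acl = allowed(section_execute, tool_execute)
    return acl is None or user_email in acl

class User(object):
    """Toy stand-in for model.User; only the role check is sketched."""
    def __init__(self, roles):
        self.roles = roles  # role names held by this user

    def has_any_role(self, roles):
        # True if the given roles overlap at all with the user's roles.
        return bool(set(self.roles) & set(roles))
```

With the example configuration from the earlier message, can_access("a...@b.co", section_execute="a...@b.co,b...@b.co") is allowed, while a tool-level execute="b...@b.co" denies that same user.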
Re: [galaxy-dev] Question regarding walltime exceeded not being correctly reported via the WebUI
Hi, John,

Based on my initial testing, the application of your patch 2 successfully conveys the job walltime exceeded error to the web UI. As far as I am concerned, you resolved this issue perfectly. Thank you so much for your help with this. I will let you know if I experience any additional issues.

Respectfully yours,
Dan Sullivan

On Tue, Nov 5, 2013 at 8:52 PM, John Chilton chil...@msi.umn.edu wrote: On Tue, Nov 5, 2013 at 11:53 AM, Daniel Patrick Sullivan dansulli...@gmail.com wrote:

Hi, John,

Thank you for taking the time to help me look into this issue. I have applied the patch you provided and confirmed that it appears to help remediate the problem (when a walltime is exceeded, feedback is in fact provided via the Galaxy web UI; it no longer appears that jobs are running indefinitely). One thing I would like to note is that the error that is provided to the user is generic, i.e. the web UI reports "An error occurred with this dataset: Job cannot be completed due to a cluster error, please retry it later." So, the fact that a "walltime exceeded" error actually occurred is not presented to the user (I am not sure if this is intentional or not). Again, I appreciate you taking the time to verify and patch this issue. I have attached a screenshot of the output for your review.

Glad we are making progress - I have committed that previous patch to galaxy-central. Let's see if we cannot improve the user feedback so they know they hit the maximum walltime. Can you try this new patch? The message about the timeout was being built, but it was neither being logged nor set as the error message on the dataset - this should resolve that.

I am probably going to be testing Galaxy with Torque 4.2.5 in the coming weeks, I will let you know if I identify any additional problems. Thank you so much; have a wonderful day.

You too, thanks for working with me on fixing this!
-John

Dan Sullivan

On Tue, Nov 5, 2013 at 8:48 AM, John Chilton chil...@msi.umn.edu wrote:

Hey Daniel,

Thanks so much for the detailed problem report, it was very helpful. Reviewing the code, there appears to be a bug in the PBS job runner - in some cases pbs_job_state.stop_job is never set but is later read. I don't have torque, so I don't have a great test setup for this problem; any chance you can make the following changes for me and let me know if they work? Between the following two lines:

    log.error( '(%s/%s) PBS job failed: %s' % ( galaxy_job_id, job_id, JOB_EXIT_STATUS.get( int( status.exit_status ), 'Unknown error: %s' % status.exit_status ) ) )
    self.work_queue.put( ( self.fail_job, pbs_job_state ) )

can you insert pbs_job_state.stop_job = False, so they read:

    log.error( '(%s/%s) PBS job failed: %s' % ( galaxy_job_id, job_id, JOB_EXIT_STATUS.get( int( status.exit_status ), 'Unknown error: %s' % status.exit_status ) ) )
    pbs_job_state.stop_job = False
    self.work_queue.put( ( self.fail_job, pbs_job_state ) )

And at the top of the file, can you add a -11 option to the JOB_EXIT_STATUS to indicate a job timeout. I have attached a patch that would apply against the latest stable - it will probably work against your branch as well. If you would rather not act as my QC layer, I can try to come up with a way to do some testing on my end :).

Thanks again,
-John

On Mon, Nov 4, 2013 at 10:10 AM, Daniel Patrick Sullivan dansulli...@gmail.com wrote:

Hi, Galaxy Developers,

I have what I hope is somewhat of a basic question regarding Galaxy's interaction with a pbs job cluster and information reported via the webUI. Basically, in certain situations, the walltime of a specific job is exceeded. This is of course to be expected and all fine and understandable. My problem is that the information is not being relayed back to the end user via the Galaxy web UI, which causes confusion in our Galaxy user community.
Basically, the Torque scheduler generates the following messages when a walltime is exceeded:

11/04/2013 08:39:45;000d;PBS_Server.30621;Job;163.sctest.cri.uchicago.edu;preparing to send 'a' mail for job 163.sctest.cri.uchicago.edu to s.cri.gal...@crigalaxy-test.uchicago.edu (Job exceeded its walltime limit. Job was aborted)
11/04/2013 08:39:45;0009;PBS_Server.30621;Job;163.sctest.cri.uchicago.edu;job exit status -11 handled

Now, my problem is that this status -11 return code is not being correctly handled by Galaxy. What happens is that Galaxy throws an exception, specifically:

10.135.217.178 - - [04/Nov/2013:08:39:42 -0500] GET /api/histories/90240358ebde1489 HTTP/1.1 200 - https://crigalaxy-test.uchicago.edu/history Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0
galaxy.jobs.runners.pbs DEBUG 2013-11-04 08:39:46,137
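The fix John describes above - mapping Torque's -11 exit status to a walltime message and initializing stop_job before handing the state to fail_job - can be sketched as a runnable fragment. The stub classes and function name here are hypothetical stand-ins, not the real PBS runner objects:

```python
import logging

log = logging.getLogger("pbs_runner_sketch")

# Exit-status map; -11 is the status Torque reports when a job
# exceeds its walltime limit (per the PBS_Server log above).
JOB_EXIT_STATUS = {
    -11: "job maximum walltime exceeded",
}

class PBSJobState(object):
    """Minimal stand-in for the runner's per-job state object."""
    pass

class WorkQueue(object):
    """Minimal stand-in for the runner's work queue."""
    def __init__(self):
        self.items = []
    def put(self, item):
        self.items.append(item)

def handle_failed_job(galaxy_job_id, job_id, exit_status, pbs_job_state, work_queue):
    # Translate the raw PBS exit status into a human-readable reason.
    message = JOB_EXIT_STATUS.get(int(exit_status), 'Unknown error: %s' % exit_status)
    log.error('(%s/%s) PBS job failed: %s' % (galaxy_job_id, job_id, message))
    # The bug: stop_job was read later by fail_job but never set on this
    # code path; initialize it before queueing the failure handler.
    pbs_job_state.stop_job = False
    work_queue.put(('fail_job', pbs_job_state))
    return message
```

Without the stop_job assignment, the failure handler would hit an AttributeError and the job would appear to run forever in the UI, which matches the original symptom.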
Re: [galaxy-dev] set_environment_for_install problem, seeking for ideas
Hi Dave,

We're thinking that the following approach makes the most sense:

<action type="setup_perl_environment"> (OR setup_r_environment OR setup_ruby_environment OR setup_virtualenv)
    <repository changeset_revision="978287122b91" name="package_perl_5_18" owner="iuc" toolshed="http://testtoolshed.g2.bx.psu.edu">
        <package name="perl" version="5.18.1" />
    </repository>
    <repository changeset_revision="8fc96166cddd" name="package_expat_2_1" owner="iuc" toolshed="http://testtoolshed.g2.bx.psu.edu">
        <package name="expat" version="2.1" />
    </repository>
</action>

For all repository tag sets contained within these setup_* tags, the repository's env.sh would be pulled in for the setup of the specified environment without requiring a set_environment_for_install action type. Would this work for your use cases?

Yes, for the first one. But isn't it a little bit too verbose? Shouldn't including the perl repository in a setup_perl_environment be implicit? We can assume that it needs to be present. Do you have an example of why sourcing every repository by default can be harmful? It would make such an installation so much easier and less complex.

Also, that does not solve the second use case: if you have two packages, one that installs perl libraries, and a second, a binary, that checks for or needs these perl libs.

If so, can you confirm that this should be done for all four currently supported setup_* action types?

I think it will solve my current issues. Based on your response, Greg or I will implement this as soon as possible.

Thanks!
Bjoern

--Dave B.

On 11/06/2013 03:05 AM, Björn Grüning wrote:

Hi John,

Perl complicates things, TPP complicates things greatly.

So true, so true ...

Bjoern, can I ask you if this hypothetical exhibits the same problem and can be used to reason about these things more easily and drive a test implementation?

Yes to both questions :)

So right now, Galaxy has setup_virtualenv, which will build and install Python packages in a virtual environment.
However, some Python packages have C library dependencies that could prevent them from being installed. As a specific example, take PyTables (installed via pip install tables), which is a package for managing hierarchical datasets. If you try to install this with pip the way Galaxy will, it will fail if you do not have libhdf5 installed. So at a high level, it would be nice if the tool shed had a libhdf5 definition and the dependencies file had some mechanism for declaring that libhdf5 should be installed before a setup_virtualenv containing tables, with its environment configured properly so the pip install succeeds (maybe just LD_LIBRARY_PATH needs to be set).

Indeed, same problem. I think we have this problem in every high-level install method, because set_environment_for_install is not allowed as the first action tag. Can you think of any case where ENV vars can conflict with each other, besides set_to, assuming that we source every env.sh file by default for every specified package?

Cheers,
Bjoern

-John

On Tue, Nov 5, 2013 at 3:35 PM, Björn Grüning bjoern.gruen...@pharmazie.uni-freiburg.de wrote:

Hi Greg,

Hello Bjoern,

On Nov 5, 2013, at 12:13 PM, Bjoern Gruening bjoern.gruen...@gmail.com wrote:

Hi Greg,

I'm right now implementing a setup_perl_environment and stumbled over a tricky problem (one that is not only relevant for perl, but also for ruby, python and R).

The Problem: Let's assume a perl package (A) requires an xml parser written in C/C++ (Z). (Z) is a dependency that I can import, but as far as I can see there is no way to call set_environment_for_install before setup_perl_environment, because setup_perl_environment defines an installation type.

The above is fairly difficult to understand - can you provide an actual xml recipe that provides some context?

Attached, please see a detailed explanation at the bottom. I would like to discuss that issue to get a few ideas.
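Bjoern's question above - whether anything besides set_to can conflict - can be made concrete with a toy merge of env.sh-style actions: prepend_to and append_to compose regardless of order, while a later set_to silently clobbers an earlier one. This is an illustrative sketch, not Galaxy's actual env.sh handling, and the paths are made up:

```python
def apply_env_actions(env, actions):
    """Apply (action, var, value) settings in order, as if sourcing each
    package's env.sh in turn; PATH-like variables use ':' separators."""
    for action, var, value in actions:
        current = env.get(var)
        if action == "set_to" or current is None:
            env[var] = value  # a later set_to overwrites any earlier value
        elif action == "prepend_to":
            env[var] = value + ":" + current
        elif action == "append_to":
            env[var] = current + ":" + value
    return env

env = apply_env_actions({}, [
    ("prepend_to", "LD_LIBRARY_PATH", "/deps/libhdf5/lib"),
    ("prepend_to", "LD_LIBRARY_PATH", "/deps/expat/lib"),  # composes safely
    ("set_to", "PERL5LIB", "/deps/packageA/lib/perl5"),
    ("set_to", "PERL5LIB", "/deps/packageZ/lib/perl5"),    # clobbers packageA's value
])
```

After this runs, LD_LIBRARY_PATH contains both library directories, but PERL5LIB only retains packageZ's value - which is exactly the one conflict case the thread identifies.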
I can think about these solutions:

- hackish solution: I can call install_environment.add_env_shell_file_paths( action_dict[ 'env_shell_file_paths' ] ) inside of the setup_*_environment path and remove it from the action type afterwards

Again, it's difficult to provide good feedback on the above approach without an example recipe. However, your "hackish solution" term probably means it is not ideal. ;)

:)

- import all env.sh variables from every (package) definition, regardless of whether set_environment_for_install is set or not.

I don't think the above approach would be ideal. It seems that it could fairly easily create conflicting environment variables within a single installation, so the latest value for an environment variable may not be what is expected.

What do you mean by conflicting ENV vars? I can only imagine
[galaxy-dev] Trouble setting up a local instance of Galaxy
Hello,

I am a new user of Galaxy. I have a Galaxy instance running (sort of) on a local research cluster. I issued the command "hg update stable" today and it retrieved no files, so I presume I am up to date on the stable release. I start the instance as a user named "galaxy". Right now I am still running in "local" mode; I hope to migrate to DRMAA/LSF eventually. I have tried to set up ProFTP to upload files but have not succeeded, so I use Galaxy web upload. The upload was working nicely, and I had added a couple of new tools that were working with the uploaded files. Getting LSF/DRMAA to work was giving me fits, and ultimately I deleted all my history files in an effort to start over.

Presently, files being uploaded appear in history as, say, job 1 (in a new history). The job status in the history panel of the web GUI changes from purple to yellow and then to red, indicating some sort of error. There is no viewable error text captured, but I can click on the "eye" icon and see the first megabyte of the data (for tiny files I can see the entire content, and it's intact). In the Galaxy file system, however, these files appear but have a different number, say, dataset_399.dat. On my system the uploaded files appear in /PATH/galaxy-dist/database/files/000

My first question is: why is the data going into the "000" subdirectory and not one "owned" by the user who is uploading?

My second question is: why is the dataset being labeled as dataset_399.dat and not dataset_001.dat?

My third question is: why do the uploaded files not appear as selectable options (say I have paired-end fastq files and a tool wants to have choices about filenames)? This problem is present for programs that seek one input file as well.

I presume that Galaxy is confused because the numbering in history is not the same as the numbering in the file upload archive (e.g.
/PATH/galaxy-dist/database/files/000 in my case), so my last question is: how do I "reset" my system to get the dataset and history numbers to be the same?

Here's how I launch the Galaxy instance:

sh /shared/app/Galaxy/galaxy-dist/run.sh -v --daemon --pid-file=Nov6Localdaemon.pid.txt --log-file=Nov6Local1639daemon.log.txt
Entering daemon mode

Here are the last lines of the log:

Starting server in PID 26236.
serving on 0.0.0.0:8089 view at http://127.0.0.1:8089
galaxy.tools.actions.upload_common DEBUG 2013-11-06 16:48:49,624 Changing ownership of /shared/app/Galaxy/galaxy-dist/database/tmp/upload_file_data_QZGHm4 with: /usr/bin/sudo -E scripts/external_chown_script.py /shared/app/Galaxy/galaxy-dist/database/tmp/upload_file_data_QZGHm4 hazards 502
galaxy.tools.actions.upload_common WARNING 2013-11-06 16:48:49,750 Changing ownership of uploaded file /shared/app/Galaxy/galaxy-dist/database/tmp/upload_file_data_QZGHm4 failed: sudo: no tty present and no askpass program specified
galaxy.tools.actions.upload_common DEBUG 2013-11-06 16:48:49,751 Changing ownership of /shared/app/Galaxy/galaxy-dist/database/tmp/tmpEsyGfO with: /usr/bin/sudo -E scripts/external_chown_script.py /shared/app/Galaxy/galaxy-dist/database/tmp/tmpEsyGfO hazards 502
galaxy.tools.actions.upload_common WARNING 2013-11-06 16:48:49,775 Changing ownership of uploaded file /shared/app/Galaxy/galaxy-dist/database/tmp/tmpEsyGfO failed: sudo: no tty present and no askpass program specified
galaxy.tools.actions.upload_common INFO 2013-11-06 16:48:49,805 tool upload1 created job id 170
galaxy.jobs DEBUG 2013-11-06 16:48:50,678 (170) Persisting job destination (destination id: local)
galaxy.jobs.handler INFO 2013-11-06 16:48:50,698 (170) Job dispatched
galaxy.jobs.runners.local DEBUG 2013-11-06 16:48:50,994 (170) executing: python /shared/app/Galaxy/galaxy-dist/tools/data_source/upload.py /depot/shared/app/Galaxy/galaxy-dist /shared/app/Galaxy/galaxy-dist/database/tmp/tmpTq22ot
/shared/app/Galaxy/galaxy-dist/database/tmp/tmpEsyGfO 406:/shared/app/Galaxy/galaxy-dist/database/job_working_directory/000/170/dataset_406_files:/shared/app/Galaxy/galaxy-dist/database/job_working_directory/000/170/galaxy_dataset_406.dat
galaxy.jobs DEBUG 2013-11-06 16:48:51,030 (170) Persisting job destination (destination id: local)
galaxy.jobs.runners.local DEBUG 2013-11-06 16:48:53,335 execution finished: python /shared/app/Galaxy/galaxy-dist/tools/data_source/upload.py /depot/shared/app/Galaxy/galaxy-dist /shared/app/Galaxy/galaxy-dist/database/tmp/tmpTq22ot /shared/app/Galaxy/galaxy-dist/database/tmp/tmpEsyGfO 406:/shared/app/Galaxy/galaxy-dist/database/job_working_directory/000/170/dataset_406_files:/shared/app/Galaxy/galaxy-dist/database/job_working_directory/000/170/galaxy_dataset_406.dat
galaxy.jobs.runners.local DEBUG 2013-11-06 16:48:53,463 executing external set_meta script for job 170: /depot/shared/app/Galaxy/galaxy-dist/set_metadata.sh /shared/app/Galaxy/galaxy-dist/database/files
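On the numbering questions in the message above: dataset files are named by their instance-wide database id, not by their per-history item number, and they are bucketed into subdirectories of 1000 ids each - which is why a fresh history's first item can land at files/000/dataset_399.dat. A sketch of that layout rule, inferred from the paths in the logs (Galaxy's actual code may differ in detail):

```python
def dataset_file_path(file_root, dataset_id):
    # Datasets are grouped into subdirectories of 1000 by database id:
    # ids 0-999 go to 000/, 1000-1999 to 001/, and so on.
    subdir = "%03d" % (dataset_id // 1000)
    return "%s/%s/dataset_%d.dat" % (file_root, subdir, dataset_id)
```

So dataset_file_path("/PATH/galaxy-dist/database/files", 399) reproduces the /PATH/galaxy-dist/database/files/000/dataset_399.dat path from the question, regardless of which history or user the dataset belongs to.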
Re: [galaxy-dev] set_environment_for_install problem, seeking for ideas
My two cents below.

On Wed, Nov 6, 2013 at 4:20 PM, Björn Grüning bjoern.gruen...@pharmazie.uni-freiburg.de wrote:

Hi Dave,

We're thinking that the following approach makes the most sense:

<action type="setup_perl_environment"> (OR setup_r_environment OR setup_ruby_environment OR setup_virtualenv)
    <repository changeset_revision="978287122b91" name="package_perl_5_18" owner="iuc" toolshed="http://testtoolshed.g2.bx.psu.edu">
        <package name="perl" version="5.18.1" />
    </repository>
    <repository changeset_revision="8fc96166cddd" name="package_expat_2_1" owner="iuc" toolshed="http://testtoolshed.g2.bx.psu.edu">
        <package name="expat" version="2.1" />
    </repository>
</action>

For all repository tag sets contained within these setup_* tags, the repository's env.sh would be pulled in for the setup of the specified environment without requiring a set_environment_for_install action type. Would this work for your use cases?

Yes, for the first one. But isn't it a little bit too verbose? Shouldn't including the perl repository in a setup_perl_environment be implicit? We can assume that it needs to be present. Do you have an example of why sourcing every repository by default can be harmful? It would make such an installation so much easier and less complex.

I am not sure I understand this paragraph - I have a vague sense I agree, but is there any chance you could rephrase this or elaborate?

Also, that does not solve the second use case: if you have two packages, one that installs perl libraries, and a second, a binary, that checks for or needs these perl libs.

We have discussed this off list in another thread. Just to summarize my thoughts there - I think we should delay this, or not make it a priority, if there are marginally acceptable workarounds that can be found for the time being. Getting these four actions to work well as sort of terminal endpoints, and to allow specification as tersely as possible, should be the primary goal for the time being.
You will see Perl or Python packages depend on C libraries 10 times more frequently than you will find makefiles and C programs that depend on complex perl or python environments (correct me if I am wrong). Given that there is already years' worth of tool shed development outlined in existing Trello cards - this is just how I would prioritize things (happy to be overruled).

If so, can you confirm that this should be done for all four currently supported setup_* action types?

I think it would be best to tackle setup_r_environment and setup_ruby_environment first. setup_virtualenv cannot have nested elements at this time - it is just assumed to be a bunch of text (either a file containing the dependencies or a list of the dependencies). So setup_r_environment and setup_ruby_environment have the same structure:

<setup_ruby_environment>
    <repository .. />
    <package .. />
    <package .. />
</setup_ruby_environment>

... but setup_virtualenv is just:

<setup_virtualenv>requests=1.20 pycurl==1.3</setup_virtualenv>

I have created a Trello card for this: https://trello.com/c/NsLJv9la (and some other related stuff). Once that is tackled, though, it will make sense to allow setup_virtualenv to utilize the same functionality.

Thanks all,
-John

I think it will solve my current issues. Based on your response, Greg or I will implement this as soon as possible.

Thanks!
Bjoern

--Dave B.

On 11/06/2013 03:05 AM, Björn Grüning wrote:

Hi John,

Perl complicates things, TPP complicates things greatly.

So true, so true ...

Bjoern, can I ask you if this hypothetical exhibits the same problem and can be used to reason about these things more easily and drive a test implementation?

Yes to both questions :)

So right now, Galaxy has setup_virtualenv, which will build and install Python packages in a virtual environment. However, some Python packages have C library dependencies that could prevent them from being installed.
As a specific example, take PyTables (installed via pip install tables), which is a package for managing hierarchical datasets. If you try to install this with pip the way Galaxy will, it will fail if you do not have libhdf5 installed. So at a high level, it would be nice if the tool shed had a libhdf5 definition and the dependencies file had some mechanism for declaring that libhdf5 should be installed before a setup_virtualenv containing tables, with its environment configured properly so the pip install succeeds (maybe just LD_LIBRARY_PATH needs to be set).

Indeed, same problem. I think we have this problem in every high-level install method, because set_environment_for_install is not allowed as the first action tag. Can you think of any case where ENV vars can conflict with each other, besides set_to, assuming that we source every env.sh file by default for every specified package?

Cheers,
Bjoern

-John

On Tue, Nov 5, 2013 at 3:35 PM, Björn Grüning
Re: [galaxy-dev] Question regarding walltime exceeded not being correctly reported via the WebUI
Thanks for the feedback, I have incorporated that previous patch into galaxy-central. As for the new warning message - I think this is fine. It is not surprising the process doesn't have an exit code - the job script itself never got to the point where that would have been written. If there are no observable problems, I wouldn't worry. http://stackoverflow.com/questions/234075/what-is-your-best-programmer-joke/235307#235307

Thanks again,
-John

On Wed, Nov 6, 2013 at 4:32 PM, Daniel Patrick Sullivan dansulli...@gmail.com wrote:

Hi, John,

Actually, now that I am taking a look at this, I wanted to report something. I am actually not sure if this is a problem or not (based on what I can tell, it is not causing any negative impact). The Galaxy log data is actually reporting that the cleanup failed (for my testing I am using the upload1 tool):

galaxy.jobs.runners.pbs DEBUG 2013-11-06 16:04:05,150 (2156/169.sctest.cri.uchicago.edu) PBS job state changed from R to C
galaxy.jobs.runners.pbs ERROR 2013-11-06 16:04:05,152 (2156/169.sctest.cri.uchicago.edu) PBS job failed: job maximum walltime exceeded
galaxy.datatypes.metadata DEBUG 2013-11-06 16:04:05,389 Cleaning up external metadata files
galaxy.datatypes.metadata DEBUG 2013-11-06 16:04:05,421 Failed to cleanup MetadataTempFile temp files from /group/galaxy_test/galaxy-dist/database/job_working_directory/002/2156/metadata_out_HistoryDatasetAssociation_381_8fH0ZU: No JSON object could be decoded: line 1 column 0 (char 0)
galaxy.jobs.runners.pbs WARNING 2013-11-06 16:04:05,498 Unable to cleanup: [Errno 2] No such file or directory: '/group/galaxy_test/galaxy-dist/database/pbs/2156.ec'

Like I said, as far as I can tell this isn't causing a problem (everything is being reported correctly via the web UI; this was my original problem and you definitely solved it). I figured it wouldn't hurt to report the messages above regardless. Thank you again for all of your help.
Respectfully yours,
Dan Sullivan

On Wed, Nov 6, 2013 at 4:10 PM, Daniel Patrick Sullivan dansulli...@gmail.com wrote:

Hi, John,

Based on my initial testing, the application of your patch 2 successfully conveys the job walltime exceeded error to the web UI. As far as I am concerned, you resolved this issue perfectly. Thank you so much for your help with this. I will let you know if I experience any additional issues.

Respectfully yours,
Dan Sullivan

On Tue, Nov 5, 2013 at 8:52 PM, John Chilton chil...@msi.umn.edu wrote: On Tue, Nov 5, 2013 at 11:53 AM, Daniel Patrick Sullivan dansulli...@gmail.com wrote:

Hi, John,

Thank you for taking the time to help me look into this issue. I have applied the patch you provided and confirmed that it appears to help remediate the problem (when a walltime is exceeded, feedback is in fact provided via the Galaxy web UI; it no longer appears that jobs are running indefinitely). One thing I would like to note is that the error that is provided to the user is generic, i.e. the web UI reports "An error occurred with this dataset: Job cannot be completed due to a cluster error, please retry it later." So, the fact that a "walltime exceeded" error actually occurred is not presented to the user (I am not sure if this is intentional or not). Again, I appreciate you taking the time to verify and patch this issue. I have attached a screenshot of the output for your review.

Glad we are making progress - I have committed that previous patch to galaxy-central. Let's see if we cannot improve the user feedback so they know they hit the maximum walltime. Can you try this new patch? The message about the timeout was being built, but it was neither being logged nor set as the error message on the dataset - this should resolve that.

I am probably going to be testing Galaxy with Torque 4.2.5 in the coming weeks, I will let you know if I identify any additional problems. Thank you so much; have a wonderful day.

You too, thanks for working with me on fixing this!
-John

Dan Sullivan

On Tue, Nov 5, 2013 at 8:48 AM, John Chilton chil...@msi.umn.edu wrote:

Hey Daniel,

Thanks so much for the detailed problem report, it was very helpful. Reviewing the code, there appears to be a bug in the PBS job runner - in some cases pbs_job_state.stop_job is never set but is later read. I don't have torque, so I don't have a great test setup for this problem; any chance you can make the following changes for me and let me know if they work? Between the following two lines: log.error( '(%s/%s) PBS job failed: %s' % ( galaxy_job_id, job_id, JOB_EXIT_STATUS.get( int( status.exit_status ), 'Unknown error: %s' % status.exit_status ) ) ) self.work_queue.put( ( self.fail_job, pbs_job_state ) ) log.error( '(%s/%s) PBS job failed: %s' % ( galaxy_job_id, job_id, JOB_EXIT_STATUS.get( int(