Re: [galaxy-dev] Using input dataset names in output dataset names
On Thu, Nov 7, 2013 at 3:50 PM, Peter Cock wrote:
> Getting back to my motivating example, since fasta_to_tabular.xml does not give the output a label and depends on the default, the small change to $on_string should result in the conversion of a file named "My Genes" as "FASTA-to-Tabular on My Genes", rather than "FASTA-to-Tabular on data 1" as now.

Here's another variant to keep the "data 1" text in $on_string, if people are attached to this functionality. That would result in "FASTA-to-Tabular on data 1 (My Genes)".

...

--

$ hg diff lib/galaxy/tools/actions/__init__.py
diff -r 77d58fdd1c2e lib/galaxy/tools/actions/__init__.py
--- a/lib/galaxy/tools/actions/__init__.py  Tue Oct 29 14:21:48 2013 -0400
+++ b/lib/galaxy/tools/actions/__init__.py  Thu Nov 07 15:49:15 2013 +
@@ -181,6 +181,7 @@
         input_names = []
         input_ext = 'data'
         input_dbkey = incoming.get( "dbkey", "?" )
+        on_text = ''
         for name, data in inp_data.items():
             if not data:
                 data = NoneDataset( datatypes_registry = trans.app.datatypes_registry )
@@ -194,6 +195,8 @@
             else: # HDA
                 if data.hid:
                     input_names.append( 'data %s' % data.hid )
+                    #Will use this on_text if only one input dataset:
+                    on_text = "data %s (%s)" % (data.id, data.name)
             input_ext = data.ext
             if data.dbkey not in [None, '?']:
@@ -230,7 +233,10 @@
         output_permissions = trans.app.security_agent.history_get_default_permissions( history )
         # Build name for output datasets based on tool name and input names
         if len( input_names ) == 1:
-            on_text = input_names[0]
+            #We recorded the dataset name as on_text earlier...
+            if not on_text:
+                #Fall back on the shorter 'data %i' style:
+                on_text = input_names[0]
         elif len( input_names ) == 2:
             on_text = '%s and %s' % tuple(input_names[0:2])
         elif len( input_names ) == 3:

Would this patch be welcomed as a pull request? (Expanding $on_string to include the name as well as dataset number when there is only one input dataset)

How about renaming the outputs of the conversion tools?

Peter
___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
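For readers who want to try the naming rule outside Galaxy, here is a minimal standalone sketch of the behaviour the patch describes. It is not Galaxy's code: the function name build_on_text and the (hid, name) tuple input are hypothetical, and it formats the history ID (hid) so that the result matches the "data 1 (My Genes)" example in the message, whereas the diff as posted formats data.id. The wording for three or more inputs is a placeholder, since the diff does not show that branch.

```python
def build_on_text(inputs):
    """Build the text substituted for $on_string.

    `inputs` is a list of (hid, name) tuples, one per input dataset.
    Hypothetical helper for illustration; not part of Galaxy.
    """
    input_names = ["data %s" % hid for hid, _name in inputs]
    if len(input_names) == 1:
        hid, name = inputs[0]
        if name:
            # Patched behaviour: keep 'data N' and append the dataset name.
            return "data %s (%s)" % (hid, name)
        # Fall back on the shorter 'data N' style.
        return input_names[0]
    if len(input_names) == 2:
        return "%s and %s" % tuple(input_names)
    # Placeholder wording for longer lists (assumption, not from the diff).
    return ", ".join(input_names)


print("FASTA-to-Tabular on " + build_on_text([(1, "My Genes")]))
```

With a single named input the label becomes "FASTA-to-Tabular on data 1 (My Genes)"; with two inputs the names are left out, as in the patch.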
Re: [galaxy-dev] Tool Shed packages for BLAST+ binaries
Hi Dave,

I think this is a new regression from the Test Tool Shed installation framework:
http://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus/f2478dc77ccb

Tool test results
Automated test environment
Tools missing tests or test data
Installation errors - no functional tests were run for any tools in this changeset revision
Tool dependencies
Type Name Version
blast+ package 2.2.27
Error (Invalid file %s specified, ignoring set_environment_for_install action.,/ToolDepsTest/blast+/2.2.27/iuc/package_blast_plus_2_2_27/eab09bc4d63e/env.sh)

Is that really a comma in the path? Is this a simple typo in a config file?

Note this tool is using:
http://testtoolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_27/eab09bc4d63e
(using platform specific actions)

--

Meanwhile, over on the main Tool Shed, which I would like to update to use arch-specific actions for the packages (since that is now in the stable Galaxy releases), and switch from BLAST+ 2.2.26 to 2.2.27, we have a different install failure:
http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/70e7dcbf6573

This is still using 2.2.26 via shell magic:
http://toolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_26/40c69b76b46e

/bin/sh: 1: blastn: not found

Peter
[galaxy-dev] Error editing users associated with a quota
Hi all,

I'm trying to remove a user from our default quota (while they run a large analysis), but am getting an error:

Internal Server Error
Galaxy was unable to successfully complete your request
An error occurred. This may be an intermittent problem due to load or other unpredictable factors, reloading the page may address the problem. The error has been logged to our team.

Has anyone else seen this?

Thanks,
Peter

$ hg branch
stable
[galaxy@ppserver galaxy-dist]$ hg head | more
changeset: 11219:5c789ab4144a
branch: stable
tag: tip
user: Nate Coraor n...@bx.psu.edu
date: Mon Nov 04 15:04:42 2013 -0500
summary: Added tag release_2013.11.04 for changeset 26f58e05aa10

changeset: 11216:c458a0fe1ba8
parent: 11213:6d633418ecfa
parent: 11215:f79149dd3d35
user: Nate Coraor n...@bx.psu.edu
date: Mon Nov 04 14:56:57 2013 -0500
summary: Merge security fix for filtering tools from stable/next-stable.

From the pasteur.log file:

xxx.xxx.xxx.xxx - - [11/Nov/2013:12:04:45 +0100] POST /galaxy/admin/manage_users_and_groups_for_quota?id=7a72206f0ccffc2e HTTP/1.1 500 - http://xxx/galaxy/admin/manage_users_and_groups_for_quota?webapp=galaxy&id=7a72206f0ccffc2e; Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0

Error - type 'exceptions.AttributeError': 'NoQuotaAgent' object has no attribute 'set_entity_quota_associations'
URL: http://ppserver/galaxy/admin/manage_users_and_groups_for_quota?id=7a72206f0ccffc2e
File '/mnt/galaxy/galaxy-dist/lib/galaxy/web/framework/middleware/error.py', line 149 in __call__
  app_iter = self.application(environ, sr_checker)
File '/mnt/galaxy/galaxy-dist/eggs/Paste-1.7.5.1-py2.6.egg/paste/recursive.py', line 84 in __call__
  return self.application(environ, start_response)
File '/mnt/galaxy/galaxy-dist/eggs/Paste-1.7.5.1-py2.6.egg/paste/httpexceptions.py', line 633 in __call__
  return self.application(environ, start_response)
File '/mnt/galaxy/galaxy-dist/lib/galaxy/web/framework/base.py', line 132 in __call__
  return self.handle_request( environ, start_response )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/web/framework/base.py', line 190 in handle_request
  body = method( trans, **kwargs )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/web/framework/__init__.py', line 229 in decorator
  return func( self, trans, *args, **kwargs )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/controllers/admin.py', line 531 in manage_users_and_groups_for_quota
  quota, params = self._quota_op( trans, 'quota_members_edit_button', self._manage_users_and_groups_for_quota, kwd )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/controllers/admin.py', line 664 in _quota_op
  message = op_method( quota, params )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/actions/admin.py', line 80 in _manage_users_and_groups_for_quota
  self.app.quota_agent.set_entity_quota_associations( quotas=[ quota ], users=in_users, groups=in_groups )
AttributeError: 'NoQuotaAgent' object has no attribute 'set_entity_quota_associations'

CGI Variables
CONTENT_LENGTH: '-1'
CONTENT_TYPE: 'application/x-www-form-urlencoded'
HTTP_ACCEPT: 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
HTTP_ACCEPT_ENCODING: 'gzip, deflate'
HTTP_ACCEPT_LANGUAGE: 'en-gb,en;q=0.7,en-us;q=0.3'
HTTP_CONNECTION: 'Keep-Alive'
HTTP_COOKIE: 'galaxysession=fed9265fae01dfd85c30fe1bbf5a988ae27edcb17a4961e7cef7bef77b0eeb9be91a902d5f0aacdf'
HTTP_DNT: '1'
HTTP_HOST: 'ppserver'
HTTP_REFERER: 'http://ppserver/galaxy/admin/manage_users_and_groups_for_quota?webapp=galaxy&id=7a72206f0ccffc2e'
HTTP_USER_AGENT: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0'
PATH_INFO: '/admin/manage_users_and_groups_for_quota'
QUERY_STRING: 'id=7a72206f0ccffc2e'
REMOTE_ADDR: '143.234.97.120'
REQUEST_METHOD: 'POST'
SCRIPT_NAME: '/galaxy'
SERVER_NAME: 'xxx'
SERVER_PORT: '8080'
SERVER_PROTOCOL: 'HTTP/1.1'

WSGI Variables
application: paste.recursive.RecursiveMiddleware object at 0x910c890
is_api_request: False
paste.cookies: (SimpleCookie: galaxysession='fed9265fae01dfd85c30fe1bbf5a988ae27edcb17a4961e7cef7bef77b0eeb9be91a902d5f0aacdf', 'galaxysession=fed9265fae01dfd85c30fe1bbf5a988ae27edcb17a4961e7cef7bef77b0eeb9be91a902d5f0aacdf')
paste.expected_exceptions: [class 'paste.httpexceptions.HTTPException']
paste.httpexceptions: paste.httpexceptions.HTTPExceptionHandler object at 0x783fd50
paste.httpserver.thread_pool: paste.httpserver.ThreadPool object at 0x30eea90
paste.parsed_querystring: ([('id', '7a72206f0ccffc2e')], 'id=7a72206f0ccffc2e')
paste.recursive.forward: paste.recursive.Forwarder from /galaxy
paste.recursive.include: paste.recursive.Includer from /galaxy
paste.recursive.include_app_iter: paste.recursive.IncluderAppIter from /galaxy
paste.recursive.script_name: '/galaxy'
paste.throw_errors: True
request_id: '744ee0cc4ac911e3bbcc0026b95a4d69'
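The traceback shows a call to set_entity_quota_associations on an object of type NoQuotaAgent that does not define it. The sketch below reproduces that failure pattern in isolation and shows one defensive way a caller could guard against it. The two classes are simplified stand-ins written for this illustration (only the class name NoQuotaAgent and the method name come from the traceback); they are not Galaxy's implementations, and update_quota_members is a hypothetical helper.

```python
class NoQuotaAgent(object):
    """Stand-in for a 'no quotas' agent that lacks the setter (illustrative only)."""

    def get_quota(self, user):
        return None


class QuotaAgent(NoQuotaAgent):
    """Stand-in for an agent that does support editing associations."""

    def __init__(self):
        self.associations = {}

    def set_entity_quota_associations(self, quotas=None, users=None, groups=None):
        for quota in quotas or []:
            self.associations[quota] = (list(users or []), list(groups or []))


def update_quota_members(agent, quota, users, groups):
    """Hypothetical caller that checks for the method before using it."""
    setter = getattr(agent, "set_entity_quota_associations", None)
    if setter is None:
        # Report the problem instead of letting AttributeError become a 500 page.
        return "Quota agent %s cannot edit associations" % type(agent).__name__
    setter(quotas=[quota], users=users, groups=groups)
    return "Quota '%s' updated" % quota


print(update_quota_members(NoQuotaAgent(), "default", [], []))
print(update_quota_members(QuotaAgent(), "default", ["alice"], []))
```

Whether the right fix is a guard like this or a no-op method on the "no quota" class is a design decision for the Galaxy developers; the thread does not say which was chosen.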
[galaxy-dev] Trying to map fastq reads with bowtie for illumina to the reference genome sacCer3
Hello!

I have SacCer uploaded to Galaxy in FASTA format; however, to use it as a mapping genome with Bowtie for Illumina, it says it must be in FASTQ format. Do you know a way that I can convert this file, or what I can do to make this work?

Thank you very much for your time!

Sincerely,
Frank
[galaxy-dev] Converting .gff3 to 12-column .bed
Hello Galaxy:

I am trying overall to convert a .gff3 file to a 12-column .bed file. I first tried the GFF-to-BED converter, but it gave a 6-column .bed file. Then, I tried the BED-to-bigBed converter by inputting the 6-column .bed file. I get an error: "Unspecified genome build, click the pencil icon in the history item to set the genome build."

So, I click the pencil icon, and see 4 tabs at the top. I set the Attributes tab as in the attached image (Attributes.png). But then, when I select Convert Format, the only option I see that outputs a .bed12 file is "Convert Genomic Intervals to Strict BED12". I am a bit confused about this because I specified the input file as a .bed file (and not genomic intervals, unless I am misunderstanding something).

In any case, when I select "Convert Genomic Intervals to Strict BED12", I do get a .bed file with 12 columns. But I would like to ask if I may have lost information going from the .gff3 to .bed(6) to .bed(12)? (I feel that scores were all set to 0 from .gff3 to .bed(6), and columns 10, 11, 12 (block counts, sizes, and starting positions) were all set to zero going from .bed(6) to .bed(12)).

If I am correct that there is information loss, is there a system in Galaxy to prevent this, and transfer as much information as possible from .gff3 to .bed(12)?

Thank you.
L. Rutter

**

Below is a head of my three files (the species is P. dominula):

.gff3 file
##gff-version 3
##date Mon Nov 4 14:54:42 2013
##source gbrowse gbgff gff3 dumper
PdomScaf0001  maker  gene  15   1963  .       -  .  Name=PdomGene00025;ID=1;Dbxref=MAKER:maker-PdomScaf0001-snap-gene-0.274
PdomScaf0001  maker  mRNA  15   1963  .       -  .  Name=PdomMRNA00025.1;Parent=1;ID=2;_QI=216%7C0%7C0.2%7C0.6%7C0.5%7C0.6%7C5%7C0%7C98;_eAED=0.43;_AED=0.43;Dbxref=MAKER:maker-PdomScaf0001-snap-gene-0.274-mRNA-1
PdomScaf0001  maker  exon  15   100   -0.094  -  .  Parent=2;ID=3
PdomScaf0001  maker  CDS   15   100   .       -  2  Parent=2;ID=4
PdomScaf0001  maker  exon  223  300   21.8    -  .  Parent=2;ID=5
PdomScaf0001  maker  CDS   223  300   .       -  2  Parent=2;ID=6
PdomScaf0001  maker  exon  717  765   22.4    -  .  Parent=2;ID=7

.bed(6) file
PdomScaf0001  14   1963  gene  0  -
PdomScaf0001  14   1963  mRNA  0  -
PdomScaf0001  14   100   exon  0  -
PdomScaf0001  14   100   CDS   0  -
PdomScaf0001  222  300   exon  0  -
PdomScaf0001  222  300   CDS   0  -
PdomScaf0001  716  765   exon  0  -
PdomScaf0001  716  765   CDS   0  -
PdomScaf0001  906  947   exon  0  -
PdomScaf0001  906  947   CDS   0  -

.bed(12) file
PdomScaf0001  14   1963  gene  0  -  14   1963  0  0  ,  ,
PdomScaf0001  14   1963  mRNA  0  -  14   1963  0  0  ,  ,
PdomScaf0001  14   100   exon  0  -  14   100   0  0  ,  ,
PdomScaf0001  14   100   CDS   0  -  14   100   0  0  ,  ,
PdomScaf0001  222  300   exon  0  -  222  300   0  0  ,  ,
PdomScaf0001  222  300   CDS   0  -  222  300   0  0  ,  ,
PdomScaf0001  716  765   exon  0  -  716  765   0  0  ,  ,
PdomScaf0001  716  765   CDS   0  -  716  765   0  0  ,  ,
PdomScaf0001  906  947   exon  0  -  906  947   0  0  ,  ,
PdomScaf0001  906  947   CDS   0  -  906  947   0  0  ,  ,
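The start coordinates in the .bed(6) listing are one less than in the .gff3 listing (15 becomes 14) because GFF3 uses 1-based inclusive coordinates while BED uses 0-based half-open ones. The sketch below shows that per-feature conversion for a single tab-separated GFF3 line. It is an illustration only, not the code of Galaxy's GFF-to-BED converter, and the function name gff3_to_bed6 is hypothetical. Unlike the posted output it carries the GFF3 score through unchanged (with "." mapped to 0), simply to make visible where the score would otherwise be dropped.

```python
def gff3_to_bed6(line):
    """Convert one tab-separated GFF3 feature line to a list of BED6 fields."""
    seqid, _source, ftype, start, end, score, strand, _phase, _attrs = (
        line.rstrip("\n").split("\t")
    )
    # GFF3 is 1-based and inclusive; BED is 0-based and half-open,
    # so only the start coordinate shifts.
    bed_start = int(start) - 1
    # Strict BED expects an integer score from 0 to 1000; a real converter
    # may need to rescale or zero values such as 21.8 or -0.094.
    bed_score = "0" if score == "." else score
    return [seqid, str(bed_start), end, ftype, bed_score, strand]


gene = "PdomScaf0001\tmaker\tgene\t15\t1963\t.\t-\t.\tName=PdomGene00025;ID=1"
print("\t".join(gff3_to_bed6(gene)))
```

Converting each feature independently like this yields one BED line per GFF3 line, which is why the parent/child structure (gene, mRNA, exon) is not visible in the six-column output.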
Re: [galaxy-dev] Using input dataset names in output dataset names
I have not tested the patch, just read it, but won't this result in dataset names like:

Some operation on data 27 (Some operation on data 26 (Some other operation on data 25 (...(...(...

(avoiding this is why we came up with HIDs in the first place).

-- James Taylor, Associate Professor, Biology/CS, Emory University

On Mon, Nov 11, 2013 at 5:42 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
> Would this patch be welcomed as a pull request? (Expanding $on_string to include the name as well as dataset number when there is only one input dataset)
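The growth James describes can be seen with a small simulation. The sketch below assumes each tool labels its output "<tool> on data <hid> (<input name>)", as in the proposed patch, and feeds each output into the next tool; the helper names and the tool names are made up for illustration.

```python
def name_with_input_name(tool, hid, input_name):
    """Label that embeds the input dataset's name (patched style)."""
    return "%s on data %d (%s)" % (tool, hid, input_name)


def name_with_hid_only(tool, hid):
    """Label that refers to the input only by its history ID (current style)."""
    return "%s on data %d" % (tool, hid)


def simulate(tools, first_hid=1, first_name="My Genes"):
    """Run tools in sequence, each consuming the previous output."""
    nested, flat = [], []
    current_name = first_name
    hid = first_hid
    for tool in tools:
        current_name = name_with_input_name(tool, hid, current_name)
        nested.append(current_name)
        flat.append(name_with_hid_only(tool, hid))
        hid += 1  # the new output becomes the next input
    return nested, flat


nested, flat = simulate(["Convert", "Filter", "Sort"])
for long_name, short_name in zip(nested, flat):
    print("%-20s | %s" % (short_name, long_name))
```

Each step wraps the previous name in another pair of parentheses, so the patched label grows with the depth of the analysis while the HID-only label stays the same length.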
Re: [galaxy-dev] Tool Shed packages for BLAST+ binaries
This is not an objection, but do we need bash? Can we live with POSIX sh? We should ask this question about every requirement we introduce.

(bash is not part of the default installation of FreeBSD or OpenBSD, for example. bash is unfortunately licensed under GPLv3, so if you are trying to create an OS not polluted by viral licensing you don't get bash.)

On Mon, Oct 7, 2013 at 11:36 AM, John Chilton chil...@msi.umn.edu wrote:
> My own preference is that we specify at least /bin/sh and /bin/bash are available before utilizing the tool shed. Is there an objection to this from any corner? Is there realistically a system that Galaxy should support that will not have /bin/bash available?

-- James Taylor, Associate Professor, Biology/CS, Emory University
Re: [galaxy-dev] Using input dataset names in output dataset names
On Mon, Nov 11, 2013 at 4:09 PM, James Taylor ja...@jamestaylor.org wrote:
> I have not tested the patch, just read it, but won't this result in dataset names like:
>
> Some operation on data 27 (Some operation on data 26 (Some other operation on data 25 (...(...(...

Potentially - it depends on how the tools use $on_string. If the tools added a postscript you'd get:

Original dataset (as tabular) (filtered) (...)

Neither is ideal. I'd prefer to see something more like this tag idea: https://trello.com/c/JnhOEqow

What about my suggestion that for simple format conversion tools we simply reuse the input dataset's name unchanged (without text about the conversion)? That seems a good compromise.

> (avoiding this is why we came up with HIDs in the first place).

I don't like the HIDs - unlike dataset names, the HIDs are not entirely reproducible - they depend on the order of upload, was it a clear history, etc.

Peter
Re: [galaxy-dev] Using input dataset names in output dataset names
*I had been composing this e-mail for a while so it is a lot awkward given this morning's responses, but felt it best to just get it out there rather than continue to bake the ideas :)*

If I were not employed by Penn State, I would say you guys should be using galaxy-extras - these problems are all solved by multiple file datasets :), but since I am, I am not going to mention that.

I agree with Peter, the tag idea is probably a better way to get around this and probably represents an improvement on HIDs. There are a lot of open tickets related to things like this, so I have picked one at random and sketched out what I think the path forward should maybe be: https://trello.com/c/dQA7Y5vS

As mentioned by James, the problem with Peter's first attached patch is that after several iterations the name gets bigger and bigger. The tags patch put together, or at least linked to, by Bjoern should limit the size of output names over a workflow, right? The downside is that it is not used by default - tool authors have to use it.

So, my vote would be to combine the approaches. Specify this new labeling attribute (I would call it on_name_tag_string instead of on_tag_string because tags have other meanings in Galaxy), then provide a Galaxy configuration option that would use this instead of on_string by default for all tools (or maybe just replace on_string with on_name_tag_string) so that tools that explicitly use on_string would pick up the enhancements as well. Galaxy Main wouldn't have to change its default, but institutions that deem the name tag more important could.

-John
Re: [galaxy-dev] Converting .gff3 to 12-column .bed
Hello,

There are no tools directly on the public Galaxy site to transform a GFF3 dataset into a BED12 dataset. However, the Tool Shed has a repository called 'fml_gff3togtf' that includes a tool for this purpose, for use in a local install. The description is a bit bothersome in that it has a slightly incorrect datatype statement, so be sure to test out the results. (The word "wiggle" has no place in this statement: "gff3_to_bed_converter.py: This tool converts gene transcript annotation from GFF3 format to UCSC wiggle 12 column BED format.")
http://getgalaxy.org
http://usegalaxy.org/toolshed

I see your post at Biostar, and it might be helpful to let you know what a BED12 file represents (plus I'll post this there, may help others):
http://www.biostars.org/p/85869/

A BED12 file describes the complete, often spliced, alignment of a sequence to a reference genome. This does not include minor base variation; it is a macro alignment. You can think of each of the blocks as being exons, although there is no magic here - if the sequence or genome had quality problems, or significant variation (large insertion or deletion), that could cause the alignment to fragment as well. Here is the data description:
http://wiki.galaxyproject.org/Learn/Datatypes#Bed

To see examples, at UCSC (genome.ucsc.edu), any EST or mRNA track will have this as the primary table format. All gene tracks can also be in BED12 format, or in a related one, genePred:
http://genome.ucsc.edu/FAQ/FAQformat.html#format9

UCSC also has command-line utilities to convert between the formats; pre-compiled versions are here:
http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads

Either way, you can convert the data, then load it into the public Galaxy (usegalaxy.org) and proceed with your analysis. BEDTools works well with BED12 files.

There is definitely information loss attempting to transform BED6 -> BED12, as the global alignment is lost. Adjusting attributes such as score or name is often a matter of preference, so you can alter these however you want, as long as the attribute formatting rules for the columns are followed.

Hopefully this helps,
Jen
Galaxy team

On 11/9/13 3:29 PM, lrutter @iastate.edu wrote:
> If I am correct that there is information loss, is there a system in Galaxy to prevent this, and transfer as much information as possible from .gff3 to .bed(12)?
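To make the "blocks as exons" description concrete, here is a minimal sketch that assembles one BED12 line from the exons of a single transcript. It is not the fml_gff3togtf tool or any UCSC utility; the function name exons_to_bed12 is hypothetical. It assumes exon coordinates are given in GFF3 style (1-based, inclusive), sets thickStart/thickEnd to the full transcript span rather than the CDS, and leaves itemRgb at 0.

```python
def exons_to_bed12(chrom, name, strand, exons, score=0):
    """Build a BED12 line from (start, end) exon pairs in 1-based inclusive coords."""
    # Convert to 0-based half-open coordinates and order along the chromosome.
    blocks = sorted((start - 1, end) for start, end in exons)
    chrom_start = blocks[0][0]
    chrom_end = blocks[-1][1]
    # Columns 11 and 12: comma-terminated lists of block sizes and of
    # block starts measured relative to chromStart.
    block_sizes = ",".join(str(end - start) for start, end in blocks) + ","
    block_starts = ",".join(str(start - chrom_start) for start, _end in blocks) + ","
    fields = [
        chrom, chrom_start, chrom_end, name, score, strand,
        chrom_start, chrom_end,  # thickStart, thickEnd (assumption: whole span)
        0,                       # itemRgb
        len(blocks), block_sizes, block_starts,
    ]
    return "\t".join(str(field) for field in fields)


# The three exons visible in the posted .gff3 head; the transcript has more.
print(exons_to_bed12("PdomScaf0001", "PdomMRNA00025.1", "-",
                     [(15, 100), (223, 300), (717, 765)]))
```

Because only the first three exons from the posted head are used, the resulting line ends at 765 rather than at the mRNA's real end of 1963; with the full exon list the first block would start at chromStart and the last block would end at chromEnd, as the format requires.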
Re: [galaxy-dev] Samtools and idxstats
On Thu, Nov 7, 2013 at 4:51 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
> Hi Michiel,
> Did you finish wrapping samtools idxstats? I can't see it on the Tool Shed... If not, I may tackle this shortly.
> Peter

I've implemented a 'samtools idxstats' wrapper, which seems to be working for me:
https://github.com/peterjc/pico_galaxy/blob/master/tools/samtools_idxstats/samtools_idxstats.xml

Test Tool Shed:
http://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats

Main Tool Shed (pending - I want to see the overnight test results on the Test Tool Shed first):
http://toolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats

Regards,
Peter
Re: [galaxy-dev] Converting .gff3 to 12-column .bed
Hi,

I am the author of the fml_gff3togtf tool package, which is currently merged into our instance at http://galaxy.cbio.mskcc.org. The tool can be accessed with the following link:
https://galaxy.cbio.mskcc.org/tool_runner?tool_id=fml_gff2bed

--/Vipin
Sloan-Kettering Institute
http://galaxy.cbio.mskcc.org

On Mon, Nov 11, 2013 at 12:15 PM, Jennifer Jackson j...@bx.psu.edu wrote:
> However, the Tool Shed has a repository called 'fml_gff3togtf' that includes a tool for this purpose, for use in a local install.
[galaxy-dev] DRMAA/SGE job handling regression?
Hello all,

On our main Galaxy tracking galaxy-dist using DRMAA/SGE, jobs submitted to the cluster and queued and waiting (qw) are correctly shown in Galaxy as grey pending entries in the history.

With my test instance tracking galaxy-central (along with a new visual look and new icons), such jobs are wrongly shown as yellow (running). Is this a general regression affecting other people?

There also seem to be issues where killing a job in Galaxy just hides it but it remains running (yellow once you tick "show deleted datasets", and running on SGE too). This was working properly on galaxy-dist (the job was killed on the cluster, and shown as red if you ticked "show deleted datasets").

Thanks,
Peter
Re: [galaxy-dev] SLURM and hidden success
This is really odd. I see no code in the job runner stuff at all that could cause this behavior outside the context of the dataset being marked hidden as part of a workflow, let alone something DRM-specific that could cause this.

Are you rerunning an existing job that has been marked this way in a workflow? Does this happen if you click on new tools outside the context of workflows or past jobs? Can you find the corresponding datasets via the history API or in the database and determine whether they do have visible set to False? That, I guess, is what should cause a dataset to become hidden.

-John

On Fri, Nov 8, 2013 at 11:40 AM, Andrew Warren anwar...@vbi.vt.edu wrote:

Hello all,

We are in the process of switching from SGE to SLURM for our galaxy setup. We are currently experiencing a problem where jobs that are completely successful (no text in their stderr file and the proper exit code) are being hidden after the job completes. Any job that fails or has some text in the stderr file is not hidden (note: hidden, not deleted; they can be viewed by selecting 'Unhide Hidden Datasets').

Our drmaa.py is at changeset 10961:432999eabbaa
Our drmaa egg is at drmaa = 0.6
And our SLURM version is 2.3.5
And we are currently passing no parameters for default_cluster_job_runner = drmaa:///

We have the same code base on both clusters but only observe this behavior when using SLURM. Any pointers or advice would be greatly appreciated.

Thanks,
Andrew
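The check suggested above (look the datasets up through the history API and see whether visible is False) could be sketched as follows. This assumes the history contents endpoint returns a JSON list whose entries carry state, visible and deleted keys. The key names and the helper functions are assumptions, and some releases may only expose them on the per-dataset detail view.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def hidden_ok_datasets(contents):
    """Return entries that finished successfully but are hidden (not deleted)."""
    return [
        item for item in contents
        if item.get("state") == "ok"
        and item.get("visible") is False
        and not item.get("deleted", False)
    ]


def fetch_history_contents(base_url, history_id, api_key):
    """Fetch the contents listing of one history (needs a reachable server)."""
    query = urlencode({"key": api_key})
    url = "%s/api/histories/%s/contents?%s" % (base_url.rstrip("/"), history_id, query)
    with urlopen(url) as response:
        return json.load(response)
```

Usage would be hidden_ok_datasets(fetch_history_contents(base_url, history_id, api_key)), with the three arguments supplied for the local instance.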
Re: [galaxy-dev] Error editing users associated with a quota
Hey Peter,

Does your universe_wsgi.ini have enable_quotas = True set?

-John

On Mon, Nov 11, 2013 at 6:08 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

Hi all,

I'm trying to remove a user from our default quota (while they run a large analysis), but am getting an error:

Internal Server Error
Galaxy was unable to successfully complete your request
An error occurred. This may be an intermittent problem due to load or other unpredictable factors, reloading the page may address the problem. The error has been logged to our team.

Has anyone else seen this?

Thanks,
Peter

$ hg branch
stable
[galaxy@ppserver galaxy-dist]$ hg head | more
changeset: 11219:5c789ab4144a
branch: stable
tag: tip
user: Nate Coraor n...@bx.psu.edu
date: Mon Nov 04 15:04:42 2013 -0500
summary: Added tag release_2013.11.04 for changeset 26f58e05aa10

changeset: 11216:c458a0fe1ba8
parent: 11213:6d633418ecfa
parent: 11215:f79149dd3d35
user: Nate Coraor n...@bx.psu.edu
date: Mon Nov 04 14:56:57 2013 -0500
summary: Merge security fix for filtering tools from stable/next-stable.
From the pasteur.log file,

xxx.xxx.xxx.xxx - - [11/Nov/2013:12:04:45 +0100] POST /galaxy/admin/manage_users_and_groups_for_quota?id=7a72206f0ccffc2e HTTP/1.1 500 - http://xxx/galaxy/admin/manage_users_and_groups_for_quota?webapp=galaxyid=7a72206f0ccffc2e; Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0

Error - type 'exceptions.AttributeError': 'NoQuotaAgent' object has no attribute 'set_entity_quota_associations'
URL: http://ppserver/galaxy/admin/manage_users_and_groups_for_quota?id=7a72206f0ccffc2e
File '/mnt/galaxy/galaxy-dist/lib/galaxy/web/framework/middleware/error.py', line 149 in __call__
  app_iter = self.application(environ, sr_checker)
File '/mnt/galaxy/galaxy-dist/eggs/Paste-1.7.5.1-py2.6.egg/paste/recursive.py', line 84 in __call__
  return self.application(environ, start_response)
File '/mnt/galaxy/galaxy-dist/eggs/Paste-1.7.5.1-py2.6.egg/paste/httpexceptions.py', line 633 in __call__
  return self.application(environ, start_response)
File '/mnt/galaxy/galaxy-dist/lib/galaxy/web/framework/base.py', line 132 in __call__
  return self.handle_request( environ, start_response )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/web/framework/base.py', line 190 in handle_request
  body = method( trans, **kwargs )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/web/framework/__init__.py', line 229 in decorator
  return func( self, trans, *args, **kwargs )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/controllers/admin.py', line 531 in manage_users_and_groups_for_quota
  quota, params = self._quota_op( trans, 'quota_members_edit_button', self._manage_users_and_groups_for_quota, kwd )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/controllers/admin.py', line 664 in _quota_op
  message = op_method( quota, params )
File '/mnt/galaxy/galaxy-dist/lib/galaxy/actions/admin.py', line 80 in _manage_users_and_groups_for_quota
  self.app.quota_agent.set_entity_quota_associations( quotas=[ quota ], users=in_users, groups=in_groups )
AttributeError: 'NoQuotaAgent' object has no attribute 'set_entity_quota_associations'

CGI Variables
  CONTENT_LENGTH: '-1'
  CONTENT_TYPE: 'application/x-www-form-urlencoded'
  HTTP_ACCEPT: 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
  HTTP_ACCEPT_ENCODING: 'gzip, deflate'
  HTTP_ACCEPT_LANGUAGE: 'en-gb,en;q=0.7,en-us;q=0.3'
  HTTP_CONNECTION: 'Keep-Alive'
  HTTP_COOKIE: 'galaxysession=fed9265fae01dfd85c30fe1bbf5a988ae27edcb17a4961e7cef7bef77b0eeb9be91a902d5f0aacdf'
  HTTP_DNT: '1'
  HTTP_HOST: 'ppserver'
  HTTP_REFERER: 'http://ppserver/galaxy/admin/manage_users_and_groups_for_quota?webapp=galaxyid=7a72206f0ccffc2e'
  HTTP_USER_AGENT: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0'
  PATH_INFO: '/admin/manage_users_and_groups_for_quota'
  QUERY_STRING: 'id=7a72206f0ccffc2e'
  REMOTE_ADDR: '143.234.97.120'
  REQUEST_METHOD: 'POST'
  SCRIPT_NAME: '/galaxy'
  SERVER_NAME: 'xxx'
  SERVER_PORT: '8080'
  SERVER_PROTOCOL: 'HTTP/1.1'

WSGI Variables
  application: paste.recursive.RecursiveMiddleware object at 0x910c890
  is_api_request: False
  paste.cookies: (SimpleCookie: galaxysession='fed9265fae01dfd85c30fe1bbf5a988ae27edcb17a4961e7cef7bef77b0eeb9be91a902d5f0aacdf', 'galaxysession=fed9265fae01dfd85c30fe1bbf5a988ae27edcb17a4961e7cef7bef77b0eeb9be91a902d5f0aacdf')
  paste.expected_exceptions: [class 'paste.httpexceptions.HTTPException']
  paste.httpexceptions: paste.httpexceptions.HTTPExceptionHandler object at 0x783fd50
  paste.httpserver.thread_pool: paste.httpserver.ThreadPool object at 0x30eea90
  paste.parsed_querystring: ([('id', '7a72206f0ccffc2e')], 'id=7a72206f0ccffc2e')
  paste.recursive.forward: paste.recursive.Forwarder from
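The traceback and the enable_quotas question fit together if a stand-in agent is used whenever quotas are disabled and that stand-in lacks the setter methods. The following is a simplified illustration of that pattern, not Galaxy's actual code. Only the class names and the method name come from the traceback; the bodies, build_quota_agent and edit_quota_members are hypothetical.

```python
class NoQuotaAgent:
    """Stand-in used when quotas are disabled: answers queries, has no setters."""

    def get_quota(self, user):
        return None  # no limit applies


class QuotaAgent(NoQuotaAgent):
    """Agent used when quotas are enabled; adds the editing methods."""

    def __init__(self):
        self.associations = {}

    def set_entity_quota_associations(self, quotas=(), users=(), groups=()):
        for quota in quotas:
            self.associations[quota] = (list(users), list(groups))


def build_quota_agent(enable_quotas):
    """Pick the agent from the enable_quotas setting."""
    return QuotaAgent() if enable_quotas else NoQuotaAgent()


def edit_quota_members(agent, quota, users, groups):
    """Guarded version of the failing call: report instead of raising."""
    if not hasattr(agent, "set_entity_quota_associations"):
        return "Quotas are disabled (enable_quotas is not True); nothing changed."
    agent.set_entity_quota_associations(quotas=[quota], users=users, groups=groups)
    return "Quota '%s' updated." % quota
```

Calling the setter directly on the object returned by build_quota_agent(False) raises the same AttributeError as in the log, which is why the enable_quotas setting is the first thing to check.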
Re: [galaxy-dev] SLURM and hidden success
Hi John,

Thanks so much for the reply. After investigating this more today it turns out that, as you might have suspected, SLURM was a red herring. Despite our attempts to faithfully rsync everything between the two servers, it looks like there was a problem with our workflows in the new database. Strangely, every single workflow that was previously created had a hide action set for its outputs, despite the past and present configuration of the tool wrappers. Newly created workflows do not display this behavior. The steps with errors were still being displayed, despite the workflow weirdness, thanks to an error check in lib/galaxy/jobs/actions/post.py.

Thanks again,
Andrew

On Mon, Nov 11, 2013 at 1:39 PM, John Chilton chil...@msi.umn.edu wrote:

This is really odd. I see no code in the job runner stuff at all that could cause this behavior outside the context of the dataset being marked hidden as part of a workflow, let alone something DRM-specific that could cause this.

Are you rerunning an existing job that has been marked this way in a workflow? Does this happen if you click on new tools outside the context of workflows or past jobs? Can you find the corresponding datasets via the history API or in the database and determine whether they do have visible set to False? That, I guess, is what should cause a dataset to become hidden.

-John

On Fri, Nov 8, 2013 at 11:40 AM, Andrew Warren anwar...@vbi.vt.edu wrote:

Hello all,

We are in the process of switching from SGE to SLURM for our galaxy setup. We are currently experiencing a problem where jobs that are completely successful (no text in their stderr file and the proper exit code) are being hidden after the job completes. Any job that fails or has some text in the stderr file is not hidden (note: hidden, not deleted; they can be viewed by selecting 'Unhide Hidden Datasets').
Our drmaa.py is at changeset 10961:432999eabbaa Our drmaa egg is at drmaa = 0.6 And our SLURM version is 2.3.5 And we are currently passing no parameters for default_cluster_job_runner = drmaa:/// We have the same code base on both clusters but only observe this behavior when using SLURM. Any pointers or advice would be greatly appreciated. Thanks, Andrew
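The resolution of this thread (a hide action on every workflow output, with errored steps staying visible because of an error check) can be sketched like this. The Dataset class and apply_hide_action are hypothetical stand-ins written for illustration; they are not the contents of lib/galaxy/jobs/actions/post.py.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    state: str          # e.g. "ok" or "error"
    visible: bool = True


def apply_hide_action(datasets):
    """Hide every output of a step unless it ended in the error state."""
    for dataset in datasets:
        if dataset.state != "error":
            dataset.visible = False
    return datasets
```

With such a check in place, successful outputs vanish from the default history view while failed ones remain, which matches the symptom that was first attributed to SLURM.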
Re: [galaxy-dev] Tool Shed packages for BLAST+ binaries
Peter,

Thanks for bringing this to our attention. We're working on fixing a number of issues with the test framework, and hope to have more information for you tomorrow.

--Dave B.

On 2013-11-11 06:11, Peter Cock wrote:

Hi Dave,

I think this is a new regression from the Test Tool Shed installation framework,
http://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus/f2478dc77ccb

Tool test results
Automated test environment
Tools missing tests or test data
Installation errors - no functional tests were run for any tools in this changeset revision
Tool dependencies
Type Name Version
blast+ package 2.2.27
Error (Invalid file %s specified, ignoring set_environment_for_install action.,/ToolDepsTest/blast+/2.2.27/iuc/package_blast_plus_2_2_27/eab09bc4d63e/env.sh)

Is that really a comma in the path? Is this a simple typo in a config file?

Note this tool is using:
http://testtoolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_27/eab09bc4d63e
(using platform specific actions)

--

Meanwhile, over on the main Tool Shed, which I would like to update to use arch-specific actions for the packages (since that is now in the stable Galaxy releases), and switch from BLAST+ 2.2.26 to 2.2.27, we have a different install failure:

http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/70e7dcbf6573
still using 2.2.26 via shell magic,
http://toolshed.g2.bx.psu.edu/view/iuc/package_blast_plus_2_2_26/40c69b76b46e

/bin/sh: 1: blastn: not found

Peter
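One plausible reading of the error text quoted above, offered only as a guess: the comma is not part of the path, and the message template and the path were reported as a pair instead of the path being substituted for %s. The sketch below reproduces that effect with hypothetical function names and a made-up path; it says nothing about where in the Tool Shed code this happens.

```python
TEMPLATE = "Invalid file %s specified, ignoring set_environment_for_install action."


def report_unformatted(template, path):
    """Stringify the (template, path) pair: the %s placeholder survives."""
    return str((template, path))


def report_formatted(template, path):
    """Substitute the path into the template, as was presumably intended."""
    return template % path
```

For a path such as /tmp/example/env.sh, the first function yields a parenthesised pair with a comma between the message and the path, while the second yields a single sentence containing the path.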
[galaxy-dev] Galaxy environment on local resources
Hello all,

Feel free to point it out if I have missed something obvious; I have done a fair bit of investigation and haven't quite found the solution yet.

We have some hardware that has been around for a while for the purpose of processing genetic data and other related tasks. To this end Galaxy fits the bill nicely, in that it enables researchers to analyse data without being Linux geeks. The problem I have is that while our hand-built Galaxy server (running on SUSE for historical reasons) works up to a point, it is difficult to maintain, and installing new tools and reference genomes is fiddly at best, given that our server doesn't conform to the way the instructions for other systems expect it to work.

We have had success using Cloudman on AWS to run training on how to use Galaxy, and I would like to know more about how to customise an instance to contain all of the tools we need (mostly the NGS tools) by default.

Ultimately we won't be able to use AWS to process much of the real data we have, because ethics agreements require us to keep the data we are processing in-house. Fortunately we do have access to a modest pool of hardware (which is about to get bigger) to implement some kind of private cloud solution.

How would I go about setting up a private cloud version of the Cloudman-style on-demand Galaxy system, where researchers can start an instance, have it connect to a shared storage volume, process some data and then terminate the instance? And is this even the best way to go?

I have found that it should be possible to use the scripts to install and configure the Galaxy instances, but I have not found any information on how to set up the environment that is required to make this work as a private cloud. Conversely, I have found information about some private cloud scenarios such as Eucalyptus and OpenStack, but have not been able to join the dots to determine whether and how I can make the Cloudman/Galaxy use case work on them.
I should mention that I'm primarily a Windows sysadmin (who dabbles in Linux) who is looking at this due to the lack of a dedicated Linux admin. At the end of the day I need to be able to set up this system and make it as low-maintenance as possible whilst being useful and accessible to the researchers, who aren't Linux admins.

Any advice gratefully accepted.

Regards,
Alistair