Re: [galaxy-dev] SQLalchemy InvalidRequestError
/properties.py, line 998, in _process_dependent_arguments
    self.target = self.mapper.mapped_table
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/util/langhelpers.py, line 494, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/properties.py, line 891, in mapper
    mapper_ = mapper.class_mapper(self.argument(),
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/ext/declarative.py, line 1428, in return_cls
    (prop.parent, arg, n.args[0], cls)
InvalidRequestError: When initializing mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name (name 'Message' is not defined). If this is a class name, consider adding this relationship() to the class 'kombu.transport.sqlalchemy.Queue' class after both dependent classes have been defined.
---

After that it starts throwing the exception in monitor_step that I previously posted.

Has anyone seen a potentially related issue? Would an update to the latest galaxy code help? I see there are newer versions of SQLAlchemy available. Are they part of a newer code base?

Thanks a lot for your help
Ulf

On 07/10/14 12:20, Ulf Schaefer wrote:

Update: The usual switching it off and on again (server reboot) has resolved the problem (for now), albeit in a rather unsatisfactory manner. If there are any insights into what caused this behaviour and how it can be avoided in the future I'd be more than happy to hear them.

Cheers
Ulf

On 07/10/14 11:04, Dannon Baker wrote:

One per second? Can you tell me more about your configuration? This is an odd bug with multiple mapper initialization that I haven't been able to reproduce yet, so any information will help. Database configuration, number of processes, etc.
On Oct 7, 2014 11:46 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:

Dear all

Maybe one of you can shed some light on this error message that I see in the log file for one of my handler processes. I get about one of them per second. The effect is that most of the jobs remain in the waiting to run stage. The postgres database is running on a separate server and appears to be doing just fine.

Any help is greatly appreciated.

Thanks
Ulf
Re: [galaxy-dev] SQLalchemy InvalidRequestError
Hi Dannon

Yes. The database is running on a different server from the server running Galaxy. They are both VMs running CentOS (6.5 on the Galaxy server, 6.2 on the database server). The postgres version is 8.4.9 and the database size is 712,161,040. I suspect that is not very large compared to some others.

There are a number of other databases running on the same server; the one most frequently used is for our test Galaxy server, which runs on yet another VM. This one is much smaller (25,319,184). Both servers are on the same subnet. The problem is with our production Galaxy (of course).

Are there any instructions around on how to set up rabbitmq for my Galaxy?

Thanks for looking into this.
Ulf

On 08/10/14 11:26, Dannon Baker wrote:

Hi again Ulf,

Thanks for the info. A few questions to help me track this down: Does the postgres database reside on a remote box from galaxy? And is it very large?

Running the latest galaxy may not change anything related to this particular issue, but you could always try it. Sqlalchemy is fixed at the latest version we can currently support without reworking how migration scripts function (which we will do, moving to Alembic, in the future), and I do suspect that this is actually a bug in sqlalchemy mapper initialization, but we should be able to come up with an interim workaround.

Finally, if this is a blocker for you, setting up an amqp (rabbitmq) server and configuring your galaxy instances to communicate using that is a workaround, though it's not trivial (and I am still going to fix this bug).

On Oct 8, 2014 10:45 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:

Hi all again

Seems I am not so fortunate that this would just go away. It appears to be happening sometimes at start-up time for one of the handler processes.
The first thing that appears to go wrong is this just after starting the job handler queue:

---
galaxy.jobs.handler INFO 2014-10-06 14:37:51,220 job handler queue started
galaxy.sample_tracking.external_service_types DEBUG 2014-10-06 14:37:51,246 Loaded external_service_type: Simple unknown sequencer 1.0.0
galaxy.sample_tracking.external_service_types DEBUG 2014-10-06 14:37:51,253 Loaded external_service_type: Applied Biosystems SOLiD 1.0.0
galaxy.queue_worker INFO 2014-10-06 14:37:51,254 Initalizing Galaxy Queue Worker on sqlalchemy+postgres://galaxy:xxx@158.119.147.86:5432/galaxyprod
galaxy.jobs DEBUG 2014-10-06 14:37:51,416 (78355) Working directory for job is: /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/database/job_working_directory/078/78355
galaxy.web.framework.base DEBUG 2014-10-06 14:37:51,454 Enabling 'data_admin' controller, class: DataAdmin
galaxy.jobs.handler ERROR 2014-10-06 14:37:51,464 failure running job 78355
Traceback (most recent call last):
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py, line 243, in __monitor_step
    job_state = self.__check_if_ready_to_run( job )
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py, line 333, in __check_if_ready_to_run
    state = self.__check_user_jobs( job, self.job_wrappers[job.id] )
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py, line 417, in __check_user_jobs
    if job.user:
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/attributes.py, line 168, in __get__
    return self.impl.get(instance_state(instance),dict_)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/attributes.py, line 453, in get
    value = self.callable_(state, passive)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py, line 508, in _load_for_state
    return self._emit_lazyload(session, state, ident_key)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py, line 552, in _emit_lazyload
    return q._load_on_ident(ident_key)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py, line 2512, in _load_on_ident
    return q.one()
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py, line 2184, in one
    ret = list(self)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py, line 2227, in __iter__
    return self._execute_and_instances(context)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py, line 2240, in _execute_and_instances
    close_with_result=True)
  File
[galaxy-dev] SQLalchemy InvalidRequestError
Dear all

Maybe one of you can shed some light on this error message that I see in the log file for one of my handler processes. I get about one of them per second. The effect is that most of the jobs remain in the waiting to run stage. The postgres database is running on a separate server and appears to be doing just fine.

Any help is greatly appreciated.

Thanks
Ulf

---
galaxy.jobs.handler ERROR 2014-10-07 10:32:24,676 Exception in monitor_step
Traceback (most recent call last):
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py, line 161, in __monitor
    self.__monitor_step()
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py, line 184, in __monitor_step
    hda_not_ready = self.sa_session.query(model.Job.id).enable_eagerloads(False) \
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/scoping.py, line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py, line 1088, in query
    return self._query_cls(entities, self, **kwargs)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py, line 108, in __init__
    self._set_entities(entities)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py, line 117, in _set_entities
    self._setup_aliasizers(self._entities)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py, line 132, in _setup_aliasizers
    _entity_info(entity)
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/util.py, line 578, in _entity_info
    mapperlib.configure_mappers()
  File /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py, line 2260, in configure_mappers
    raise e
InvalidRequestError: One or more mappers failed to initialize - can't proceed with initialization of other mappers. Original exception was: When initializing mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name (name 'Message' is not defined). If this is a class name, consider adding this relationship() to the class 'kombu.transport.sqlalchemy.Queue' class after both dependent classes have been defined.
---

**
The information contained in the EMail and any attachments is confidential and intended solely and for the attention and use of the named addressee(s). It may not be disclosed to any other person without the express authority of Public Health England, or the intended recipient, or both. If you are not the intended recipient, you must not disclose, copy, distribute or retain this message or any part of it. This footnote also confirms that this EMail has been swept for computer viruses by Symantec.Cloud, but please re-sweep any attachments before opening or saving. http://www.gov.uk/PHE
**
___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
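For triage, the root cause buried at the end of the repeated traceback can be pulled out of a handler log with a small script. The following is only a sketch: the pattern is built from the wording of the error quoted above, and the function name `find_unresolved_names` is hypothetical, not part of Galaxy or SQLAlchemy.

```python
import re

# Pattern derived from the error text quoted in this thread (an assumption
# that other occurrences in the log use the same wording).
ROOT_CAUSE = re.compile(
    r"When initializing mapper (?P<mapper>\S+), "
    r"expression '(?P<expr>[^']+)' failed to locate a name"
)


def find_unresolved_names(log_text):
    """Return sorted, de-duplicated (mapper, expression) pairs found in log_text."""
    pairs = {(m.group("mapper"), m.group("expr")) for m in ROOT_CAUSE.finditer(log_text)}
    return sorted(pairs)


# Sample text copied from the error message in the post above.
sample = (
    "InvalidRequestError: One or more mappers failed to initialize - can't proceed "
    "with initialization of other mappers. Original exception was: When initializing "
    "mapper Mapper|Queue|kombu_queue, expression 'Message' failed to locate a name "
    "(name 'Message' is not defined)."
)
print(find_unresolved_names(sample))
```

Running this over the whole log shows whether every repetition of the error points at the same mapper and name, or whether more than one relationship fails to resolve.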
Re: [galaxy-dev] SQLalchemy InvalidRequestError
Hi Dannon

I am running 6 handler and 6 web processes. I have the latest stable version of the code (from June 2nd 2014). My postgres database (version 8.4.9) is running on a different server on the same subnet. Both servers are CentOS (6.5 on the Galaxy server, 6.2 on the database server).

Jobs are supposed to be dispatched to a dedicated queue on a cluster running Univa Grid Engine. But I don't think the jobs get dispatched to the cluster because of a previous database communication problem. The reason that I have so many error messages might be that there were quite a number of jobs waiting to run.

Please let me know if you want to know anything more specific. I'd be happy to send any configuration files.

Thanks for your help.
Ulf

On 07/10/14 11:04, Dannon Baker wrote:

One per second? Can you tell me more about your configuration? This is an odd bug with multiple mapper initialization that I haven't been able to reproduce yet, so any information will help. Database configuration, number of processes, etc.

On Oct 7, 2014 11:46 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:

Dear all

Maybe one of you can shed some light on this error message that I see in the log file for one of my handler processes. I get about one of them per second. The effect is that most of the jobs remain in the waiting to run stage. The postgres database is running on a separate server and appears to be doing just fine.

Any help is greatly appreciated.
Re: [galaxy-dev] SQLalchemy InvalidRequestError
Update: The usual switching it off and on again (server reboot) has resolved the problem (for now), albeit in a rather unsatisfactory manner. If there are any insights into what caused this behaviour and how it can be avoided in the future I'd be more than happy to hear them.

Cheers
Ulf

On 07/10/14 11:04, Dannon Baker wrote:

One per second? Can you tell me more about your configuration? This is an odd bug with multiple mapper initialization that I haven't been able to reproduce yet, so any information will help. Database configuration, number of processes, etc.

On Oct 7, 2014 11:46 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:

Dear all

Maybe one of you can shed some light on this error message that I see in the log file for one of my handler processes. I get about one of them per second. The effect is that most of the jobs remain in the waiting to run stage. The postgres database is running on a separate server and appears to be doing just fine.

Any help is greatly appreciated.

Thanks
Ulf
[galaxy-dev] find bam index files
Dear all

For each bam file in my history I can download the associated bai (bam index) file. I assume these files are stored somewhere under /mount/galaxy/database/files/_metadata_files. Is that correct?

Is there an easy way to find the bam index file for a bam file, given only the internal file name of the bam (e.g. /mount/galaxy/database/files/089/dataset_89231.dat)?

I am asking because I would like to use the files_to_ftp tool to automatically download bams together with their associated indices.

Thanks
Ulf
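As a starting point only, the numeric dataset id can be recovered from an internal file name of this form. The sketch below is hedged: the function names are hypothetical, and the sharding rule (zero-pad the id to a multiple of three digits, drop the last three) is an assumption inferred from paths quoted on this list such as 089/dataset_89231.dat and 081/dataset_81002.dat. It does not locate the index file itself; how a dataset id maps to a file under _metadata_files is not shown here.

```python
import os
import re


def dataset_id_from_path(path):
    """Extract the numeric id from a file name of the form dataset_<id>.dat."""
    match = re.fullmatch(r"dataset_(\d+)\.dat", os.path.basename(path))
    if match is None:
        raise ValueError("not a dataset_<id>.dat path: %r" % path)
    return int(match.group(1))


def shard_dirs(dataset_id):
    """Guess the shard directory components for an id (assumed scheme, see above)."""
    digits = str(dataset_id)
    # Pad to a multiple of three digits, then drop the last three digits.
    padded = digits.zfill(-(-len(digits) // 3) * 3)[:-3]
    if not padded:
        return ["000"]  # ids below 1000 are assumed to share one directory
    return [padded[i:i + 3] for i in range(0, len(padded), 3)]


example = "/path/089/dataset_89231.dat"
print(dataset_id_from_path(example), shard_dirs(dataset_id_from_path(example)))
```

With the id in hand, the remaining step is a lookup of which metadata file belongs to that dataset, which has to come from Galaxy itself rather than from the file name.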
Re: [galaxy-dev] Providing BLAST db in a data library
Dear Nate, dear Peter

Again, sorry for the delay in replying. Yes I can. It looks like this:

[galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
[galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
blastdb.nhd blastdb.nhi blastdb.nhr blastdb.nin blastdb.nog blastdb.nsd blastdb.nsi blastdb.nsq

I think the simplest solution would be to put something in the primary file. Just a short string that gets the file size above 0.

I personally have followed your initial suggestion and made the dbs available globally via the .loc file.

Thanks again
Ulf

On 28/07/14 09:43, Peter Cock wrote:

On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:

Dear Nate, dear Peter

Sorry for the delay in replying. I can import both HTML and blastdb from a history to a data library. If I try to get the data out of the library into another history, I am successful for the html but not for the blastdb. The problem seems to be that the primary data file (the /path/dataset_12345.dat) is empty for the blastdb, while the html primary file has something in it.

OK. Can you tell where Galaxy thinks the library files are on disk, and check to see if the folder of BLAST database files is actually there?

When I try to import the blastdb (from library to history) there is a message along the lines of can't import empty file. I hypothesise (admittedly without having looked at a line of code) that there is a test for file size 0 somewhere that is either altogether unnecessary or, more likely, does not take into account that for composite datatypes it might be completely legitimate for the primary file to be empty.

This guess makes sense - but I've not yet tried to trace through the code either.

Or is my primary blastdb file not supposed to be empty in the first place? I can blast against it just fine.

The BLAST databases do not define/populate a primary file, so Galaxy seems to create a dummy empty file on its own. I have wondered about altering the BLAST database datatype definition to have a human readable text file as the primary file (i.e. the information currently saved as a text log file when creating a database).

Thanks a lot for your help
Ulf

You too - you've found an interesting bug...

Peter
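The hypothesis discussed above, that an emptiness check should also consider the sibling _files folder of a composite dataset, can be illustrated as follows. This is a sketch under stated assumptions, not Galaxy's actual import code: `extra_files_dir` and `has_content` are hypothetical names, and the only layout assumed is the one shown in this thread (dataset_N.dat next to dataset_N_files/).

```python
import os
import tempfile


def extra_files_dir(primary_path):
    """Map /path/dataset_N.dat to /path/dataset_N_files, the layout shown in this thread."""
    root, _ext = os.path.splitext(primary_path)
    return root + "_files"


def has_content(primary_path):
    """True if the primary file is non-empty or the sibling _files folder holds any file."""
    if os.path.isfile(primary_path) and os.path.getsize(primary_path) > 0:
        return True
    extra = extra_files_dir(primary_path)
    return os.path.isdir(extra) and any(
        os.path.isfile(os.path.join(extra, name)) for name in os.listdir(extra)
    )


# Throwaway layout that mimics an empty primary file plus a populated _files folder.
with tempfile.TemporaryDirectory() as tmp:
    primary = os.path.join(tmp, "dataset_12345.dat")
    open(primary, "w").close()            # empty primary file, as for the blastdb
    empty_only = has_content(primary)     # no _files folder exists yet
    os.mkdir(extra_files_dir(primary))
    with open(os.path.join(extra_files_dir(primary), "blastdb.nin"), "w") as fh:
        fh.write("x")
    with_extra = has_content(primary)     # now the _files folder has a file

print(empty_only, with_extra)
```

A check of this shape would treat the BLAST database from the listing above as non-empty even though its primary .dat file has size 0, which is the behaviour the thread is asking for.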
Re: [galaxy-dev] Providing BLAST db in a data library
Dear Nate, dear Peter

Sorry for the delay in replying. I can import both HTML and blastdb from a history to a data library. If I try to get the data out of the library into another history, I am successful for the html but not for the blastdb.

The problem seems to be that the primary data file (the /path/dataset_12345.dat) is empty for the blastdb, while the html primary file has something in it. When I try to import the blastdb (from library to history) there is a message along the lines of can't import empty file.

I hypothesise (admittedly without having looked at a line of code) that there is a test for file size 0 somewhere that is either altogether unnecessary or, more likely, does not take into account that for composite datatypes it might be completely legitimate for the primary file to be empty.

Or is my primary blastdb file not supposed to be empty in the first place? I can blast against it just fine.

Thanks a lot for your help
Ulf

On 24/07/14 15:02, Peter Cock wrote:

On Thu, Jul 24, 2014 at 2:50 PM, Nate Coraor n...@bx.psu.edu wrote:

On Jul 23, 2014, at 6:42 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

Interesting hypothesis - you may well be right. Galaxy guys - who is the expert to talk to on this and/or where in the code should we be looking?

Thanks,
Peter

I think there's a bit of a mixup here - Peter, I believe you were asking if other composite types with an html primary dataset could be imported from the history to library, but Ulf, your test was the other direction (library-history). I'd be interested in knowing the outcome of the history-library test as well.

Good catch - yes, that was what I was asking about. Ulf?

I am woefully ignorant about the blastdbn datatype. Is the primary file supposed to be html type but empty?

The BLAST databases are 'basic' composite datatypes, of which the most commonly used example is HTML (and some bits of the base class code seem to assume HTML). This means testing if something works with HTML is a good first step.

https://github.com/peterjc/galaxy_blast/tree/master/datatypes/blast_datatypes

Peter
[galaxy-dev] Providing BLAST db in a data library
Dear all

I have several smallish BLAST databases that I would like to provide in a data library. I create them in a history with the makeblastdb tool and then try to add them to the library.

I see that for each blast db there is an empty file created (like /path/dataset_12345.dat) and a folder with the same name (/path/dataset_12345_files/) that contains the actual db files (blastdb.n*).

In my library the blastdb shows up empty and I cannot import it back to another history. It does not seem to be aware of the _files folder, despite it being the right data type (blastdbn).

Any ideas what I am doing wrong?

Thanks a lot for your help
Ulf
Re: [galaxy-dev] Providing BLAST db in a data library
Dear Peter

Thanks for your reply. I can successfully import an HTML report (e.g. FastQC output) from a data library into a new history. But the .dat file for the HTML report is not empty like the one for the blastdb. This makes me think that I could do the same with a BLAST db, if only the import did not check for size 0.

Thanks
Ulf

On 23/07/14 10:56, Peter Cock wrote:

Hi Ulf,

I've never tried that. It could be a bug in Galaxy importing composite datatypes into a library, or something in the BLAST database definition which needs fixing. Does importing an HTML report (with child files like images) into a library work for you? (This is another composite datatype, so a useful comparison.)

Rather than using Data Libraries, we just list all the locally installed shared BLAST databases via the BLAST *.loc files instead. Note that using the *.loc files makes the databases available to all the Galaxy users, while with a Data Library you can control access to specific groups/roles.

Regards,
Peter
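To make the on-disk layout from this thread concrete, here is a minimal sketch that inspects a dataset's primary file and its sibling `_files` folder. The function names are hypothetical and this is not Galaxy's import code; the size-0 check only mirrors Ulf's guess about why the library entry appears empty.

```python
import os
import tempfile


def describe_composite_dataset(dat_path):
    """Return (size of primary file, sorted names in the sibling *_files folder).

    Assumes the layout described in the thread:
    dataset_12345.dat next to dataset_12345_files/.
    """
    root, _ext = os.path.splitext(dat_path)
    extra_dir = root + "_files"
    primary_size = os.path.getsize(dat_path)
    extra_files = sorted(os.listdir(extra_dir)) if os.path.isdir(extra_dir) else []
    return primary_size, extra_files


def looks_empty_by_size(dat_path):
    """Hypothetical size-only check: ignores the *_files folder entirely."""
    return os.path.getsize(dat_path) == 0


def demo():
    """Build a throwaway copy of the layout and inspect it."""
    with tempfile.TemporaryDirectory() as tmp:
        dat = os.path.join(tmp, "dataset_12345.dat")
        open(dat, "w").close()  # empty primary file, as observed
        extra = os.path.join(tmp, "dataset_12345_files")
        os.mkdir(extra)
        for name in ("blastdb.nhr", "blastdb.nin", "blastdb.nsq"):
            with open(os.path.join(extra, name), "w") as handle:
                handle.write("x")
        return describe_composite_dataset(dat), looks_empty_by_size(dat)
```

A check like `looks_empty_by_size` would report the BLAST db as empty even though the `_files` folder is populated, which is the behaviour Ulf suspects.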
[galaxy-dev] Rename output from a repeat
Hi all

We frequently use the syntax below to rename outputs of workflows that we run in batch. It is convenient to have sample names from fastqs carried over to sams, bams, vcfs, etc.

    #{input1 | basename}.bam

This does not seem to work for inputs that are in repeats, e.g. in the VelvetOptimiser. Does anybody know if there is a syntax to make this work, maybe

    #{repeatname[0].input1 | basename}.bam

?

Thanks a lot for your help
Ulf
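For readers unfamiliar with the rename syntax, the sketch below shows the effect being relied on, under the assumption that `basename` drops the directory part and the final extension of the input's name. That assumption is not confirmed in this thread, and the function names are made up for illustration.

```python
import os


def basename_label(input_name):
    """Drop any directory part and the last extension of an input name."""
    return os.path.splitext(os.path.basename(input_name))[0]


def renamed_output(input_name, suffix=".bam"):
    """Name an output after its input, e.g. sampleA_R1.fastq -> sampleA_R1.bam."""
    return basename_label(input_name) + suffix
```

Note that only one extension is removed under this assumption, so a name like `sampleB.fastq.gz` would become `sampleB.fastq.bam`.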
[galaxy-dev] Limits for enumerate for multiple input files
Dear all

I am using this control to allow the user to input multiple files:

    <param name="input_vcfs" type="data" multiple="true" format="vcf" label="Input VCF file(s)" />

and I am using this for loop in the Cheetah code to access the control:

    <command interpreter="bash">
    script.sh ...
    #for $i, $input_vcf in enumerate( $input_vcfs ):
    ${input_vcf},
    #end for
    </command>

It appears that when a user selects many files (25 in this case) the bash command in the command tag never gets executed. Therefore the job is never queued. The history item shows 'Waiting to run' indefinitely. Calling script.sh manually with 25 input files works fine.

Any hint as to how to debug this would be greatly appreciated.

Thanks a lot
Ulf
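As an aside, the loop above emits one `path,` fragment per selected file, which leaves a trailing comma and whitespace between entries (this matches the command line quoted in the follow-up message). The sketch below mimics that rendering in plain Python and contrasts it with a simple join; both function names are illustrative, and this is not how Galaxy evaluates the template.

```python
def render_like_loop(paths):
    """Mimic the loop body '${input_vcf},' emitted once per input."""
    return " ".join(path + "," for path in paths)


def render_joined(paths):
    """Alternative: a single comma-separated argument with no trailing comma."""
    return ",".join(paths)
```

Whether the trailing comma matters depends on how script.sh splits its last argument; nothing in the thread suggests it is the cause of the job not being queued.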
Re: [galaxy-dev] Limits for enumerate for multiple input files
Hi Peter

I removed the unnecessary code. If I run the tool with just a couple of inputs I see entries in the log files, either from galaxy.jobs.runners.drmaa or from galaxy.jobs.runners.local, showing that the job is being dispatched as normal. Unfortunately there is no sign of the job in the log files when using more input files.

The command line that is supposed to be run is:

    bash home/galaxy/galaxy-dist/tools/vcf_processing/vcf_to_fasta.sh /galaxy/database/files/042/dataset_42275.dat 40 10 50 0 40 0.9 20 /galaxy/database/files/041/dataset_41720.dat, /galaxy/database/files/041/dataset_41980.dat,

with the first .dat file being the output and the ones at the end being a comma-separated list of the input files. On the command line this command works with much longer input file lists.

Any ideas? Or is there a better practice for passing a large number of input files to a bash script?

Thanks
Ulf

On 22/04/14 15:26, Peter Cock wrote:

> #for $i, $input_vcf in enumerate( $input_vcfs ):
> ${input_vcf},
> #end for

The $i and enumerate seem unnecessary here.

> Any hint as to how to debug this would be greatly appreciated.

Can you see anything in the log about the job, and in particular the command line it would attempt to run?

Peter
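On the question of passing many input files to a script: one common pattern, offered here only as a sketch and not as something proposed in the thread, is to write the paths to a small manifest file and pass that single file name instead. The helper names below are hypothetical.

```python
import os
import tempfile


def write_manifest(paths, directory=None):
    """Write one input path per line and return the manifest's file name."""
    fd, manifest = tempfile.mkstemp(suffix=".txt", dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return manifest


def read_manifest(manifest):
    """Read the paths back, skipping blank lines."""
    with open(manifest) as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]
```

The receiving script then reads the manifest line by line, so the command line stays the same length regardless of how many files are selected. This would not by itself explain why the job stays in 'Waiting to run'.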
Re: [galaxy-dev] sort list of ftp uploaded files
Dear John, dear Curtis, and all

All that is required to fix my current issue is that the files show up in the 'Upload File' tool form in alphabetical order. I assume that at some point an 'ls' command is done on the user's FTP home folder. The result of that could possibly be stored in a hash of some sort, which could explain why the order is randomised when the files come back out to go on the form. But I might be wrong.

Of course a better but more involved solution would be if the 'Files uploaded via FTP' table were sortable, e.g. by clicking on the table headers. Any idea where I might find the code that populates this table?

Thanks a lot for your help.

Cheers
Ulf

On 17/12/13 22:31, Curtis Hendrickson (Campus) wrote:

John,

I can't speak for Ulf, but a more general solution would be to allow sorting by a standard Unix sort set of keys. That would allow things like "sort by the sample ID after the 3rd dash, then by the read direction":

    -t - -k1,1n -k2,2r    # http://unixhelp.ed.ac.uk/CGI/man-cgi?sort

Regards,
Curtis

PS - granted, a simple filename sort is a good place to start and a lot better than nothing. Date sort is often useful, too.

-----Original Message-----
From: galaxy-dev-boun...@lists.bx.psu.edu [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of John Chilton
Sent: Tuesday, December 17, 2013 4:03 PM
To: Ulf Schaefer
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] sort list of ftp uploaded files

If Galaxy just sorted these files alphabetically before display/import would that fix your problem? Or do your users need to be able to modify the order?

-John
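The two sort orders discussed above can be sketched in a few lines of Python. This is a simplified stand-in for `sort -t - -k1,1n -k2,2r`, not Galaxy code: it treats a non-numeric first field as 0 and omits the last-resort whole-line comparison that sort(1) applies to ties. The helper names are made up for illustration.

```python
def _number(text):
    """Numeric value of a field; non-numeric fields count as 0 (simplification)."""
    try:
        return float(text)
    except ValueError:
        return 0.0


def _field(name, index, sep="-"):
    """Return the index-th separator-delimited field, or '' if it is missing."""
    parts = name.split(sep)
    return parts[index] if index < len(parts) else ""


def sort_by_keys(names):
    """Field 1 numeric ascending, ties broken by field 2 in reverse order."""
    # Python's sort is stable, so sort by the secondary key first.
    by_second_desc = sorted(names, key=lambda n: _field(n, 1), reverse=True)
    return sorted(by_second_desc, key=lambda n: _number(_field(n, 0)))
```

With the hypothetical names `10-R2.fastq`, `2-R1.fastq`, `10-R1.fastq` and `2-R2.fastq`, plain `sorted()` puts the `10-` files first because it compares character by character, while `sort_by_keys` orders sample 2 before sample 10.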
Re: [galaxy-dev] sort list of ftp uploaded files
Dear John

Thanks very much. This is the perfect (minimal) solution that I need right now.

I am looking forward to your work on pairs of data. It sounds like it will be exceptionally useful for us.

Thanks again and best wishes
Ulf

On 18/12/13 14:29, John Chilton wrote:

Couple things...

I just pushed a commit to galaxy-central to sort those files by default - that is a clear improvement: https://bitbucket.org/galaxy/galaxy-central/commits/ce186eb5fefcb7ff332c73bd0869b9e93d0b

Clearly that is not enough though - it would be nice to have more advanced sorting options, so I have created a Trello card - https://trello.com/c/0hgUrW4H

If you really wanted to dig into this there are many files that may be relevant:

    templates/embed_base.mako
    lib/galaxy/web/form_builder.py
    lib/galaxy/tools/parameters/basic.py
    lib/galaxy/web/framework/helpers/grids.py

Hopefully this helps.

Also, just as a heads up, coming from a different direction: Martin and I are actively working on building abstractions for pairs of data, and that will have to include a nice UI for building these pairs. Our plan is to start with library imports, but we will try to architect the creation widget to be able to target history items as well. This would largely negate the need to import them in any particular order. This will likely not be in the next release, but hopefully the following one (I can build pairs via the API now, but I want to be able to do useful stuff with them before committing :) ).

-John
[galaxy-dev] sort list of ftp uploaded files
Dear all

Is there a way to sort the list of files that a user has uploaded via FTP? Currently they appear on the form of the tool 'Upload File' in what seems to be random order. Unfortunately there is not much documentation on the <param type="ftpfile" /> tag.

If the user just selects all files, they end up in his/her history in the same random order, which causes a slight inconvenience when using these files as paired multiple input for a workflow. The obvious work-around is to put the files into the history one pair at a time, but this becomes a bit onerous for a larger number of files.

Thanks a lot for your help.

Cheers
Ulf
[galaxy-dev] workflow batch execution dynamic parameters
Dear all

Related to my question yesterday I have another one: If I run a workflow in batch I can select pairs of files (the old fwd and rev reads fastq files) and the workflow is run as many times as the number of pairs I select. Unfortunately all other parameters appear to be static, which is not ideal in my case.

Specifically, I am using bwa to map a large-ish number of (paired) samples to the same reference genome, while simultaneously setting read group information in the resulting sam files. So it's suboptimal to run this with a fixed SM value.

Is there any way of dynamically setting the parameters? The set-at-runtime option also allows only a single parameter value, not a list. Ideally the set-at-runtime option would somehow allow me to input one parameter per file pair.

Thanks very much in advance for your help.

Cheers
Ulf

On 01/10/13 14:09, Joachim Jacob | VIB | wrote:

Sorry, my answer doesn't fit your question. :-)

J

Joachim Jacob
Contact details: http://www.bits.vib.be/index.php/about/80-team

On 10/01/2013 03:08 PM, Joachim Jacob | VIB | wrote:

Hi Ulf,

What I do:

1. make a history, doing the steps you want to do on one input file
2. create a workflow of that history
3. assemble all input files in one history
4. run the workflow and select the multiple input files to run the workflow on.
5. Optionally: send the results to a new history; for every input file you will get a new history, properly named.

Hope this helps,
Joachim

Joachim Jacob
Contact details: http://www.bits.vib.be/index.php/about/80-team
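To illustrate what "one parameter per file pair" could look like, here is a minimal sketch that derives an SM value for each fwd/rev pair from the file names. The `_R1`/`_R2` naming convention and all function names are assumptions made for this example; this is not an existing Galaxy feature.

```python
import os


def sample_name(fastq_name, read_suffixes=("_R1", "_R2")):
    """Derive a sample name by stripping known extensions and a read suffix."""
    base = os.path.basename(fastq_name)
    for ext in (".gz", ".fastq", ".fq"):
        if base.endswith(ext):
            base = base[: -len(ext)]
    for suffix in read_suffixes:
        if base.endswith(suffix):
            base = base[: -len(suffix)]
            break
    return base


def plan_batch(pairs):
    """Return one (fwd, rev, read_group) entry per pair, with SM set per sample."""
    plan = []
    for fwd, rev in pairs:
        name = sample_name(fwd)
        if name != sample_name(rev):
            raise ValueError("pair does not share a sample name: %s / %s" % (fwd, rev))
        plan.append((fwd, rev, {"ID": name, "SM": name}))
    return plan
```

Each entry in the returned plan carries its own read group values, which is the per-pair behaviour the batch runner would need to supply.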
[galaxy-dev] multiple input files
Dear all

We frequently find ourselves in situations where a tool needs to be run with a lot of input files. For example, running the GATK UnifiedGenotyper with easily dozens of bam files. Using the repeat in this case requires quite a bit of clicking. Is there a more convenient way of doing this? Maybe similar to the multi-file-select that is possible for workflow inputs?

I saw some older discussions on this or similar issues, but I am a bit lost as to what the current official stable proposed solution for this is.

Thanks for your help
Ulf
Re: [galaxy-dev] multiple input files
Dear Peter, dear John, dear all

Thanks very much. This is exactly what I need and the proposed code changes work perfectly as they are. At a short glance even the validators work as intended.

Now for the optional special bonus: Is there a way to define the size of the boxes? The ones I see have 4 lines. Can I make them resizable somehow or define a larger size? (size=10 or resize=true does not seem to do the trick.) Not a very pressing issue though.

@Joachim: I am using the same approach elsewhere, but the difference is that it runs a tool multiple times, instead of running it once on multiple inputs. But thanks anyway.

Cheers
Ulf

On 01/10/13 14:08, John Chilton wrote:

Thanks Peter,

Yeah, looking over unified_genotyper.xml, you would probably just want to replace:

    #for $i, $input_bam in enumerate( $reference_source.input_bams ):
        -d -I ${input_bam.input_bam} ${input_bam.input_bam.ext} gatk_input_${i}
        #if str( $input_bam.input_bam.metadata.bam_index ) != None:
            -d ${input_bam.input_bam.metadata.bam_index} bam_index gatk_input_${i} ##hardcode galaxy ext type as bam_index
        #end if
    #end for

with

    #for $i, $input_bam in enumerate( $reference_source.input_bams ):
        -d -I ${input_bam} ${input_bam.ext} gatk_input_${i}
        #if str( $input_bam.metadata.bam_index ) != None:
            -d ${input_bam.metadata.bam_index} bam_index gatk_input_${i} ##hardcode galaxy ext type as bam_index
        #end if
    #end for

And:

    <repeat name="input_bams" title="BAM file" min="1" help="-I,--input_file &amp;lt;input_file&amp;gt;">
        <param name="input_bam" type="data" format="bam" label="BAM file">
            <validator type="unspecified_build" />
            <validator type="dataset_metadata_in_data_table" table_name="gatk_picard_indexes" metadata_name="dbkey" metadata_column="dbkey" message="Sequences are not currently available for the specified build." /> <!-- fixme!!! this needs to be a select -->
        </param>
    </repeat>

with

    <param name="input_bams" type="data" multiple="true" format="bam" label="BAM file">
        <validator type="unspecified_build" />
        <validator type="dataset_metadata_in_data_table" table_name="gatk_picard_indexes" metadata_name="dbkey" metadata_column="dbkey" message="Sequences are not currently available for the specified build." /> <!-- fixme!!! this needs to be a select -->
    </param>

And:

    <repeat name="input_bams" title="BAM file" min="1" help="-I,--input_file &amp;lt;input_file&amp;gt;">
        <param name="input_bam" type="data" format="bam" label="BAM file">
        </param>
    </repeat>

with

    <param name="input_bams" multiple="true" type="data" format="bam" label="BAM file">
    </param>

I have not tested these specific changes, so your mileage may vary. I have no clue if those validators in that second block are going to work with multiple="true". If you test it out and there is some problem, please let me know and I can try to fix it.

I don't know if the Galaxy team wants to start replacing these blocks in its tools - this change would break existing workflows built on the unified genotyper, and I am not sure the Galaxy team has any interest in using these kinds of data blocks going forward. They are working out great for us on Galaxy-P though.

-John

On Tue, Oct 1, 2013 at 7:57 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

Use <param type="data" multiple="true" ...> instead of using <repeat ...><param type="data" ...></repeat>.

I asked John Chilton about this recently via Twitter, and it is now on the wiki, http://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax

Simple example here: https://bitbucket.org/galaxyp/galaxyp-toolshed-iquant/src/tip/iquant.xml?at=default

Peter
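The change from `${input_bam.input_bam}` to `${input_bam}` follows from the shape of the value the template receives. The sketch below models the two shapes with plain Python data: a repeat yields a list of groups that each contain an `input_bam`, while `multiple="true"` yields a flat list of datasets. The dictionaries and strings are stand-ins for illustration, not Galaxy's actual parameter objects.

```python
def args_from_repeat(groups):
    """Repeat style: each element is a group holding one 'input_bam' entry."""
    return ["-I %s gatk_input_%d" % (group["input_bam"], i)
            for i, group in enumerate(groups)]


def args_from_multiple(datasets):
    """multiple="true" style: each element is the dataset itself."""
    return ["-I %s gatk_input_%d" % (dataset, i)
            for i, dataset in enumerate(datasets)]


# Stand-in values for the two parameter shapes.
repeat_style = [{"input_bam": "a.bam"}, {"input_bam": "b.bam"}]
multiple_style = ["a.bam", "b.bam"]
```

Both helpers produce the same arguments; only the level of nesting in the loop variable differs, which is why the extra `.input_bam` lookup disappears in John's replacement.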
Re: [galaxy-dev] Visualisation of VCF files in trackster
Dear Jeremy

Thank you for your reply and sorry for not being clear. In short, I solved the problem. Below is some info, in case this is useful for someone else.

Thanks for your help

The situation was:

On Main:
    Visualisation of the SAM/BAM file - OK
    Visualisation of the VCF file - OK

On my local install:
    Visualisation of the SAM/BAM file - OK
    Visualisation of the VCF file - FAIL

The reason is that this command fails:

    grep -v '^#' /data/database/files/000/dataset_596.dat | sort -k1,1 | bedtools genomecov -bg -split -i stdin -g /data/database/files/000/dataset_598.dat > temp.bg ; bedGraphToBigWig temp.bg /data/database/files/000/dataset_598.dat /data/database/files/000/dataset_609.dat

with

    Input error: found interval with block-counts not matching starts/sizes

where dataset_596.dat is my vcf and /data/database/files/000/dataset_598.dat is my genome file.

The error is produced by the bedtools genomecov part of the command, which appears to have some sort of problem with the vcf input in combination with the -split option. The problem disappears with the installation of the latest version of bedtools (v2.17.0), but if you are using the version that you get from yum (v2.15.0) you run into this error.

Ulf
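A small check like the one below could flag an old bedtools before Trackster hits the error. It is a sketch only: the `bedtools v2.17.0` string format is an assumption about what the tool prints for its version, and the 2.17.0 threshold simply reflects the two versions mentioned in this thread (releases in between were not tested there).

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from a string such as 'bedtools v2.17.0'."""
    match = re.search(r"v?(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) if part else 0 for part in match.groups())


def new_enough(text, minimum=(2, 17, 0)):
    """True if the reported version is at least the assumed working version."""
    return parse_version(text) >= minimum
```

The version string itself would come from running the installed bedtools; that step is left out here so the example stays self-contained.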
[galaxy-dev] phylib file type
Hello all

Has anyone successfully integrated the phylib file type into Galaxy in their local install? It's a standard file type for multiple sequence alignment and I could not possibly be the only one who wants to support it in a local install.

I saw the instructions on the wiki for adding new file types. I would guess that, in addition to the changes in datatypes_conf.xml, I would have to derive a new class in galaxy.datatypes.sequence from the existing Alignment class.

Is there a commit where this is already included or will I actually have to code it myself? How can I find out?

Thanks a lot
Ulf

PS.: hg tip

    changeset:   10201:ebe87051fadf
    tag:         tip
    parent:      10199:8bf64d933704
    user:        Dannon Baker dannonba...@me.com
    date:        Tue Jul 02 10:48:31 2013 -0400
    summary:     Fix two more downgrade invocations to accept the migrate_engine parameter
Re: [galaxy-dev] phylib file type
Dear Peter

Thank you for your email. Yes, this is the file type I mean. Sorry for the typo.

Is there any chance that a clean class with appropriate sniff functions etc. already exists somewhere? Currently my own attempt looks a little bit like a hack.

Thanks a lot for your help
Ulf

On 11/09/13 10:16, Peter Cock wrote:

On Wed, Sep 11, 2013 at 9:46 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:

Hello all Has anyone successfully integrated the phylib file type into Galaxy in their local install? It's a standard file type for multiple sequence alignment and I could not possibly be the only one who want to support it in a local install.

Do you mean phylip with a p at the end? If so try this:

http://toolshed.g2.bx.psu.edu/view/devteam/emboss_datatypes
http://emboss.sourceforge.net/docs/themes/SequenceFormats.html

Note however there are both strict and relaxed interpretations of the PHYLIP standard (the original version imposed taxon name limits), and also both interlaced/interleaved and sequential forms. The EMBOSS repository has phylipnon for non-interleaved (i.e. sequential) PHYLIP format, and phylip for either.

Peter
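For readers wondering what a sniff check for this format might look like, here is a minimal sketch in plain Python. It assumes only the common PHYLIP layout: a header line with two integers (number of taxa, alignment length) followed by at least one line per taxon, which holds for both the interleaved and the sequential form. The function name looks_like_phylip is hypothetical and not part of Galaxy or the EMBOSS datatypes; it does not enforce the strict taxon name limits mentioned above.

```python
def looks_like_phylip(lines):
    """Heuristic: return True if the lines resemble a PHYLIP alignment.

    Assumption: the first non-blank line holds two positive integers
    (taxa count and alignment length). Name-length rules are not checked.
    """
    # Ignore blank lines, which separate blocks in interleaved files.
    rows = [line.strip() for line in lines if line.strip()]
    if not rows:
        return False

    header = rows[0].split()
    if len(header) != 2:
        return False
    try:
        n_taxa, n_sites = int(header[0]), int(header[1])
    except ValueError:
        return False
    if n_taxa < 1 or n_sites < 1:
        return False

    # Interleaved and sequential files both carry at least one line per taxon.
    return len(rows) - 1 >= n_taxa


if __name__ == "__main__":
    sample = [" 2 8\n", "taxon_a   ACGTACGT\n", "taxon_b   ACGTTCGT\n"]
    print(looks_like_phylip(sample))
```

A check like this is deliberately loose, so it would need tightening (for example, verifying sequence lengths against the header) before being relied on to distinguish PHYLIP from other whitespace-delimited formats.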
[galaxy-dev] Visualisation of VCF files in trackster
Dear all

My attempts to visualise a VCF file in Trackster fail. I can visualise the corresponding SAM/BAM files fine. I can visualise the same file on Galaxy main too, but not in my local install. I receive the following error message:

    Trackster Error
    Input error: found interval with block-counts not matching starts/sizes on line.
    sort: write failed: standard output: Broken pipe
    sort: write error
    needLargeMem: trying to allocate 0 bytes (limit: 1000)

The VCF file is in format version 4.1. I am creating it from the sorted BAM file with samtools mpileup in the most basic way.

What am I missing?

Thanks a lot for your help
Ulf
[galaxy-dev] disable PBKDF2, revert to SHA1
Hello all

Several recent posts (e.g. http://dev.list.galaxyproject.org/support-pbkdf2-in-proftpd-1-3-5rc3-td4660836.html) indicate that, for Galaxy to work reliably with ProFTPD, it is recommended to disable the new PBKDF2 hashing of passwords and revert to SHA1.

Unfortunately, I have not been able to figure out how to do this. Could somebody point me in the right direction, please?

Thanks a lot
Ulf
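As background to the question, the following sketch uses only Python's standard hashlib to show how the two schemes differ. It is an illustration, not Galaxy's actual code: the function names, the SHA-256 digest, the 10000 iterations and the 12-byte salt are all assumptions chosen for the example. The point is that a plain SHA1 digest can be recomputed from the password alone, whereas a PBKDF2 value also depends on a salt and an iteration count, so a server that only knows how to compare SHA1 digests cannot verify it.

```python
import hashlib
import os


def sha1_hex(password):
    """Unsalted SHA1 hex digest: depends on the password only."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def pbkdf2_hex(password, salt, iterations=10000):
    """PBKDF2-HMAC hex digest: depends on password, salt and iteration count.

    The digest name and iteration count here are illustrative assumptions.
    """
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return derived.hex()


if __name__ == "__main__":
    salt = os.urandom(12)  # random per-password salt (length is an assumption)
    print("SHA1:  ", sha1_hex("secret"))
    print("PBKDF2:", pbkdf2_hex("secret", salt))
```

Running it twice gives the same SHA1 line but a different PBKDF2 line each time, because the salt changes.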