Re: [galaxy-dev] SQLalchemy InvalidRequestError

2014-10-08 Thread Ulf Schaefer
/properties.py, line 998, in _process_dependent_arguments
    self.target = self.mapper.mapped_table
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/util/langhelpers.py", line 494, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/properties.py", line 891, in mapper
    mapper_ = mapper.class_mapper(self.argument(),
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/ext/declarative.py", line 1428, in return_cls
    (prop.parent, arg, n.args[0], cls)
InvalidRequestError: When initializing mapper Mapper|Queue|kombu_queue,
expression 'Message' failed to locate a name (name 'Message' is not
defined). If this is a class name, consider adding this relationship()
to the <class 'kombu.transport.sqlalchemy.Queue'> class after both
dependent classes have been defined.

---

After that it starts throwing the exception in monitor_step that I 
previously posted. Has anyone seen a potentially related issue? Would an 
update to the latest galaxy code help? I see there are newer versions of 
SQLAlchemy available. Are they part of a newer code base?

Thanks a lot for your help
Ulf
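
The error text above says that the name 'Message' could not be located when
the Queue mapper was initialized, and suggests declaring the relationship
only after both classes exist. The following is a toy, standard-library-only
model of that ordering problem. It is not SQLAlchemy or kombu code;
ToyRegistry and its methods are hypothetical names invented for illustration,
and whether the production failure is caused by this kind of ordering is an
assumption that the thread does not confirm.

```python
# Toy model only: NOT SQLAlchemy code. It mimics a registry in which one
# class refers to another by name, and the name is resolved later.
class ToyRegistry:
    def __init__(self):
        self.classes = {}   # class name -> class object
        self.pending = []   # (owner name, target name) declared by string

    def register(self, cls, relationships=()):
        """Record a class and any string references it makes to other classes."""
        self.classes[cls.__name__] = cls
        for target_name in relationships:
            self.pending.append((cls.__name__, target_name))

    def configure(self):
        """Resolve every string reference; fail if a target is not registered yet."""
        for owner, target_name in self.pending:
            if target_name not in self.classes:
                raise LookupError(
                    "When initializing %s, expression %r failed to locate a name"
                    % (owner, target_name)
                )
        resolved = [(owner, self.classes[name]) for owner, name in self.pending]
        self.pending = []
        return resolved


class Queue:
    pass


class Message:
    pass


registry = ToyRegistry()
registry.register(Queue, relationships=["Message"])

# Configuring now fails: Queue refers to 'Message', which is not registered yet.
try:
    registry.configure()
    early_error = None
except LookupError as exc:
    early_error = str(exc)

# Once both classes are registered, the same call succeeds.
registry.register(Message)
resolved = registry.configure()
```

In this model the failure depends only on when configure() runs relative to
the second register() call, which would be consistent with an error that
shows up in only some handler processes and only some of the time.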

On 07/10/14 12:20, Ulf Schaefer wrote:
 Update:

 The usual switching it off and on again (server reboot) has resolved the
 problem (for now), albeit in a rather unsatisfactory manner.

 If there are any insights what caused this behaviour and how it can be
 avoided in the future I'd be more than happy to hear them.

 Cheers
 Ulf

 On 07/10/14 11:04, Dannon Baker wrote:
 One per second?  Can you tell me more about your configuration?   This is
 an odd bug with multiple mapper initialization that I haven't been able to
 reproduce yet, so any information will help.  Database configuration,
 number of processes, etc.

Re: [galaxy-dev] SQLalchemy InvalidRequestError

2014-10-08 Thread Ulf Schaefer
Hi Dannon

Yes. The database is running on a different server from the server 
running Galaxy. They are both VMs running CentOS (6.5 on the Galaxy 
server, 6.2 on the database server). The postgres version is 8.4.9 and 
the database size is 712,161,040. I suspect that is not very large 
compared to some others. There are a number of other databases running 
on the same server, the one most frequently used is for our test Galaxy 
server which runs on yet a different VM. This one is much smaller 
(25,319,184). Both servers are on the same subnet. The problem is with 
our production Galaxy (of course).

Are there any instructions available on how to set up RabbitMQ for my 
Galaxy?

Thanks for looking into this.
Ulf

On 08/10/14 11:26, Dannon Baker wrote:
 Hi again Ulf,

 Thanks for the info. A few questions to help me track this down:

 Does the postgres database reside on a remote box from galaxy?  And is it
 very large?

 Running the latest galaxy may not change anything related to this
 particular issue, but you could always try it.

 Sqlalchemy is fixed at the latest version we can currently support without
 reworking how migration scripts function (which we will do, moving to
 Alembic, in the future), and I do suspect that this is actually a bug in
 sqlalchemy mapper initialization, but we should be able to come up with an
 interim work around.

 Finally, if this is a blocker for you: while it's not trivial (and I am
 still going to fix this bug), setting up an AMQP (RabbitMQ) server and
 configuring your Galaxy instances to communicate using that is a workaround.
 On Oct 8, 2014 10:45 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:

 Hi all again

 Seems I am not so fortunate that this would just go away.

 It appears to be happening sometimes at start-up time for one of the
 handler processes. The first thing that appears to go wrong is this just
 after starting the job handler queue:

 ---

 galaxy.jobs.handler INFO 2014-10-06 14:37:51,220 job handler queue started
 galaxy.sample_tracking.external_service_types DEBUG 2014-10-06 14:37:51,246 Loaded external_service_type: Simple unknown sequencer 1.0.0
 galaxy.sample_tracking.external_service_types DEBUG 2014-10-06 14:37:51,253 Loaded external_service_type: Applied Biosystems SOLiD 1.0.0
 galaxy.queue_worker INFO 2014-10-06 14:37:51,254 Initalizing Galaxy Queue Worker on sqlalchemy+postgres://galaxy:xxx@158.119.147.86:5432/galaxyprod
 galaxy.jobs DEBUG 2014-10-06 14:37:51,416 (78355) Working directory for job is: /phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/database/job_working_directory/078/78355
 galaxy.web.framework.base DEBUG 2014-10-06 14:37:51,454 Enabling 'data_admin' controller, class: DataAdmin
 galaxy.jobs.handler ERROR 2014-10-06 14:37:51,464 failure running job 78355
 Traceback (most recent call last):
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 243, in __monitor_step
     job_state = self.__check_if_ready_to_run( job )
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 333, in __check_if_ready_to_run
     state = self.__check_user_jobs( job, self.job_wrappers[job.id] )
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 417, in __check_user_jobs
     if job.user:
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/attributes.py", line 168, in __get__
     return self.impl.get(instance_state(instance),dict_)
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/attributes.py", line 453, in get
     value = self.callable_(state, passive)
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py", line 508, in _load_for_state
     return self._emit_lazyload(session, state, ident_key)
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/strategies.py", line 552, in _emit_lazyload
     return q._load_on_ident(ident_key)
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2512, in _load_on_ident
     return q.one()
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2184, in one
     ret = list(self)
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2227, in __iter__
     return self._execute_and_instances(context)
   File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 2240, in _execute_and_instances
     close_with_result=True)
   File
[galaxy-dev] SQLalchemy InvalidRequestError

2014-10-07 Thread Ulf Schaefer
Dear all

Maybe one of you can shed some light on this error message that I see in 
the log file for one of my handler processes. I get about one of them 
per second. The effect is that most of the jobs remain in the "waiting 
to run" stage.

The postgres database is running on a separate server and appears to be 
doing just fine.

Any help is greatly appreciated.

Thanks
Ulf

---

galaxy.jobs.handler ERROR 2014-10-07 10:32:24,676 Exception in monitor_step
Traceback (most recent call last):
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 161, in __monitor
    self.__monitor_step()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/lib/galaxy/jobs/handler.py", line 184, in __monitor_step
    hda_not_ready = self.sa_session.query(model.Job.id).enable_eagerloads(False) \
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py", line 1088, in query
    return self._query_cls(entities, self, **kwargs)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 108, in __init__
    self._set_entities(entities)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 117, in _set_entities
    self._setup_aliasizers(self._entities)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/query.py", line 132, in _setup_aliasizers
    _entity_info(entity)
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/util.py", line 578, in _entity_info
    mapperlib.configure_mappers()
  File "/phengs/hpc_storage/home/galaxy_hpc/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/mapper.py", line 2260, in configure_mappers
    raise e
InvalidRequestError: One or more mappers failed to initialize - can't
proceed with initialization of other mappers.  Original exception was:
When initializing mapper Mapper|Queue|kombu_queue, expression 'Message'
failed to locate a name (name 'Message' is not defined). If this is a
class name, consider adding this relationship() to the
<class 'kombu.transport.sqlalchemy.Queue'> class after both dependent
classes have been defined.

---
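
To put a number on "about one of them per second", the handler log can be
tallied by timestamp. The following is a minimal sketch that assumes every
log record starts with "logger LEVEL YYYY-MM-DD HH:MM:SS,mmm message", as in
the excerpt above. The function name errors_per_second is hypothetical, and
the second ERROR line in the sample is made up for the demonstration.

```python
import re
from collections import Counter

# Matches handler log records in the format shown in this thread, e.g.
# "galaxy.jobs.handler ERROR 2014-10-07 10:32:24,676 Exception in monitor_step"
LOG_LINE = re.compile(
    r"^(?P<logger>\S+) (?P<level>[A-Z]+) "
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} (?P<message>.*)$"
)


def errors_per_second(lines, needle="Exception in monitor_step"):
    """Count ERROR records containing `needle`, keyed by timestamp (1 s resolution)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR" and needle in match.group("message"):
            counts[match.group("stamp")] += 1
    return counts


sample = [
    "galaxy.jobs.handler ERROR 2014-10-07 10:32:24,676 Exception in monitor_step",
    "Traceback (most recent call last):",
    # Invented line, one second later, to illustrate the tally:
    "galaxy.jobs.handler ERROR 2014-10-07 10:32:25,681 Exception in monitor_step",
    "galaxy.jobs.handler INFO 2014-10-06 14:37:51,220 job handler queue started",
]
counts = errors_per_second(sample)
```

Traceback lines and records at other log levels are ignored, so each key in
counts is a second during which at least one monitor_step exception was logged.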

**
The information contained in the EMail and any attachments is confidential and 
intended solely and for the attention and use of the named addressee(s). It may 
not be disclosed to any other person without the express authority of Public 
Health England, or the intended recipient, or both. If you are not the intended 
recipient, you must not disclose, copy, distribute or retain this message or 
any part of it. This footnote also confirms that this EMail has been swept for 
computer viruses by Symantec.Cloud, but please re-sweep any attachments before 
opening or saving. http://www.gov.uk/PHE
**

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] SQLalchemy InvalidRequestError

2014-10-07 Thread Ulf Schaefer
Hi Dannon

I am running 6 handler and 6 web processes. I have the latest stable 
version of the code (from June 2nd 2014). My postgres database (version 
8.4.9) is running on a different server on the same subnet. Both servers 
are CentOS (6.5 on the Galaxy server, 6.2 on the database server). Jobs 
are supposed to be dispatched to a dedicated queue on a cluster running 
Univa Grid Engine. But I don't think the jobs get dispatched to the 
cluster because of a previous database communication problem.

The reason that I have so many error messages might be that there were 
quite a number of jobs waiting to run.

Please let me know if you want to know anything more specific. I'd be 
happy to send any configuration files.

Thanks for your help.
Ulf

On 07/10/14 11:04, Dannon Baker wrote:
 One per second?  Can you tell me more about your configuration?   This is
 an odd bug with multiple mapper initialization that I haven't been able to
 reproduce yet, so any information will help.  Database configuration,
 number of processes, etc.

Re: [galaxy-dev] SQLalchemy InvalidRequestError

2014-10-07 Thread Ulf Schaefer
Update:

The usual switching it off and on again (server reboot) has resolved the 
problem (for now), albeit in a rather unsatisfactory manner.

If there are any insights what caused this behaviour and how it can be 
avoided in the future I'd be more than happy to hear them.

Cheers
Ulf

On 07/10/14 11:04, Dannon Baker wrote:
 One per second?  Can you tell me more about your configuration?   This is
 an odd bug with multiple mapper initialization that I haven't been able to
 reproduce yet, so any information will help.  Database configuration,
 number of processes, etc.

[galaxy-dev] find bam index files

2014-09-15 Thread Ulf Schaefer
Dear all

For each bam file in my history I can download the associated bai (bam 
index) file.

I assume these files are stored somewhere under 
/mount/galaxy/database/files/_metadata_files. Correct? Is there an easy 
way to find the bam index file for a bam file, given only the internal 
file name of the bam (e.g. 
/mount/galaxy/database/files/089/dataset_89231.dat)?

I am asking because I would like to use the files_to_ftp tool to 
automatically download bams together with their associated indices.

Thanks
Ulf
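
A first step towards such a lookup is recovering the numeric dataset id from
the internal file name. The following is a minimal sketch based only on the
paths quoted in this archive (for example files/089/dataset_89231.dat). The
function names are hypothetical, and the three-digit directory rule is an
assumption inferred from those examples. Mapping the id to its index file
would presumably need a query against the Galaxy database, which is not
shown because the thread does not say where that association is stored.

```python
import os
import re

# File names of the form dataset_<id>.dat, as seen in the paths in this thread.
DATASET_FILE = re.compile(r"^dataset_(\d+)\.dat$")


def dataset_id_from_path(path):
    """Return the integer id embedded in a dataset_<id>.dat file name."""
    match = DATASET_FILE.match(os.path.basename(path))
    if match is None:
        raise ValueError("not a dataset_<id>.dat file name: %r" % path)
    return int(match.group(1))


def hash_dir(dataset_id):
    """Three-digit directory implied by the example paths (id // 1000).

    Assumption: only checked against the ids that appear in this thread.
    """
    return "%03d" % (dataset_id // 1000)


dataset_id = dataset_id_from_path("/mount/galaxy/database/files/089/dataset_89231.dat")
directory = hash_dir(dataset_id)
```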



Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-30 Thread Ulf Schaefer
Dear Nate, dear Peter

Again, sorry for the delay in replying.

Yes I can. It looks like this

[galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
[galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
blastdb.nhd  blastdb.nhi  blastdb.nhr  blastdb.nin  blastdb.nog 
blastdb.nsd  blastdb.nsi  blastdb.nsq

I think the simplest solution would be to put something in the primary 
file. Just a short string that gets the file size above 0.

I personally have followed your initial suggestion and made the dbs 
available globally via the .loc file.

Thanks again
Ulf


On 28/07/14 09:43, Peter Cock wrote:
 On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Nate, dear Peter

 Sorry for the delay in replying.

 I can import both HTML and blastdb from a history to a data library. If
 I try to get the data out of the library into another history, I am
 successful for the html but not for the blastdb. The problem seems to be
 that the primary data file (the /path/dataset_12345.dat) is empty for
 the blastdb, while the html primary file has something in it.

 OK. Can you tell where Galaxy thinks the library files are on disk,
 and check to see if the folder of BLAST database files is actually
 there?

 When I try to import the blastdb (from library to history) there is a
 message along the lines of "can't import empty file". I hypothesise
 (admittedly without having looked at a line of code) that there is a
 test for file size 0 somewhere that is either altogether unnecessary or,
 more likely, does not take into account that for composite datatypes it
 might be completely legitimate for the primary file to be empty.

 This guess makes sense - but I've not yet tried to trace through
 the code either.

 Or is my primary blastdb file not supposed to be empty in the first
 place? I can blast against it just fine.

 The BLAST databases do not define/populate a primary file, so
 Galaxy seems to create a dummy empty file on its own. I have
 wondered about altering the BLAST database datatype definition
 to have a human readable text file as the primary file (i.e. the
 information currently saved as a text log file when creating a
 database).

 Thanks a lot for your help
 Ulf

 You too - you've found an interesting bug...

 Peter




Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-28 Thread Ulf Schaefer
Dear Nate, dear Peter

Sorry for the delay in replying.

I can import both HTML and blastdb from a history to a data library. If 
I try to get the data out of the library into another history, I am 
successful for the html but not for the blastdb. The problem seems to be 
that the primary data file (the /path/dataset_12345.dat) is empty for 
the blastdb, while the html primary file has something in it.

When I try to import the blastdb (from library to history) there is a 
message along the lines of "can't import empty file". I hypothesise 
(admittedly without having looked at a line of code) that there is a 
test for file size 0 somewhere that is either altogether unnecessary or, 
more likely, does not take into account that for composite datatypes it 
might be completely legitimate for the primary file to be empty.

Or is my primary blastdb file not supposed to be empty in the first 
place? I can blast against it just fine.

Thanks a lot for your help
Ulf
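
The hypothesis above can be made concrete with a size check that also looks
at the composite files. The following is a minimal sketch and not Galaxy's
actual import code: the function names are hypothetical, and the only
convention assumed is the one described in this thread, where
/path/dataset_12345.dat has its extra files in /path/dataset_12345_files/.

```python
import os
import tempfile


def extra_files_dir(primary_path):
    """Map /path/dataset_12345.dat to /path/dataset_12345_files."""
    root, _ext = os.path.splitext(primary_path)
    return root + "_files"


def is_effectively_empty(primary_path):
    """True only if the primary file is empty AND no composite files exist."""
    if os.path.getsize(primary_path) > 0:
        return False
    files_dir = extra_files_dir(primary_path)
    return not (os.path.isdir(files_dir) and os.listdir(files_dir))


# Demonstration on throwaway files: an empty primary file on its own counts
# as empty, but not once its _files directory holds at least one file.
with tempfile.TemporaryDirectory() as tmp:
    primary = os.path.join(tmp, "dataset_12345.dat")
    open(primary, "w").close()
    empty_before = is_effectively_empty(primary)

    os.mkdir(extra_files_dir(primary))
    with open(os.path.join(extra_files_dir(primary), "blastdb.nin"), "w") as handle:
        handle.write("placeholder")
    empty_after = is_effectively_empty(primary)
```

A check along these lines would let a blastdb dataset through while still
rejecting a dataset that has neither primary content nor composite files.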

On 24/07/14 15:02, Peter Cock wrote:
 On Thu, Jul 24, 2014 at 2:50 PM, Nate Coraor n...@bx.psu.edu wrote:
 On Jul 23, 2014, at 6:42 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 Interesting hypothesis - you may well be right.

 Galaxy guys - who is the expert to talk to on this and/or where
 in the code should we be looking?

 Thanks,

 Peter

 I think there's a bit of a mixup here - Peter, I believe you were asking
 if other composite types with an html primary dataset could be imported
 from the history to library, but Ulf, your test was the other direction
 (library->history). I'd be interested in knowing the outcome of the
 history->library test as well.

 Good catch - yes, that was what I was asking about. Ulf?

 I am woefully ignorant about the blastdbn datatype. Is the primary
 file supposed to be html type but empty?

 The BLAST databases are 'basic' composite datatypes, of which
 the most commonly used example is HTML (and some bits of
 the base class code seem to assume HTML). This means
 testing if something works with HTML is a good first step.

 https://github.com/peterjc/galaxy_blast/tree/master/datatypes/blast_datatypes

 Peter




[galaxy-dev] Providing BLAST db in a data library

2014-07-23 Thread Ulf Schaefer
Dear all

I have several smallish BLAST databases that I would like to provide in 
a data library. I create them in a history with the makeblastdb tool and 
then try to add them to the library. I see that for each blast db there 
is an empty file created (like /path/dataset_12345.dat) and a folder 
with the same name (/path/dataset_12345_files/) that contains the actual 
db files (blastdb.n*).

In my library the blastdb shows up empty and I cannot import it back to 
another history. It does not seem to be aware of the _files folder, 
despite it being the right data type (blastdbn).

Any ideas what I am doing wrong?

Thanks a lot for your help
Ulf



Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-23 Thread Ulf Schaefer
Dear Peter

Thanks for your reply.

I can import an html report (e.g. FastQC output) successfully into a new 
history from a data library. But the .dat file for the html is not empty 
like the one for the blastdb. Makes me think that I could do this with a 
blast db as well, if only it would not check for size 0 at the time of 
importing it.

Thanks
Ulf

On 23/07/14 10:56, Peter Cock wrote:
 On Wed, Jul 23, 2014 at 10:47 AM, Ulf Schaefer ulf.schae...@phe.gov.uk 
 wrote:
 Dear all

 I have several smallish BLAST databases that I would like to provide in
 a data library. I create them in a history with the makeblastdb tool and
 then try to add them to the library. I see that for each blast db there
 is an empty file created (like /path/dataset_12345.dat) and a folder
 with the same name (/path/dataset_12345_files/) that contains the actual
 db files (blastdb.n*).

 In my library the blastdb shows up empty and I cannot import it back to
 another history. It does not seem to be aware of the _files folder,
 despite it being the right data type (blastdbn).

 Any ideas what I am doing wrong?

 Thanks a lot for your help
 Ulf

 Hi Ulf,

 I've never tried that. It could be a bug in Galaxy importing
 composite datatypes into a library, or something in the BLAST
 database definition which needs fixing. Does importing an
 HTML report (with child files like images) into a library work
 for you? (This is another composite datatype so a useful
 comparison).

 Rather than using Data Libraries, we just list all the locally
 installed shared BLAST databases via the BLAST *.loc
 files instead.

 Note using the *.loc files makes the databases available to
 all the Galaxy users, while with a Data Library you can
 control access to specific groups/roles.

 Regards,

 Peter



[galaxy-dev] Rename output from a repeat

2014-07-18 Thread Ulf Schaefer
Hi all

We frequently use the syntax below to rename outputs of workflows that 
we run in batch. It is convenient to have sample names from fastqs 
carried over to sams, bams, vcfs, etc.

#{input1 | basename}.bam

This does not seem to be working for inputs that are in repeats, e.g. 
the VelvetOptimiser. Does anybody know if there is a syntax to make this 
work, maybe

#{repeatname[0].input1 | basename}.bam ?
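To make the rename syntax concrete, here is a toy expander for "#{name | basename}" templates. It is only an illustration of the idea: the regular expression, the expand() function and the assumption that basename strips the directory and the last extension are all hypothetical, and Galaxy's real implementation may differ.

```python
import os
import re

# Toy pattern: a plain parameter name, optionally followed by "| operation".
TEMPLATE_RE = re.compile(r"#\{\s*(\w+)\s*(?:\|\s*(\w+)\s*)?\}")


def expand(template, input_names):
    """Replace each #{name | op} with the matching input's (transformed) name."""
    def substitute(match):
        name, operation = match.group(1), match.group(2)
        value = input_names[name]
        if operation == "basename":
            # Assumed meaning: drop the directory and the last extension.
            value = os.path.splitext(os.path.basename(value))[0]
        return value

    return TEMPLATE_RE.sub(substitute, template)


if __name__ == "__main__":
    print(expand("#{input1 | basename}.bam", {"input1": "sample42.fastq"}))
    print(expand("#{repeatname[0].input1 | basename}.bam", {"input1": "x.fastq"}))
```

In this toy version a reference such as repeatname[0].input1 is not a plain name, so the template is left unexpanded. Whether Galaxy fails for the same reason is not confirmed here.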

Thanks a lot for your help
Ulf


[galaxy-dev] Limits for enumerate for multiple input files

2014-04-22 Thread Ulf Schaefer
Dear all

I am using this control to allow the user to input multiple files:

<param name="input_vcfs" type="data" multiple="true" format="vcf" 
label="Input VCF file(s)" />

and I am using this for loop in the cheetah code to access the control:
<command interpreter="bash">
script.sh
...
#for $i, $input_vcf in enumerate( $input_vcfs ):
 ${input_vcf},
#end for
</command>

It appears that when a user selects many files (25 in this case) the 
bash command in the command tag never gets executed. Therefore the job 
is never queued. The history item shows 'Waiting to run' indefinitely. 
Calling the script.sh manually with 25 input files works fine.

Any hint as to how to debug this would be greatly appreciated.

Thanks a lot
Ulf


Re: [galaxy-dev] Limits for enumerate for multiple input files

2014-04-22 Thread Ulf Schaefer
Hi Peter

I removed the unnecessary code.

If I run the tool with just a couple of inputs I see entries in the log 
files either from galaxy.jobs.runners.drmaa or from 
galaxy.jobs.runners.local that the job is being dispatched as normal.

Unfortunately there is no sign of the job in the log files when using 
more input files.

The command line that is supposed to be run is:

bash home/galaxy/galaxy-dist/tools/vcf_processing/vcf_to_fasta.sh 
/galaxy/database/files/042/dataset_42275.dat 40 10 50 0 40 0.9 20 
/galaxy/database/files/041/dataset_41720.dat, 
/galaxy/database/files/041/dataset_41980.dat,

the first dat file being the output and the ones at the end being a 
comma separated list of the input files. On the command line this 
command works with much longer input file lists.
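As a small sketch of how a shell would split a command line rendered this way, the standard shlex module can be used to tokenize it. The file names below are shortened placeholders, and join_inputs() is a hypothetical helper showing how a single comma-separated argument could be built instead.

```python
import shlex


def tokenize(command_line):
    """Split a command line the way a POSIX shell would split unquoted words."""
    return shlex.split(command_line)


def join_inputs(paths):
    """Build one comma-separated argument without spaces or a trailing comma."""
    return ",".join(paths)


if __name__ == "__main__":
    rendered = "bash script.sh out.dat 40 in_1.dat, in_2.dat,"
    # Each input arrives as its own argument, still carrying its comma.
    print(tokenize(rendered))
    print(join_inputs(["in_1.dat", "in_2.dat"]))
```

With the template as written, the script receives one argument per input file, each ending in a comma, so the script has to strip those itself. Joining the paths before rendering yields a single clean argument.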

Any ideas? Or is there a better practice to pass a large number of input 
files to a bash script?
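One common way to pass many input files to a script is to write the paths to a list file and pass only that file's path. The sketch below is an assumption about how this could be done, not an established Galaxy practice; the file name inputs.txt and both helper functions are hypothetical, and the script would need to be changed to read the list.

```python
import os
import tempfile


def write_input_list(paths, directory):
    """Write one input path per line and return the path of the list file."""
    list_path = os.path.join(directory, "inputs.txt")
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return list_path


def read_input_list(list_path):
    """Read the paths back, skipping blank lines."""
    with open(list_path) as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


if __name__ == "__main__":
    inputs = ["/data/in_%02d.vcf" % i for i in range(25)]
    with tempfile.TemporaryDirectory() as workdir:
        list_file = write_input_list(inputs, workdir)
        print(len(read_input_list(list_file)))
```

This keeps the command line the same length regardless of how many files the user selects.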

Thanks
Ulf

On 22/04/14 15:26, Peter Cock wrote:
 On Tue, Apr 22, 2014 at 3:02 PM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear all

 I am using this control to allow the user to input multiple files:

 <param name="input_vcfs" type="data" multiple="true" format="vcf"
 label="Input VCF file(s)" />

 and I am using this for loop in the cheetah code to access the control:
 <command interpreter="bash">
 script.sh
 ...
 #for $i, $input_vcf in enumerate( $input_vcfs ):
   ${input_vcf},
 #end for
 </command>

 The $i and enumerate seem unnecessary here.

 It appears that when a user selects many files (25 in this case) the
 bash command in the command tag never gets executed. Therefore the job
 is never queued. The history item shows 'Waiting to run' indefinitely.
 Calling the script.sh manually with 25 input files works fine.

 Any hint as to how to debug this would be greatly appreciated.

 Thanks a lot
 Ulf

 Can you see anything in the log about the job, and in particular
 the command line it would attempt to run?

 Peter



Re: [galaxy-dev] sort list of ftp uploaded files

2013-12-18 Thread Ulf Schaefer
Dear John, dear Curtis, and all

all that is required to fix my current issue is that the files show up 
in the 'Upload File' tool form in alphabetical order.

I assume that at some point an 'ls' command is done on the users' ftp 
home folder. The result of that could possibly be stored in a hash of 
some sort, which could explain why the order is randomised when the 
files come back out to go on the form. But I might be wrong.
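A short sketch of the suspected cause and the minimal fix, assuming the listing is produced with something like Python's os.listdir(), which is documented to return entries in arbitrary order. The function name list_uploads() is hypothetical; where Galaxy actually builds this list is not known here.

```python
import os
import tempfile


def list_uploads(directory):
    """Return the regular files in a directory in alphabetical order."""
    return sorted(
        name
        for name in os.listdir(directory)  # arbitrary order on most file systems
        if os.path.isfile(os.path.join(directory, name))
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as ftp_home:
        for name in ("b_R1.fastq", "a_R2.fastq", "a_R1.fastq"):
            open(os.path.join(ftp_home, name), "w").close()
        print(list_uploads(ftp_home))
```

Sorting at the point where the listing is produced would give the same order on the form and in the history.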

Of course a better but more involved solution would be if the 'Files 
uploaded via FTP' table was sortable, e.g. by clicking on the table headers.

Any idea where I might find the code that populates this table?

Thanks a lot for your help.

Cheers
Ulf


On 17/12/13 22:31, Curtis Hendrickson (Campus) wrote:
 John,

 I can't speak for Ulf, but a more general solution would be to allow sorting
 by a standard set of Unix sort keys.
 That would allow things like - sort by the sample ID after the 3rd dash, then 
 by the read direction.
   -t - -k1,1n -k2,2r # http://unixhelp.ed.ac.uk/CGI/man-cgi?sort
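For comparison, the same two-key ordering can be sketched in Python: split on "-", order by the first field numerically, then by the second field in reverse. The function names and the example file names are hypothetical, and details such as locale collation and tie-breaking in sort(1) are not reproduced.

```python
import re


def leading_number(field):
    """Mimic numeric sort loosely: use the leading digits, or 0 if there are none."""
    match = re.match(r"\s*(\d+)", field)
    return int(match.group(1)) if match else 0


def second_field(name):
    parts = name.split("-")
    return parts[1] if len(parts) > 1 else ""


def sort_by_keys(names):
    """Order by field 1 (numeric, ascending), then field 2 (text, descending)."""
    # Python's sort is stable, so sort by the secondary key first.
    by_second = sorted(names, key=second_field, reverse=True)
    return sorted(by_second, key=lambda name: leading_number(name.split("-")[0]))


if __name__ == "__main__":
    print(sort_by_keys(["10-R1.fastq", "2-R1.fastq", "2-R2.fastq"]))
```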

 Regards,
 Curtis

 PS - granted simple filename sort is a good place to start and  a lot better 
 than nothing. Date sort is often useful, too.


 -Original Message-
 From: galaxy-dev-boun...@lists.bx.psu.edu 
 [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of John Chilton
 Sent: Tuesday, December 17, 2013 4:03 PM
 To: Ulf Schaefer
 Cc: galaxy-dev@lists.bx.psu.edu
 Subject: Re: [galaxy-dev] sort list of ftp uploaded files

 If Galaxy just sorted these files alphabetically before display/import would 
 that fix your problem? Or do your users need to be able to modify the order?

 -John

 On Mon, Dec 16, 2013 at 9:19 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear all

 Is there a way to sort the list of files that a user has uploaded via
 FTP? Currently they appear on the form of the tool 'Upload File' in
 random order it seems. There is not much documentation on the <param
 type="ftpfile" /> tag unfortunately.

 If the user just selects all files they end up in his/her history in
 the same random order, which causes a slight inconvenience when using
 these files as paired multiple input for a workflow.

 The obvious work-around is to put the files into the history one pair
 at a time, but this becomes a bit onerous for larger number of files.

 Thanks a lot for your help.

 Cheers
 Ulf


Re: [galaxy-dev] sort list of ftp uploaded files

2013-12-18 Thread Ulf Schaefer
Dear John

Thanks very much. This is the perfect (minimal) solution that I need 
right now.

I am looking forward to your work on pairs of data. It sounds like it 
will be exceptionally useful for us.

Thanks again and best wishes
Ulf

On 18/12/13 14:29, John Chilton wrote:
 Couple things...

 I just pushed a commit to galaxy-central to sort those files by
 default - that is a clear improvement:

 https://bitbucket.org/galaxy/galaxy-central/commits/ce186eb5fefcb7ff332c73bd0869b9e93d0b

 That is clearly not enough, though: it would be nice to have more advanced
 sorting options, so I have created a Trello card -

 https://trello.com/c/0hgUrW4H

 If you really wanted to dig into this there are many files that may be
 relevant -

 templates/embed_base.mako
 lib/galaxy/web/form_builder.py
 lib/galaxy/tools/parameters/basic.py
 lib/galaxy/web/framework/helpers/grids.py

 Hopefully this helps.

 Also just as a heads up, coming from a different direction, Martin and I
 are actively working on building abstractions for pairs of data
 and that will have to include a nice UI for building these pairs - our
 plan is to start with library imports but we will try to architect the
 creation widget to be able to target history items as well. This would
 largely negate the need to import them in any particular order.

 This will likely not be in the next release, but hopefully the
 following one (I can build pairs via the API now, but I want to be
 able to do useful stuff with them before committing :) ).

 -John

 On Wed, Dec 18, 2013 at 2:37 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear John, dear Curtis, and all

 all that is required to fix my current issue is that the files show up
 in the 'Upload File' tool form in alphabetical order.

 I assume that at some point an 'ls' command is done on the users' ftp
 home folder. The result of that could possibly be stored in a hash of
 some sort, which could explain why the order is randomised when the
 files come back out to go on the form. But I might be wrong.

 Of course a better but more involved solution would be if the 'Files
 uploaded via FTP' table was sortable, e.g. by clicking on the table headers.

 Any idea where I might find the code that populates this table?

 Thanks a lot for your help.

 Cheers
 Ulf


 On 17/12/13 22:31, Curtis Hendrickson (Campus) wrote:
 John,

 I can't speak for Ulf, but a more general solution would be to allow
 sorting by a standard set of Unix sort keys.
 That would allow things like - sort by the sample ID after the 3rd dash, 
 then by the read direction.
-t - -k1,1n -k2,2r # http://unixhelp.ed.ac.uk/CGI/man-cgi?sort

 Regards,
 Curtis

 PS - granted simple filename sort is a good place to start and  a lot 
 better than nothing. Date sort is often useful, too.


 -Original Message-
 From: galaxy-dev-boun...@lists.bx.psu.edu 
 [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of John Chilton
 Sent: Tuesday, December 17, 2013 4:03 PM
 To: Ulf Schaefer
 Cc: galaxy-dev@lists.bx.psu.edu
 Subject: Re: [galaxy-dev] sort list of ftp uploaded files

 If Galaxy just sorted these files alphabetically before display/import 
 would that fix your problem? Or do your users need to be able to modify the 
 order?

 -John

 On Mon, Dec 16, 2013 at 9:19 AM, Ulf Schaefer ulf.schae...@phe.gov.uk 
 wrote:
 Dear all

 Is there a way to sort the list of files that a user has uploaded via
 FTP? Currently they appear on the form of the tool 'Upload File' in
 random order it seems. There is not much documentation on the <param
 type="ftpfile" /> tag unfortunately.

 If the user just selects all files they end up in his/her history in
 the same random order, which causes a slight inconvenience when using
 these files as paired multiple input for a workflow.

 The obvious work-around is to put the files into the history one pair
 at a time, but this becomes a bit onerous for larger number of files.

 Thanks a lot for your help.

 Cheers
 Ulf


[galaxy-dev] sort list of ftp uploaded files

2013-12-16 Thread Ulf Schaefer
Dear all

Is there a way to sort the list of files that a user has uploaded via 
FTP? Currently they appear on the form of the tool 'Upload File' in 
random order it seems. There is not much documentation on the <param 
type="ftpfile" /> tag unfortunately.

If the user just selects all files they end up in his/her history in the 
same random order, which causes a slight inconvenience when using these 
files as paired multiple input for a workflow.

The obvious work-around is to put the files into the history one pair at 
a time, but this becomes a bit onerous for larger number of files.

Thanks a lot for your help.

Cheers
Ulf


[galaxy-dev] workflow batch execution dynamic parameters

2013-10-02 Thread Ulf Schaefer
Dear all

Related to my question yesterday I have another one:

If I run a workflow in batch I can select pairs of files (the old fwd 
and rev reads fastq files) and the workflow is run however many times, 
depending on the number of pairs I select.

Unfortunately all other parameters appear to be static, which is not 
ideal in my case. Specifically I am using bwa to map a large-ish number 
of (paired) samples to the same reference genome, while simultaneously 
setting read group information in the resulting sam files. So it's 
suboptimal to run this with a fixed SM value.

Is there any way of dynamically setting the parameters? The 
set-at-runtime option also allows only a single parameter value, not a 
list. Ideally the set-at-runtime option would somehow allow me to input 
one parameter per file pair.
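To illustrate what a per-pair parameter could look like, the sketch below derives a sample name from each pair of fastq file names and builds a read group header line with a matching SM value. The _R1/_R2 naming convention, the helper names and the choice to reuse the sample name as the ID are all assumptions for the example; how such a value would be fed into a Galaxy workflow is left open.

```python
import os
import re

# Hypothetical naming convention: <sample>_R1.fastq / <sample>_R2.fastq
PAIR_SUFFIX = re.compile(r"_R?[12]$")


def sample_name(fastq_path):
    """Strip the directory, known extensions and the pair suffix."""
    stem = os.path.basename(fastq_path)
    for ext in (".gz", ".fastq", ".fq"):
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
    return PAIR_SUFFIX.sub("", stem)


def read_group_line(fwd_path, rev_path):
    """Build an @RG header line whose SM value follows the file pair."""
    fwd_name, rev_name = sample_name(fwd_path), sample_name(rev_path)
    if fwd_name != rev_name:
        raise ValueError("files do not look like a pair: %s, %s" % (fwd_path, rev_path))
    return "@RG\tID:{0}\tSM:{0}".format(fwd_name)


if __name__ == "__main__":
    print(read_group_line("/data/sampleA_R1.fastq", "/data/sampleA_R2.fastq"))
```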

Thanks very much in advance for your help.

Cheers
Ulf


On 01/10/13 14:09, Joachim Jacob | VIB | wrote:
 Sorry, my answer doesn't fit your question. :-)
 J

 Joachim Jacob
 Contact details: http://www.bits.vib.be/index.php/about/80-team


 On 10/01/2013 03:08 PM, Joachim Jacob | VIB | wrote:
 Hi Ulf,

 What I do:
 1. make a history, doing the steps you want to do on one input file
 2. create a workflow of that history
 3. assemble all input files in one history
 4. run the workflow and select the multiple input files to run the
 workflow on.
 5. Optionally: send the results to a new history, for every input file
 you will get a new history, properly named.

 Hope this helps,

 Joachim

 Joachim Jacob
 Contact details: http://www.bits.vib.be/index.php/about/80-team


 On 10/01/2013 02:27 PM, Ulf Schaefer wrote:
 Dear all

 We frequently find ourselves in situations where a tool needs to be run
 with a lot of input files. For example, run the GATK UnifiedGenotyper
 with easily dozens of bam files.

 Using the <repeat> in this case requires quite a bit of clicking. Is
 there a more convenient way of doing this? Maybe similar to the
 multi-file-select that is possible for workflow inputs?

 I saw some older discussions on this or similar issues, but I am a bit
 lost what the current official stable proposed solution for this is.

 Thanks for your help
 Ulf


[galaxy-dev] multiple input files

2013-10-01 Thread Ulf Schaefer
Dear all

We frequently find ourselves in situations where a tool needs to be run 
with a lot of input files. For example, run the GATK UnifiedGenotyper 
with easily dozens of bam files.

Using the <repeat> in this case requires quite a bit of clicking. Is 
there a more convenient way of doing this? Maybe similar to the 
multi-file-select that is possible for workflow inputs?

I saw some older discussions on this or similar issues, but I am a bit 
lost what the current official stable proposed solution for this is.

Thanks for your help
Ulf


Re: [galaxy-dev] multiple input files

2013-10-01 Thread Ulf Schaefer
Dear Peter, dear John, dear all

Thanks very much. This is exactly what I need and the code changes 
proposed work perfectly as they are. At a short glance even the 
validators work as intended.

Now for the optional special bonus:

Is there a way to define the size of the boxes? The ones I see have 4 
lines. Can I make them resizable somehow or define a larger size? 
(size="10" or resize="true" does not seem to do the trick.) Not a very 
pressing issue though.

@Joachim: I am using the same approach elsewhere, but the difference is 
that it runs a tool multiple times, instead of running it once on 
multiple inputs. But thanks anyway.

Cheers
Ulf


On 01/10/13 14:08, John Chilton wrote:
 Thanks Peter,

 Yeah, looking over unified_genotyper.xml, you would probably just
 want to replace:

 #for $i, $input_bam in enumerate( $reference_source.input_bams ):
     -d -I ${input_bam.input_bam} ${input_bam.input_bam.ext} gatk_input_${i}
     #if str( $input_bam.input_bam.metadata.bam_index ) != None:
         -d  ${input_bam.input_bam.metadata.bam_index} bam_index gatk_input_${i} ##hardcode galaxy ext type as bam_index
     #end if
 #end for

 with

 #for $i, $input_bam in enumerate( $reference_source.input_bams ):
     -d -I ${input_bam} ${input_bam.ext} gatk_input_${i}
     #if str( $input_bam.metadata.bam_index ) != None:
         -d  ${input_bam.metadata.bam_index} bam_index gatk_input_${i} ##hardcode galaxy ext type as bam_index
     #end if
 #end for

 And:

  <repeat name="input_bams" title="BAM file" min="1"
          help="-I,--input_file &lt;input_file&gt;">
    <param name="input_bam" type="data" format="bam" label="BAM file">
      <validator type="unspecified_build" />
      <validator type="dataset_metadata_in_data_table"
                 table_name="gatk_picard_indexes" metadata_name="dbkey"
                 metadata_column="dbkey"
                 message="Sequences are not currently available for the specified build." />
      <!-- fixme!!! this needs to be a select -->
    </param>
  </repeat>

 with

  <param name="input_bams" type="data" multiple="true"
         format="bam" label="BAM file">
    <validator type="unspecified_build" />
    <validator type="dataset_metadata_in_data_table"
               table_name="gatk_picard_indexes" metadata_name="dbkey"
               metadata_column="dbkey"
               message="Sequences are not currently available for the specified build." />
    <!-- fixme!!! this needs to be a select -->
  </param>


 And:

  <repeat name="input_bams" title="BAM file" min="1"
          help="-I,--input_file &lt;input_file&gt;">
    <param name="input_bam" type="data" format="bam" label="BAM file">
    </param>
  </repeat>

 with

  <param name="input_bams" multiple="true" type="data"
         format="bam" label="BAM file">
  </param>

 I have not tested these specific changes, so your mileage may vary. I
 have no clue if those validators in that second block are going to
 work with multiple="true". If you test it out and there is some
 problem, please let me know and I can try to fix it.

 I don't know if the Galaxy team wants to start replacing these blocks
 in its tools - this change would break existing workflows built on the
 unified genotyper and I am not sure the Galaxy team has any interest
 in using this kind of data block going forward. They are working out
 great for us on Galaxy-P though.

 -John


 On Tue, Oct 1, 2013 at 7:57 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Tue, Oct 1, 2013 at 1:27 PM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear all

 We frequently find ourselves in situations where a tool needs to be run
 with a lot of input files. For example, run the GATK UnifiedGenotyper
 with easily dozens of bam files.

 Using the <repeat> in this case requires quite a bit of clicking. Is
 there a more convenient way of doing this? Maybe similar to the
 multi-file-select that is possible for workflow inputs?

 I saw some older discussions on this or similar issues, but I am a bit
 lost what the current official stable proposed solution for this is.

 Thanks for your help
 Ulf

 Use <param type="data" multiple="true" ... /> instead of using
 <repeat ...><param type="data" .../></repeat>.

 I asked John Chilton about this recently via Twitter, and it is
 now on the wiki,
 http://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax

 Simple example here:
 https://bitbucket.org/galaxyp/galaxyp-toolshed-iquant/src/tip/iquant.xml?at=default

 Peter

Re: [galaxy-dev] Visualisation of VCF files in trackster

2013-09-12 Thread Ulf Schaefer
Dear Jeremy

Thank you for your reply and sorry for not being clear. In short I 
solved the problem. Below is some info, in case this is useful for 
someone else.

Thanks for your help

The situation was:

On Main:
Visualisation of the SAM/BAM file - OK
Visualisation of the VCF file - OK

On my local install:
Visualisation of the SAM/BAM file - OK
Visualisation of the VCF file - FAIL

The reason is that this command fails:

grep -v '^#' /data/database/files/000/dataset_596.dat | sort -k1,1 | 
bedtools genomecov -bg -split -i stdin -g 
/data/database/files/000/dataset_598.dat > temp.bg ; bedGraphToBigWig 
temp.bg /data/database/files/000/dataset_598.dat 
/data/database/files/000/dataset_609.dat

with "Input error: found interval with block-counts not matching 
starts/sizes"

Where dataset_596.dat is my vcf and 
/data/database/files/000/dataset_598.dat is my genome file.

This is produced by the bedtools genomecov bit of the command, which 
appears to have some sort of problem with the vcf input in combination 
with the -split option. The problem disappears with the installation of 
the latest version of bedtools (v2.17.0), but if you are using the 
version that you get from yum (v2.15.0) you run into this error.
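A small sketch for checking whether an installed bedtools is at least the version reported to work here. It only parses a version string that is passed in; the minimum of 2.17.0 is taken from this report (2.15.0 failed, 2.17.0 worked, nothing in between was tested), and the function names are hypothetical.

```python
import re

# Assumption from the report above: 2.17.0 works, 2.15.0 does not.
MINIMUM_VERSION = (2, 17, 0)


def parse_version(text):
    """Extract a (major, minor, patch) tuple from text such as 'v2.17.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def is_recent_enough(version_text, minimum=MINIMUM_VERSION):
    """Compare versions numerically, so that 2.9.0 sorts before 2.17.0."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    for candidate in ("v2.15.0", "v2.17.0"):
        print(candidate, is_recent_enough(candidate))
```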

Ulf

**
The information contained in the EMail and any attachments is confidential and 
intended solely and for the attention and use of the named addressee(s). It may 
not be disclosed to any other person without the express authority of Public 
Health England, or the intended recipient, or both. If you are not the intended 
recipient, you must not disclose, copy, distribute or retain this message or 
any part of it. This footnote also confirms that this EMail has been swept for 
computer viruses by Symantec.Cloud, but please re-sweep any attachments before 
opening or saving. http://www.gov.uk/PHE
**

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


[galaxy-dev] phylib file type

2013-09-11 Thread Ulf Schaefer
Hello all

Has anyone successfully integrated the phylib file type into Galaxy in 
their local install? It's a standard file type for multiple sequence 
alignment and I could not possibly be the only one who wants to support 
it in a local install.

I saw the instructions on the wiki for adding new file types. I would 
guess that in addition to the changes in the datatypes_conf.xml I would 
have to derive a new class in galaxy.datatypes.sequence from the 
existing Alignment class.

Is there a commit where this is already included or will I actually have 
to code it myself? How can I find out?

Thanks a lot
Ulf

PS.: hg tip
changeset:   10201:ebe87051fadf
tag:         tip
parent:      10199:8bf64d933704
user:        Dannon Baker <dannonba...@me.com>
date:        Tue Jul 02 10:48:31 2013 -0400
summary:     Fix two more downgrade invocations to accept the 
migrate_engine parameter


Re: [galaxy-dev] phylib file type

2013-09-11 Thread Ulf Schaefer
Dear Peter

Thank you for your email.

Yes. This is the file type I mean. Sorry for the typo.

Any chance there is a pretty class with appropriate sniff functions etc. 
somewhere? Currently it looks a little bit like a hack on my side.
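
To make clear what I mean by a sniff function, here is a rough, standalone 
Python sketch of the kind of check I have in mind. It is not taken from 
Galaxy or from the EMBOSS datatypes repository; the function name is 
hypothetical, and it only assumes that a PHYLIP file starts with a header 
line giving the number of taxa and the number of characters, followed by 
at least one line per taxon. It does not try to tell strict from relaxed 
or interleaved from sequential files. In a real Galaxy datatype the same 
logic would presumably live in the class's sniff method and read from a 
file name rather than a string.

```python
def looks_like_phylip(text):
    """Heuristic check: does this text start like a PHYLIP alignment?

    Assumption: the first non-blank line holds two positive integers
    (number of taxa, number of characters), and at least that many
    non-blank lines follow.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return False

    header = lines[0].split()
    if len(header) < 2 or not (header[0].isdigit() and header[1].isdigit()):
        return False

    n_taxa, n_chars = int(header[0]), int(header[1])
    if n_taxa < 1 or n_chars < 1:
        return False

    # Both sequential and interleaved layouts need at least one line per taxon.
    return len(lines) - 1 >= n_taxa
```

A header-only check like this will accept some files that are not valid 
PHYLIP, so it would need tightening before being relied on for automatic 
type detection.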

Thanks a lot for your help
Ulf

On 11/09/13 10:16, Peter Cock wrote:
 On Wed, Sep 11, 2013 at 9:46 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Hello all

 Has anyone successfully integrated the phylib file type into Galaxy in
 their local install? It's a standard file type for multiple sequence
 alignment and I could not possibly be the only one who wants to support
 it in a local install.

 Do you mean phylip with a p at the end? If so try this:
 http://toolshed.g2.bx.psu.edu/view/devteam/emboss_datatypes
 http://emboss.sourceforge.net/docs/themes/SequenceFormats.html

 Note however there are both strict and relaxed interpretations
 of the PHYLIP standard (the original version imposed taxon
 name limits), and also both interlaced/interleaved and
 sequential forms.

 The EMBOSS repository has phylipnon for non-interleaved
 (i.e. sequential) PHYLIP format, and phylip for either.

 Peter



[galaxy-dev] Visualisation of VCF files in trackster

2013-09-11 Thread Ulf Schaefer
Dear all

My attempts to visualise a VCF file in Trackster fail. I can visualise 
the corresponding SAM/BAM files fine. I can also visualise the same VCF 
file on Galaxy Main, but not in my local install.

I receive the following error message:

Trackster Error
Input error: found interval with block-counts not matching starts/sizes 
on line.
sort: write failed: standard output: Broken pipe
sort: write error
needLargeMem: trying to allocate 0 bytes (limit: 1000)

The VCF file is in format version 4.1. I am creating it from the sorted 
BAM file with samtools mpileup in the most basic way.

What am I missing?

Thanks a lot for your help
Ulf


[galaxy-dev] disable PBKDF2, revert to SHA1

2013-08-19 Thread Ulf Schaefer
Hello all

Several recent posts (e.g. this one: 
http://dev.list.galaxyproject.org/support-pbkdf2-in-proftpd-1-3-5rc3-td4660836.html)
indicate that, for Galaxy to work reliably with ProFTPD, it is 
recommended to disable the new PBKDF2 hashing of passwords and revert to 
SHA1.

Unfortunately it appears to be beyond me to figure out how to do this. 
Could somebody point me in the right direction, please?
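
To illustrate why the two schemes do not mix, here is a small Python 
sketch of how I understand the difference. It is only an illustration: 
the function names, the salt, the iteration count, the key length and the 
"PBKDF2$..." record layout are my assumptions, not necessarily what Galaxy 
actually stores. The point is that a ProFTPD SQL setup that compares 
against a plain SHA1 hex digest cannot verify a salted, iterated PBKDF2 
record. I believe the Galaxy sample configuration has a use_pbkdf2 option 
for this, but that needs checking against the sample config shipped with 
the release in use.

```python
import base64
import hashlib


def sha1_hex(password):
    """Legacy scheme: unsalted SHA1 hex digest of the password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def pbkdf2_record(password, salt, iterations=10000, key_length=24):
    """Newer scheme: salted, iterated PBKDF2-HMAC-SHA256.

    The record layout below is an assumed, illustrative format.
    """
    derived = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        salt.encode("utf-8"),
        iterations,
        dklen=key_length,
    )
    encoded = base64.b64encode(derived).decode("ascii")
    return "PBKDF2$sha256${0}${1}${2}".format(iterations, salt, encoded)
```

The SHA1 value depends only on the password, so any external program can 
recompute it; the PBKDF2 record also depends on the salt and iteration 
count, which the external program would have to parse and honour.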

Thanks a lot
Ulf
