[galaxy-dev] dynamically send jobs to second cluster on high load

2012-09-24 Thread Geert Vandeweyer

Hi,

The admin pages state that it is possible to specify multiple clusters 
in the universe file. We are currently investigating whether we can couple 
the university HPC platform to Galaxy, to handle usage peaks. It would 
be ideal if the job manager checked the load of the dedicated 
cluster (e.g. queue length) and sent jobs to the second cluster when load 
is above a threshold.


Does such an approach exist already, or will it become available in the 
near future? As far as I understand, it is currently only possible to specify 
which jobs run on which cluster, without dynamic switching?


Best regards,

Geert

--

Geert Vandeweyer, Ph.D.
Department of Medical Genetics
University of Antwerp
Prins Boudewijnlaan 43
2650 Edegem
Belgium
Tel: +32 (0)3 275 97 56
E-mail: geert.vandewe...@ua.ac.be
http://ua.ac.be/cognitivegenetics
http://www.linkedin.com/pub/geert-vandeweyer/26/457/726

___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

 http://lists.bx.psu.edu/


Re: [galaxy-dev] The user creation and login script can be injected with executable javascript in Galaxy

2012-09-24 Thread Dannon Baker
Hanfei,

I'd be happy to take a look at the report and share it with the rest of the 
team if you'd like to send it directly to me.

Regarding SSL, this is definitely something that you can set up for your own 
instance; see the wiki documentation on configuring an nginx proxy: 
http://wiki.g2.bx.psu.edu/Admin/Config/Performance/nginx%20Proxy.

Thanks!

-Dannon

On Sep 24, 2012, at 12:01 AM, Hanfei Sun ad9...@gmail.com wrote:

 Hello Galaxy-team,
 
 We host a Galaxy instance on our server. 
 Last week, a security expert ran some tests against it. He 
 warned us that the user creation and login scripts in Galaxy can be injected 
 with executable JavaScript, which may leave our server vulnerable.
 
 He gave us a three-page report (other issues include the Galaxy password and 
 cookie being sent over non-SSL connections). 
 We don't know whether these issues are serious or whether we need to fix them 
 immediately. 
 Will Galaxy be updated to address them, or do we need to modify the code ourselves? 
 Any suggestion is appreciated.
 Thanks!
 
 
 -- 
 Hanfei Sun
 Sent with Sparrow
 


[galaxy-dev] Error when running cleanup_datasets.py

2012-09-24 Thread Liisa Koski
Hello,
I am trying to run the cleanup scripts on my local installation but get 
stuck when trying to run the following:

./scripts/cleanup_datasets/cleanup_datasets.py universe_wsgi.ini -d 10 -5 -r

Deleting library dataset id  7225
Traceback (most recent call last):
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 524, in <module>
    if __name__ == "__main__": main()
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 124, in main
    purge_folders( app, cutoff_time, options.remove_from_disk, info_only = options.info_only, force_retry = options.force_retry )
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 247, in purge_folders
    _purge_folder( folder, app, remove_from_disk, info_only = info_only )
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 497, in _purge_folder
    _purge_folder( sub_folder, app, remove_from_disk, info_only = info_only )
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 497, in _purge_folder
    _purge_folder( sub_folder, app, remove_from_disk, info_only = info_only )
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 495, in _purge_folder
    _purge_dataset_instance( ldda, app, remove_from_disk, info_only = info_only ) #mark a DatasetInstance as deleted, clear associated files, and mark the Dataset as deleted if it is deletable
  File "./scripts/cleanup_datasets/cleanup_datasets.py", line 376, in _purge_dataset_instance
    ( dataset_instance.__class__.__name__, dataset_instance.id, dataset_instance.dataset.id )
AttributeError: 'NoneType' object has no attribute 'id'
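The traceback shows that the library dataset association being purged has no backing Dataset row, so dereferencing `dataset_instance.dataset.id` raises the AttributeError. A defensive guard along the following lines would let the script log and skip such orphaned rows instead of crashing; the class and field names below are simplified stand-ins for illustration, not Galaxy's actual model classes.

```python
# Sketch of a None-guard for the failure above.  The LDDA's backing
# Dataset is None, so .dataset.id blows up; check for None first.
# Class names here are hypothetical stand-ins, not Galaxy's real model.

class Dataset(object):
    def __init__(self, id):
        self.id = id

class LibraryDatasetDatasetAssociation(object):
    def __init__(self, id, dataset):
        self.id = id
        self.dataset = dataset  # may be None for an orphaned row

def describe_purge(dataset_instance):
    """Return a log line for the purge, skipping orphans instead of crashing."""
    if dataset_instance.dataset is None:
        return "Skipping %s %d: no associated Dataset row" % (
            dataset_instance.__class__.__name__, dataset_instance.id)
    return "Purging %s %d (Dataset %d)" % (
        dataset_instance.__class__.__name__, dataset_instance.id,
        dataset_instance.dataset.id)
```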


Any help would be much appreciated.

Thanks,
Liisa

[galaxy-dev] Can't edit Galaxy Workflow: _ElementInterface instance has no attribute 'render'

2012-09-24 Thread Liisa Koski
Hello,
After updating to the Sept. 07 distribution I am having problems editing 
an existing workflow.


Server error
URL: 
http://galaxy_url/workflow/load_workflow?id=ba751ee0539fff04&_=1348501448807
Module paste.exceptions.errormiddleware:143 in __call__
  app_iter = self.application(environ, start_response)
Module paste.debug.prints:98 in __call__
  environ, self.app)
Module paste.wsgilib:539 in intercept_output
  app_iter = application(environ, replacement_start_response)
Module paste.recursive:80 in __call__
  return self.application(environ, start_response)
Module paste.httpexceptions:632 in __call__
  return self.application(environ, start_response)
Module galaxy.web.framework.base:160 in __call__
  body = method( trans, **kwargs )
Module galaxy.web.framework:69 in decorator
  return simplejson.dumps( func( self, trans, *args, **kwargs ) )
Module galaxy.web.controllers.workflow:735 in load_workflow
  'tooltip': module.get_tooltip( static_path=url_for( '/static' ) ),
Module galaxy.workflow.modules:262 in get_tooltip
  return self.tool.help.render( static_path=static_path )
AttributeError: _ElementInterface instance has no attribute 'render'
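The error suggests that `self.tool.help` ended up as a bare ElementTree element rather than a template object exposing `.render()`. As a rough illustration of a local workaround (a hypothetical helper, not Galaxy's actual fix), one could duck-type the help object and fall back to its text content when no `render` method exists:

```python
from xml.etree import ElementTree

def render_help(help_obj, **kwargs):
    """Render tool help whether it is a template or a bare XML element.

    Galaxy expects help to expose .render(); after some tool parses it can
    end up as a plain ElementTree element instead.  Hypothetical helper,
    shown only to illustrate the duck-typing guard.
    """
    if hasattr(help_obj, "render"):
        return help_obj.render(**kwargs)
    if isinstance(help_obj, ElementTree.Element):
        # Fall back to the element's raw text content.
        return "".join(help_obj.itertext())
    return str(help_obj)
```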

Any help would be much appreciated.

Thanks in advance,
Liisa


Re: [galaxy-dev] python egg cache exists error

2012-09-24 Thread Nate Coraor
For Test/Main, I have the user's ~/.bash_profile set $PYTHON_EGG_CACHE on a 
per-node basis.  This could also be done per-node and per-pty to ensure 
uniqueness per job.
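The idea Nate describes, a $PYTHON_EGG_CACHE path that is unique per node (and, if needed, per process), can be sketched as follows. He sets it in ~/.bash_profile; this shows the same idea from Python, with the base directory defaulting to the system temp dir as an assumption.

```python
import os
import socket
import tempfile

def unique_egg_cache(base=None):
    """Build and export a per-node, per-process PYTHON_EGG_CACHE path.

    Combining hostname and PID guarantees that concurrent jobs on the
    same node never share an egg-extraction directory.
    """
    base = base or tempfile.gettempdir()
    path = os.path.join(base, "python-eggs-%s-%d"
                        % (socket.gethostname(), os.getpid()))
    if not os.path.isdir(path):
        os.makedirs(path)
    os.environ["PYTHON_EGG_CACHE"] = path
    return path
```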

--nate

On Sep 18, 2012, at 11:24 AM, James Taylor wrote:

 Interesting. If I'm reading this correctly the problem is happening
 inside pkg_resources? (galaxy.eggs unzips eggs, but I think it does so
 on install [fetch_eggs] time not run time which would avoid this). If
 so this would seem to be a locking bug in pkg_resources. Dannon, we
 could put a guard around the imports in extract_dataset_part.py as an
 (overly aggressive and hacky) fix.
 
 -- jt
 
 
 On Tue, Sep 18, 2012 at 10:37 AM, Jorrit Boekel
 jorrit.boe...@scilifelab.se wrote:
 - which lead to unzipping .so libraries from python eggs into the nodes'
 /home/galaxy/.python-eggs
 - this runs into lib/pkg_resources.py and its _bypass_ensure_directory
 method that creates the temporary dir for the egg unzip
 - since there are 8 processes on the node, sometimes this method tries to
 mkdir a directory that was just made by the previous process after the
 isdir.
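The race Jorrit describes, an isdir() check followed by a mkdir that loses to a sibling process, is avoided by attempting the mkdir unconditionally and tolerating EEXIST. A minimal sketch of the safe pattern:

```python
import errno
import os

def ensure_directory(path):
    """mkdir -p that tolerates a concurrent process creating the same path.

    The isdir()-then-mkdir() pattern in pkg_resources races when several
    processes unzip the same egg at once; catching EEXIST (and re-checking
    that the path really is a directory) makes it safe.
    """
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```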


Re: [galaxy-dev] How to rotate Galaxy log file

2012-09-24 Thread Nate Coraor
On Sep 19, 2012, at 9:50 AM, Jennifer Jackson wrote:

 repost to galaxy-dev
 
 On 9/7/12 6:39 PM, Lukasz Lacinski wrote:
 Dear All,
 
 I use an init script that comes with Galaxy in the contrib/ subdirectory
 to start Galaxy. The log file
 
 --log-file /home/galaxy/galaxy.log
 
 specified in the script grows really quickly. How to logrotate the file?

Hi Lukasz,

I'd suggest using whatever log rotation utility is provided by your OS.  You'll 
need to restart the Galaxy process to begin writing to the new log once the old 
one has been rotated.

--nate

 
 Thanks,
 Lukasz
 ___
 The Galaxy User list should be used for the discussion of
 Galaxy analysis and other features on the public server
 at usegalaxy.org.  Please keep all replies on the list by
 using reply all in your mail client.  For discussion of
 local Galaxy instances and the Galaxy source code, please
 use the Galaxy Development list:
 
  http://lists.bx.psu.edu/listinfo/galaxy-dev
 
 To manage your subscriptions to this and other Galaxy lists,
 please use the interface at:
 
  http://lists.bx.psu.edu/
 
 -- 
 Jennifer Jackson
 http://galaxyproject.org


Re: [galaxy-dev] Automatic installation of third party dependancies

2012-09-24 Thread Greg Von Kuster
Hi Lance,


On Sep 21, 2012, at 6:04 PM, Lance Parsons wrote:

 OK, I was able to get a new version installed.  It seems there are two issues:
 
 1) New revisions with the same version invalidate previous revisions.  
 This means that Galaxy servers with the old, and now invalid, revisions are 
 not able to update the tool (nor install it again).

I'm not quite sure what you're stating here.  Do the following tool shed wiki 
pages clarify the behavior you are seeing?

http://wiki.g2.bx.psu.edu/ToolShedRepositoryFeatures#Pushing_changes_to_a_repository_using_hg_from_the_command_line
http://wiki.g2.bx.psu.edu/RepositoryRevisions#Installable_repository_changeset_revisions


 
 2) Pushes from Mercurial (even version 2.3.3) do not seem to trigger metadata 
 refreshes in the tool shed, however, uploads of tar.gz files do.

I am not able to reproduce this behavior.  In my environment, metadata is 
always automatically generated for new changesets I push to my local tool shed 
(or the test tool shed) from the command line.

What is the result of typing the following in the environment from which you 
are pushing changes to the tool shed?

$ hg --version

You should see something like the following, showing that you are running at 
least hg version 2.2.3.

gvk:/tmp/repos/convert_chars gvk$ hg --version
Mercurial Distributed SCM (version 2.2.3)
(see http://mercurial.selenic.com for more information)

Copyright (C) 2005-2012 Matt Mackall and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


 
 Hope this helps.  
 
 Lance
 
 Lance Parsons wrote:
 
 I've run into this issue again, and I'm having a hard time working around 
 it.  However, I have confirmed that at least some updates to a tool in the 
 tool shed will invalidate previously valid revisions and thus prevent 
 users from installing or updating the tool at all.
 
 For example, push version 0.1 of the tool and create a valid revision 
 1:xx.  Then install the tool in galaxy.  Make a small change (say to 
 tool_dependencies.xml) and push a new revision (but keep the tool version 
 the same), now at revision 2:xxx.  The tool shed will show 2:xx as 
 the only valid revision to install, but the galaxy system with revision 
 1:xx will be stuck, unable to get upgrades (Server Error described 
 previously).  
 
 I'm trying to work around this now with my htseq-count tool, but so far no 
 luck.  I've created a few spurious revisions in the attempt, and I think now 
 I may just try bumping the version (already did to no avail, toolshed still 
 thinks it's the same) and uploading a tar file.  That seems to more reliably 
 parse metadata.  Will let you know what, if anything, works.  Thanks.
 
 Lance
 
 Greg Von Kuster wrote:
 
 Hello Lance,
 
 I've just committed a fix for getting updates to installed tool shed 
 repositories in change set 7713:23107188eab8, which is currently available 
 only in the Galaxy central repository.  However, my fix will probably not 
 correct the issue you're describing, and I'm still not able to reproduce 
 this behavior.  See my inline comments...
 
 
 On Sep 13, 2012, at 4:41 PM, Lance Parsons wrote:
 
 Actually, I think that is exactly the issue.  I DO have 3:f7a5b54a8d4f 
 installed. I've run into a related issue before, but didn't fully 
 understand it.
 
 I believe what happened was:
 1) I pushed revision 3:f7a5b54a8d4f to the tool shed which contained the 
 first revision of version 0.2 of the htseq-count tool.
 2) I installed the htseq-count tool from the tool shed, getting revision 
 3:f7a5b54a8d4f
 3) I pushed an update to version 0.2 of the htseq-count tool. The only 
 changes were to tool-dependencies so I thought it would be safe to leave 
 the version number alone (perhaps this is problem?)
 
 
 You are correct in stating that the tool version number should not change 
 just because you've added a tool_dependencies.xml file.  This is definitely 
 not causing the behavior you're describing.
 
 
 4) I attempted to get updates and ran into the issue I described.
 
 I also ran into this (I believe it was with freebayes, but not sure) when 
 I removed (uninstalled) a particular revision of a tool. Then the tool was 
 updated. I went to install it and it said that I already had a previous 
 revision installed and should install that. However, I couldn't since the 
 tool shed won't allow installation of old revisions of the same version of 
 a tool.
 
 The following section of the tool shed wiki should provide the details 
 about why you are seeing this behavior.  Keep in mind that you will only 
 get certain updates to installed repositories from the tool shed.  This 
 behavior enables updates to installed tool versions.  To get a completely 
 new version of an installed tool (if one exists), you need to install a new 
 (different) changeset revision from the tool shed repository.
 
 

[galaxy-dev] When will the API allow setting of parameters (not inputs) from the API

2012-09-24 Thread Thon Deboer
Hi,

One of the biggest hurdles for the implementation in our institute is the 
inability of the Galaxy API to set parameters at run time.
You seem to be able to set only inputs, but not parameters...

Is there any ETA on when this will be available? Is this even a priority?

Thanks!

Regards,

Thon de Boer, Ph.D.
Bioinformatics Guru
+1-650-799-6839
thondeb...@me.com
LinkedIn Profile





Re: [galaxy-dev] dynamically send jobs to second cluster on high load

2012-09-24 Thread John Chilton
Hello Geert,

I don't believe any such functionality is available out of the box,
but I am confident clever use of dynamic job runners
(http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-June/010080.html)
could solve this problem.

One approach would be to move all of your job runners out of
galaxy:tool_runners into a new section, say
galaxy:tool_runners_local, and then create another set of runners for
your HPC resource (say galaxy:tool_runners_hpc).

Next set your default_cluster_job_runner to
dynamic:///python/default_runner and create a python function called
default_runner in lib/galaxy/jobs/rules/200_runners.py.

The outline of this file might be something like this:

from ConfigParser import ConfigParser

def default_runner(tool_id):
    runner = None
    if _local_queue_busy():
        # Dedicated cluster is loaded, so overflow to the HPC runners.
        runner = _get_runner("galaxy:tool_runners_hpc", tool_id)
    else:
        runner = _get_runner("galaxy:tool_runners_local", tool_id)
    if not runner:
        runner = "local://"  # Or whatever default behavior you want.
    return runner

def _local_queue_busy():
    # TODO: check local queue, would need to know more...
    return False

def _get_runner(runner_section, tool_id):
    universe_config_file = "universe_wsgi.ini"
    parser = ConfigParser()
    parser.read(universe_config_file)
    job_runner = None
    if parser.has_option(runner_section, tool_id):
        job_runner = parser.get(runner_section, tool_id)
    return job_runner

You could tweak the logic here to do stuff like only submit certain
kinds of jobs to the HPC resource or specify different default runners
for each location.

Hopefully this is helpful. If you want more help defining this file I
could fill in the details, if I knew more precisely what behavior you
wanted for each queue and what command line determines whether the
dedicated Galaxy resource is busy (or just what queue manager
you are using, if any).

Let me know if you go ahead and get this working; I am eager to hear
success stories.

-John


John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
Bitbucket: https://bitbucket.org/jmchilton
Github: https://github.com/jmchilton
Web: http://jmchilton.net

On Mon, Sep 24, 2012 at 3:55 AM, Geert Vandeweyer
geert.vandewey...@ua.ac.be wrote:
 Hi,

 The admin pages state that it is possible to specify multiple clusters in
 the universe file. Currently, we are investigating if we can couple the
 university HPC platform to galaxy, to handle usage peaks. It would be ideal
 if the job manager would check the load of the dedicated cluster (eg queue
 length) and send jobs to the second cluster when load is above a threshold.

 Does such an approach exists already, or will it become available in the
 near future? As far as I understand, it is now only possible to specify
 which jobs run on which cluster, without dynamic switching?

 Best regards,

 Geert

 --

 Geert Vandeweyer, Ph.D.
 Department of Medical Genetics
 University of Antwerp
 Prins Boudewijnlaan 43
 2650 Edegem
 Belgium
 Tel: +32 (0)3 275 97 56
 E-mail: geert.vandewe...@ua.ac.be
 http://ua.ac.be/cognitivegenetics
 http://www.linkedin.com/pub/geert-vandeweyer/26/457/726
