Re: [galaxy-dev] python job manager
On Wed, Mar 16, 2011 at 10:46:54AM -0400, James Lindsay wrote:
> Hi,
> I run galaxy on a large SMP university machine. The machine is used by some
> folks for command line work, and others via galaxy. I was wondering if
> anyone had integrated into galaxy a job manager that monitors CPU load
> averages, and only runs new jobs when cpu resources are available?

James, you could achieve this using the galaxy cluster configurations. Even with a single (SMP) machine in use there's value in creating (for example) a torque queue on that machine and using Galaxy's torque support to have it submit jobs to that queue. Your non-Galaxy command line users can then also use the 'qsub' command to launch their jobs, and Torque will be able to balance resources across them according to your preferences.

Having your galaxy jobs use the pbs (torque) job runner has the additional benefit of being able to restart galaxy without the jobs losing their parent process and dying.

https://bitbucket.org/galaxy/galaxy-central/wiki/Config/ProductionServer
https://bitbucket.org/galaxy/galaxy-central/wiki/Config/Cluster

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
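For readers who only want the load-average check James asked about, rather than a full Torque queue, the idea can be sketched in a few lines of Python. This is a minimal illustration, not anything Galaxy ships: the function names and the 0.8 load-per-CPU threshold are assumptions made up for the example, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def has_spare_capacity(load_1min, cpu_count, max_load_per_cpu=0.8):
    """Return True when the 1-minute load average leaves headroom.

    The threshold (0.8 runnable processes per CPU) is an arbitrary
    value chosen for this sketch; tune it for your machine.
    """
    return load_1min < cpu_count * max_load_per_cpu


def machine_has_spare_capacity(max_load_per_cpu=0.8):
    """Apply the check to the machine this script runs on (Unix only)."""
    load_1min, _load_5min, _load_15min = os.getloadavg()
    cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return has_spare_capacity(load_1min, cpu_count, max_load_per_cpu)
```

A scheduler such as Torque does considerably more than this (queueing, fair share, surviving a Galaxy restart), which is why the reply above recommends it over a hand-rolled check.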
[galaxy-dev] NoneType dereference on the jobs view
Intermittently, and always during periods of high load, we'll get a 500 Server error from the Admin 'Manage Jobs' list. In the logs the stacktrace looks like:

http://paste.pocoo.org/show/351374/

Attached is the patch JJ provided to work around jobs without histories, but I thought I'd bring it up here too in case either others are seeing it or someone knows a root cause.

Thanks!

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu

# HG changeset patch
# User JJ
# Date 1299791693 21600
# Node ID c9e807155b6d4c43e609e7bbae42060f8dc32fa0
# Parent c8c7eb5ec4200201c66a833abc3aa2c03e8e
Check for NoneType history.

Tried to look at the jobs listing for galaxy and got a server error:

Error - : 'NoneType' object has no attribute 'user'
URL: https://galaxy.msi.umn.edu/admin/jobs
...
File '/website/galaxy.msi.umn.edu/PRODUCTION/database/compiled_templates/admin/jobs.mako.py', line 84 in render_body
  if job.history.user:
AttributeError: 'NoneType' object has no attribute 'user'

Evidently, there is a job without an associated history. So, I added a check for job.history:

diff -r c8c7eb5ec420 -r c9e807155b6d templates/admin/jobs.mako
--- a/templates/admin/jobs.mako Fri Feb 18 14:53:35 2011 -0500
+++ b/templates/admin/jobs.mako Thu Mar 10 15:14:53 2011 -0600
@@ -47,7 +47,7 @@
 %endif
 ${job.id}
-%if job.history.user:
+%if job.history and job.history.user:
 ${job.history.user.email}
 %else:
 anonymous
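The same guard can be expressed outside the template as a small helper. The sketch below is only an illustration of the None check in JJ's patch: `job_owner_email` is a hypothetical function name, and the `SimpleNamespace` objects are stand-ins for Galaxy's real model classes, which are not reproduced here.

```python
from types import SimpleNamespace


def job_owner_email(job, default="anonymous"):
    """Return the job owner's email, or `default` when any link is missing.

    Mirrors the patched template logic: a job may have no history,
    and a history may have no user.
    """
    history = getattr(job, "history", None)
    user = getattr(history, "user", None)
    email = getattr(user, "email", None)
    return email if email else default


# Stand-in objects for illustration only (not Galaxy model classes).
orphan_job = SimpleNamespace(history=None)
anonymous_job = SimpleNamespace(history=SimpleNamespace(user=None))
owned_job = SimpleNamespace(
    history=SimpleNamespace(user=SimpleNamespace(email="user@example.org"))
)
```

This works around the symptom in the same way the patch does; it says nothing about why a job ends up without a history in the first place.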
[galaxy-dev] Galaxy job runner queue management questions
As use of our Galaxy installation is picking up, we're getting a lot of requests for greater fairness and transparency in the Galaxy job runner area. As I understand things, the primary tool Galaxy gives us to affect processing order and wait times with our torque-based setup is the ability to map specific tools to different queues or to keep them on the local runner.

On one end of the spectrum I could see a simple division: small/fast/light jobs on local and big/heavy/slow jobs on a single cluster queue. On the other extreme one could set up a queue per tool and use sophisticated queue management on the torque side to balance capacity across tools, users, expected processing time, etc.

How are other sites handling this?

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
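To make the "simple division" end of that spectrum concrete, here is a toy Python model of a tool-to-runner mapping with a default. Everything in it is an assumption for illustration: the dictionary, the `runner_for` function, the tool ids other than `Cut1`, and the queue name `long` are invented, and the runner strings merely imitate the style of runner URLs. It is not Galaxy's configuration format or API.

```python
# Hypothetical default: anything not listed goes to the cluster.
DEFAULT_RUNNER = "pbs:///"

# Hypothetical per-tool overrides: light tools stay local,
# one heavy tool gets its own (made-up) queue.
TOOL_RUNNERS = {
    "upload1": "local:///",
    "Cut1": "local:///",
    "example_aligner": "pbs:///long",
}


def runner_for(tool_id, tool_runners=TOOL_RUNNERS, default=DEFAULT_RUNNER):
    """Pick a runner for a tool, falling back to the default queue."""
    return tool_runners.get(tool_id, default)
```

The per-tool-queue extreme is the same table with one entry per tool; the balancing across users and expected run time would then live entirely in the scheduler's own policy settings.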
Re: [galaxy-dev] divide fq into 2
On Tue, Mar 08, 2011 at 11:15:44PM -0500, Musa A. Hassan wrote:
> Yes I can't get the file into galaxy at all. Am uploading from a file
> path. the file is 35mb.

When you say "uploading from a filepath" are you using the administrator-only functionality explained here:

https://bitbucket.org/galaxy/galaxy-central/wiki/DataLibraries/UploadingFiles

_not_ the 'Get Data' -> 'Upload File' selection from the 'Tools' menu?

People work with files _much_ larger than that in Galaxy all the time.

> Musa
>
> From: Ry4an Brase [ry4an+gal...@msi.umn.edu]
> Sent: Tuesday, March 08, 2011 11:06 PM
> To: Musa A. Hassan
> Subject: Re: [galaxy-dev] divide fq into 2
>
> On Tue, Mar 08, 2011 at 10:44:51PM -0500, Musa A. Hassan wrote:
> > Hi Ry4an,
> >
> > I'd like to do this in galaxy, but the problem is it wont load into
> > galaxy. As for using split, the file generated from this returns a
> > length mismatch in say Tophat, maybe in the process of splitting the
> > file some changes happen to the format.
>
> So you can't get the file into galaxy at all? Are you trying to upload
> it through your browser (suitable only for non-huge files) or are you
> using 'upload from file path'? How big (bytes) is the file.
>
> Also, you should try to keep your replies on the mailing list so that
> other searching in the future find the same help.
>
> --
> Ry4an Brase 612-626-6575
> Software Developer Application Development
> University of Minnesota Supercomputing Institute
> http://www.msi.umn.edu

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
Re: [galaxy-dev] Pass files as url
On Tue, Mar 08, 2011 at 03:46:13PM -0500, Bonci, Timothy Daniel wrote:
> I have an interactive applet that visualizes data produced by Galaxy.
> The applet, being run client side, does not have access to the file
> system of the server, so passing history files by reference (path)
> won't work. All the files are available directly to the browser
> through a url, but I can't figure out a way to get that url.
> Alternatively, If anyone could help me find the history.htm file
> internally I could have the program that preps the applet parse that
> for the urls.

Tom, there should be an easier way for you to get the URLs. If you configure your applet as an ExternalDisplayApplication:

https://bitbucket.org/galaxy/galaxy-central/wiki/ExternalDisplayApplications/Tutorial

you can define the tool using XML and have a public URL to the data file passed directly to the URL that invokes your display/applet.

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
Re: [galaxy-dev] divide fq into 2
On Tue, Mar 08, 2011 at 03:43:20PM -0500, Musa A. Hassan wrote:
> Hi All,
> Is there anyone out there who know how I can divide an fq file containing
> illumina short reads randomly into 2 small files contaning approximately
> equal number of reads? I have a huge fq from the illumina high-seq platform,
> unfortunately, this file is huge and is causing all sorts of problems and I'd
> like to divide into to equal sizes(based on number of reads).

I'm assuming you mean "in galaxy", right? If so, check out the entries in 'Text Manipulation'. Using 'select first' and 'select last' you can turn one dataset into two datasets each half the size.

If instead you mean on the Unix command line, use the tool 'split' or 'head' and 'tail'.

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
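One caveat with line-based tools is that a cut must fall on a record boundary. The sketch below assumes the common FASTQ layout of exactly four lines per read (header, sequence, '+', qualities) and deals whole records alternately into two outputs. The function names are invented for this example, wrapped (multi-line) FASTQ is not handled, and alternating assignment is a simple stand-in for the "random" split asked about, not a true random sample.

```python
import io
import itertools


def fastq_records(lines):
    """Yield lists of 4 lines, one list per FASTQ record."""
    iterator = iter(lines)
    while True:
        record = list(itertools.islice(iterator, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("input ends in the middle of a FASTQ record")
        yield record


def split_fastq(src, out_a, out_b):
    """Write records alternately to out_a and out_b; return read counts."""
    outputs = (out_a, out_b)
    counts = [0, 0]
    for index, record in enumerate(fastq_records(src)):
        which = index % 2
        outputs[which].writelines(record)
        counts[which] += 1
    return tuple(counts)


# Tiny in-memory demonstration with three made-up reads.
demo_src = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nTTTT\n+\nIIII\n"
    "@r3\nGGGG\n+\nIIII\n"
)
demo_a, demo_b = io.StringIO(), io.StringIO()
demo_counts = split_fastq(demo_src, demo_a, demo_b)
```

With real files, the three arguments would be file objects opened with `open(path)` and `open(path, "w")`; because the input is streamed record by record, the whole file never has to fit in memory.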
Re: [galaxy-dev] upload large data file
On Fri, Mar 04, 2011 at 04:28:20PM +, Yanji Xu wrote:
> Dear Sir/Madam,
>
> I installed galaxy in my local server, then I tried to upload a 4.7 Gb
> fastq file into galaxy, but failed. Below is the error message.
>
> OverflowError: signed integer is greater than maximum
>
> How could I upload large data files into galaxy and process the data?

Use either the Upload from filepath mechanism available for data libraries ( https://bitbucket.org/galaxy/galaxy-central/wiki/DataLibraries/UploadingFiles ), which has you copy the file to the server in advance and then import it, or set up the Upload via FTP functionality ( https://bitbucket.org/galaxy/galaxy-central/wiki/UploadViaFTP ).

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
[galaxy-dev] Amazon cloud formation for galaxy
Today's release of Amazon CloudFormation ought to make the Galaxy Cloud stuff a bit easier:

http://aws.typepad.com/aws/2011/02/cloudformation-create-your-aws-stack-from-a-recipe.html

In theory, with a description file like this:

https://s3.amazonaws.com/cloudformation-templates/CloudFormationSample_WordPress.template

one can define an entire cluster to bring up from custom AMIs with a single click. Exciting.

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
Re: [galaxy-dev] Configuring remote galaxy job runners
On Wed, Feb 23, 2011 at 10:58:18PM -0500, Nate Coraor wrote:
> It looks like your datatypes_conf.xml is out of date. Have a look at
> the differences from datatypes_conf.xml.sample.

Ooof, I've been merging in regularly, but in a strictly additive fashion. I think my disconnect came in not understanding why those warnings would abort for a remote runner but not for the local runner, which I now realize is because the local runner isn't re-initializing the entire galaxy stack and re-parsing the config file, so on local I see those warnings at startup time where they're not blocking a job.

> > Is it possible it's just the output on STDERR causing the job to fail,
> > and if so how do I shut that up when I'm running through qsub (so
> > redirect to /dev/null isn't quite right)?
>
> Yes, anything output to STDERR will be considered a failure. There is a
> ticket in Bitbucket for this (actually, you commented on it 8 months ago ;)
>
> https://bitbucket.org/galaxy/galaxy-central/issue/325/allow-tool-authors-to-decide-whether-to

Heh, not just commented but had the call to do a 'take' on it, though the good folks who have done more with it since have taken it back.

Thanks!

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
[galaxy-dev] Configuring remote galaxy job runners
I'm working on getting more of our jobs offloaded to other machines, and I'm getting job failures I'm not able to debug. When running a simple tool like 'cut', submitted over qsub, I'm getting STDERR output like this:

WARNING:galaxy.datatypes.registry:Error loading datatype "binseq.zip", problem: 'module' object has no attribute 'Binseq'
WARNING:galaxy.datatypes.registry:Error loading datatype "fastqc", problem: 'module' object has no attribute 'fastqc'
WARNING:galaxy.datatypes.registry:Error loading datatype "ssaha2_index", problem: 'module' object has no attribute 'SSAHA2Index'

and nothing on STDOUT.

I'm doing the "Unified Method" as described here:

https://bitbucket.org/galaxy/galaxy-central/wiki/Config/Cluster

with paths to datafiles and executables the same on the web runner and torque worker systems. I can successfully qsub trivial jobs ("ls") from the web runner machine and see them executed remotely.

The web runner's galaxy log doesn't show anything out of the norm:

galaxy.jobs INFO 2011-02-23 16:23:55,400 JobWrapper prepare 4019 Cut1 Ry4an perl /website/galaxy.msi.umn.edu/PRODUCTION/tools/filters/cutWrapper.pl /galaxy/PRODUCTION/database/files/019/dataset_19366.dat "c1,c2" T /galaxy/PRODUCTION/database/files/019/dataset_19823.dat
galaxy.jobs INFO 2011-02-23 16:24:04,559 JobWrapper state 4019 Cut1 running Ry4an
128.101.189.29 - - [23/Feb/2011:16:24:06 -0500] "POST /root/history_item_updates HTTP/1.1" 200 - "https://galaxy.msi.umn.edu/history" "Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.84 Safari/534.13"
galaxy.jobs INFO 2011-02-23 16:24:07,771 JobWrapper finish 4019 Cut1 error Ry4an
galaxy.jobs INFO 2011-02-23 16:24:07,880 JobWrapper done 4019 Cut1 error Ry4an

I didn't see it in the Cluster config but I added the /lib to the $PYTHONPATH just in case, but no luck.
Is it possible it's just the output on STDERR causing the job to fail, and if so how do I shut that up when I'm running through qsub (so redirect to /dev/null isn't quite right)?

Thanks,

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
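For tools that are merely chatty on STDERR, a common workaround is a wrapper that captures the tool's STDERR and only passes it through when the exit code is non-zero. The Python 3 sketch below shows that idea; `run_quietly` is a name invented for this example and is not part of Galaxy. Note that it would probably not have silenced the warnings quoted above, since those appear to come from Galaxy's own datatype registry on the worker rather than from the tool.

```python
import subprocess
import sys


def run_quietly(argv):
    """Run a command and decide where its STDERR text should go.

    Returns (exit_code, stdout_text, stderr_text). On success any
    STDERR chatter is folded into the stdout text so that a framework
    treating "anything on STDERR" as failure sees nothing there.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    if proc.returncode == 0:
        return 0, proc.stdout + proc.stderr, ""
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Usage: python wrapper.py <command> [args...]
    code, out, err = run_quietly(sys.argv[1:])
    sys.stdout.write(out)
    sys.stderr.write(err)
    sys.exit(code)
```

The trade-off is that genuine warnings end up on STDOUT when the tool succeeds, so this only makes sense for tools whose exit codes can be trusted.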
Re: [galaxy-dev] histogram tool not working
On Tue, Feb 01, 2011 at 08:51:25PM +, Peter wrote:
> On Tue, Feb 1, 2011 at 8:06 PM, David Hoover wrote:
> > I just updated to the most recent version of Galaxy (hg pull -u), and now
> > the error is different:
> > An error occurred running this job: Error in hist.default(list(8, 6, 14, 8,
> > 10, 3, 8, 6, 3, 12, 12, 8, 8, :
> > 'x' must be numeric
> > What gives?
> > David
>
> Hi,
>
> R gives the error message 'x' must be numeric, so relevant questions
> are what version of R do you have, what version or rpy, and what
> version of Python (since Galaxy's tools tends to invoke R from
> Python via the ryp library - certainly the histogram tool does).

We had the same problem when we first installed Galaxy back in March:

http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-March/002199.html

Looking through both our ticketing system and our local hg commits I'm not finding what JJ did to fix it, so I'm cc-ing him on this. JJ, do you recall?

(Lately similar problems, which all seem to be rpy incorrectly inferring the type of its arguments, have caused us to rewrite many of the rpy-using tools to use rpy2, which the rpy developers say does a lot less "guessing", but again JJ's driving the development on that and I don't know where it's at.)

And everyone says "what gives?". :)

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
___
galaxy-dev mailing list
galaxy-dev@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-dev
Re: [galaxy-dev] Changeset discrepancies
On Mon, Jan 24, 2011 at 12:02:00PM -0500, lenta...@jimmy.harvard.edu wrote:
> Hi Galaxy Team,
>
> I noticed that the main galaxy site is up to changeset 4919, but the
> Mecurial repository (http://www.bx.psu.edu/hg/galaxy) is only up to
> changeset 4640. Why is there a discrepancy? Did the repository move?

The http://www.bx.psu.edu/hg/galaxy is an alias for https://bitbucket.org/galaxy/galaxy-dist which is the 'release' repository. The work-in-progress repository is https://bitbucket.org/galaxy/galaxy-central which has 4920 changesets. The galaxy site itself is usually somewhere between the two.

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
[galaxy-dev] bowtie map to BED conversions
I've got a user request for a converter from bowtie's map output to BED format, and looking at the provided script it's mostly just an application of cut(1) and sort(1). Is this something Galaxy already does through some mechanism we're not finding, or is this 3 line conversion script something I should be adding and submitting back?

Thanks,

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
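The script mentioned in the post is not shown, so the following is only a guess at what such a conversion involves, written as a Python sketch. It assumes bowtie's default tab-separated output (read name, strand, reference name, 0-based leftmost offset, read sequence, qualities, and further columns) and emits six-column BED with a placeholder score of 0. The function name is made up for the example, and the sort(1) step from the post is left out.

```python
def bowtie_to_bed(line):
    """Convert one line of bowtie's default output to a BED6 line.

    Assumed input columns: name, strand, reference, 0-based offset,
    sequence, then anything else (ignored). BED uses a 0-based start
    and an exclusive end, so end = offset + read length.
    """
    fields = line.rstrip("\n").split("\t")
    name, strand, chrom, offset, seq = fields[:5]
    start = int(offset)
    end = start + len(seq)
    return "\t".join([chrom, str(start), str(end), name, "0", strand])
```

Applied to a whole file, this would be a loop over the input lines followed by a sort on chromosome and start position.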
Re: [galaxy-dev] genetrack behind https
On Mon, Jan 03, 2011 at 09:49:59PM -0500, Daniel Blankenberg wrote:
> Hi Ry4an,
>
> You are correct that Galaxy's GeneTrack integration requires running a
> local instance of GeneTrack. In the case of GeneTrack, files are
> accessed through a shared file system. I've added a note to clarify
> this in the wiki.

Thanks for the clarification. I should've figured it out from the shared file access, but the UCSC viewers went in so easily I got overly optimistic. ;)

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
[galaxy-dev] genetrack behind https
I'm trying to get genetrack working for bed data, and when clicking on the 'Genetrack' link for a bed format dataset I get a 500 Internal Server Error django exception from http://genetrack.g2.bx.psu.edu saying 'Unable to validate key!'.

Example on our staging server:

http://genetrack.g2.bx.psu.edu/galaxy?filename=2f70726f6a6563742f67616c6178792d646174612f66696c65732f3030322f646174617365745f323639322e646174&hashkey=6005bb6978f963d1df79a20a92a3c2f144dbe1ff&input=458&GALAXY_URL=http://dbw-galaxy.msi.umn.edu/tool_runner%3Ftool_id%3Dpredict2genetrack

The GALAXY_URL I'm sending it decodes to:

http://dbw-galaxy.msi.umn.edu/tool_runner?tool_id=predict2genetrack

which redirects (302) to:

https://dbw-galaxy.msi.umn.edu/tool_runner?tool_id=predict2genetrack

However, I don't see a request for either in the Apache log.

Can the genetrack.g2.bx.psu.edu server be used by other galaxy installations, as the UCSC visualizer can, in which case I'm running afoul of my redirect and/or https setup? Or should I have figured out that I need my own genetrack server from https://bitbucket.org/galaxy/galaxy-central/wiki/ExternalDisplayApplications/Tutorial ?

Thanks,

--
Ry4an Brase 612-626-6575
Software Developer Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
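The link above can be taken apart with the Python 3 standard library, which also shows what the filename parameter carries. The sketch below assumes, from the look of the value, that the filename is a hex-encoded path; the function name is invented for the example, and the hashkey is left alone because how it is computed is not stated in the post.

```python
from urllib.parse import parse_qs, urlsplit

# The staging-server link quoted in the message above.
GENETRACK_LINK = (
    "http://genetrack.g2.bx.psu.edu/galaxy"
    "?filename=2f70726f6a6563742f67616c6178792d646174612f66696c65732f"
    "3030322f646174617365745f323639322e646174"
    "&hashkey=6005bb6978f963d1df79a20a92a3c2f144dbe1ff"
    "&input=458"
    "&GALAXY_URL=http://dbw-galaxy.msi.umn.edu/tool_runner"
    "%3Ftool_id%3Dpredict2genetrack"
)


def decode_genetrack_link(url):
    """Return the decoded filename and GALAXY_URL from a link like the above."""
    params = parse_qs(urlsplit(url).query)  # parse_qs percent-decodes values
    return {
        # Assumption: the filename value is the path's bytes written as hex.
        "filename": bytes.fromhex(params["filename"][0]).decode("utf-8"),
        "galaxy_url": params["GALAXY_URL"][0],
    }


decoded = decode_genetrack_link(GENETRACK_LINK)
```

The decoded filename is a local filesystem path on the Galaxy server, which fits the explanation in the reply to this thread that GeneTrack reads files through a shared file system.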