Re: [galaxy-dev] Unnamed histories proliferating, can't get to my data
OK, we've made some progress. When we try to do uploads (or, I think, switch histories from the User > Saved Histories page), we get an error from get_history in /www/galaxy.hms.harvard.edu/support/galaxy-dist/lib/galaxy/web/framework/__init__.py. The code that emits it is:

    # Perhaps a bot is running a tool without having logged in to get a history
    log.debug( "Error: this request returned None from get_history(): %s" % self.request.browser_url )

So either self.galaxy_session.current_history is failing to return anything, or get_history is being called with create=False at the wrong time. Does this help narrow down what might be happening? It's not the FTP upload; this issue can happen when uploading through the browser, too.

Thanks,
-Amir Karger

On 10/23/12 2:42 PM, Karger, Amir amir_kar...@hms.harvard.edu wrote:
> I'm using Galaxy from June, 2012. (Sorry if there's already a fix.) We've got it working in production. We've gotten whole pipelines to run. However, we occasionally get situations where we upload a file (using the FTP mechanism), which seems to be fine, but then I can't get to the data. I went to Saved Histories and selected Switch; it outlined the line in blue and wrote "current history" next to it. But the right pane still shows "Unnamed history" with no data in it. Then if I go back to Saved Histories, I get one or two new Unnamed histories, created within the last few minutes.

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
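P.S. For anyone trying to follow along, here is a simplified sketch of the logic as I read it. This is a paraphrase, not the real Galaxy class -- only the debug line and the attribute names come from the actual source; everything else is made up to show the shape of the code path:

```python
import logging
import types

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("galaxy.web.framework")


class TransactionSketch:
    """Paraphrase of the history lookup described above -- not the real
    Galaxy transaction class, just the shape of the logic."""

    def __init__(self, galaxy_session, browser_url):
        self.galaxy_session = galaxy_session
        self.browser_url = browser_url

    def new_history(self):
        # Stand-in for Galaxy's "create a history and associate it with
        # the current session" step.
        self.galaxy_session.current_history = object()
        return self.galaxy_session.current_history

    def get_history(self, create=False):
        history = self.galaxy_session.current_history
        if history is None and create:
            history = self.new_history()
        if history is None:
            # Perhaps a bot is running a tool without having logged in to get a history
            log.debug("Error: this request returned None from get_history(): %s",
                      self.browser_url)
        return history


# A session whose current_history is None reproduces the logged error path:
trans = TransactionSketch(types.SimpleNamespace(current_history=None),
                          "http://example.org/tool_runner")
print(trans.get_history() is None)             # True: the error path we see
print(trans.get_history(create=True) is None)  # False: create=True recovers
```

So any caller that reaches this code with create=False and an empty session history will hit that debug message, which matches what we see in our logs.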
[galaxy-dev] Resend: Unnamed histories proliferating, can't get to my data
Hi. Resending because I got no response. Can anybody suggest anything that might explain this, tell me how I can troubleshoot, or point me at where to look in the Python code? Has anybody seen anything like this? Our beta tester can't actually test anything. This occurs whether he does the FTP-style upload or uploads through the browser.

Thanks,
-Amir Karger

On 10/23/12 2:42 PM, Karger, Amir amir_kar...@hms.harvard.edu wrote:
> I'm using Galaxy from June, 2012. (Sorry if there's already a fix.) We've got it working in production. We've gotten whole pipelines to run. However, we occasionally get situations where we upload a file (using the FTP mechanism), which seems to be fine, but then I can't get to the data. I went to Saved Histories and selected Switch; it outlined the line in blue and wrote "current history" next to it. But the right pane still shows "Unnamed history" with no data in it. Then if I go back to Saved Histories, I get one or two new Unnamed histories, created within the last few minutes.
>
> I just tried to View the history, which worked (in the middle pane), and clicked "import and start using history". This seemed to work, but I got three panes inside the middle pane! When I go back (again) to Saved Histories, there are 3 histories: the imported one with 2 steps, plus two unnamed histories, all created 1 minute ago. We just asked a beta tester to play with things, and he uploaded two fastqs but had what sounds like a similar problem. Any thoughts on what's happening?
>
> Thanks,
> -Amir Karger
> Research Computing
> Harvard Medical School
[galaxy-dev] Track jobs in database should be True? Re: Shell script to start Galaxy in multi-server environment
On 8/16/12 4:18 PM, Karger, Amir amir_kar...@hms.harvard.edu wrote:
> On 8/8/12 4:06 PM, Nate Coraor n...@bx.psu.edu wrote:
>> If you aren't setting job_manager and job_handlers in your config, each server will consider itself the manager and handler. If not configured to run jobs, this may result in jobs failing to run. I'd suggest explicitly defining a manager and handlers.
>>
>> --nate
>
> Sigh. We have both job_manager and job_handlers set to the same server. It seems like our runner app may be getting into some kind of sleeping state. I was unable to upload a file, which had worked before. However, when I restarted the runner, it picked up the upload job and successfully uploaded it AND picked up the previously queued tab2fasta job, and I believe completed it successfully too.

Replying to myself. The reason the runner was in a sleep state is this logic in lib/galaxy/web/config.py:

    if ( len( self.job_handlers ) == 1 ) and ( self.job_handlers[0] == self.server_name ) and ( self.job_manager == self.server_name ):
        self.track_jobs_in_database = False

For our dev instance, we have a single server acting as both the job manager and the job handler, and we have two web servers also running on the dev box. So Galaxy apparently decides not to track jobs in the database. However, this means it never finds any jobs to run. When we explicitly set self.track_jobs_in_database to True in config.py, Galaxy correctly finds and runs jobs. I guess the webapps think that Galaxy *is* tracking jobs in the database, so they put jobs in there that never get pulled out? Or should it actually work when track_jobs_in_database is false, as long as the job manager and job handler (and webapps?) are on the same server? In that case, do we know why it didn't work? I'm happy to be running track_jobs_in_database=True, because our prod server is going to have separate machines doing web vs. job handling/managing.
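To make the mismatch concrete, here is the quoted condition pulled out as a standalone function. The server names ("runner0", "web0") are made up for illustration; only the condition itself comes from the quoted config.py:

```python
def track_jobs_in_database(server_name, job_manager, job_handlers):
    """Mirror of the config.py condition quoted above: in-memory job
    tracking (False) is chosen only when this very process is both the
    sole job handler and the job manager."""
    if (len(job_handlers) == 1) and (job_handlers[0] == server_name) \
            and (job_manager == server_name):
        return False
    return True


# Our dev layout: one runner process plus two web processes.
# The runner decides not to poll the database...
print(track_jobs_in_database("runner0", "runner0", ["runner0"]))  # False
# ...while each webapp, whose server_name doesn't match, decides jobs
# ARE tracked in the database and enqueues them there:
print(track_jobs_in_database("web0", "runner0", ["runner0"]))     # True
```

Which would explain exactly what we saw: the webapps write jobs into the database, and the runner never looks for them there.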
Thanks,
-Amir
Re: [galaxy-dev] Shell script to start Galaxy in multi-server environment
On 8/8/12 4:06 PM, Nate Coraor n...@bx.psu.edu wrote:
> On Aug 8, 2012, at 2:30 PM, Karger, Amir wrote:
>> Meanwhile, we're able to restart, and get happy log messages from the job runner and two web servers (two servers running on different ports of a Tomcat host). And I can do an upload, which runs locally. But when I try to do a blast, which is supposed to submit to the cluster (and ran just fine on our old install), it hangs and never starts. I would think the database is working OK, since it shows me new history items when I upload and such. The web Galaxy log shows that I went to the tool page, and then has a ton of loads of root/history_item_updates, but nothing else. The job handler Galaxy log has nothing since the PID messages from when the server started up most recently. A quick search of the archives didn't find anything obvious. (I don't have any obvious words to search for.) Any thoughts about where I should start looking to track this down?
>
> Hi Amir,
>
> If you aren't setting job_manager and job_handlers in your config, each server will consider itself the manager and handler. If not configured to run jobs, this may result in jobs failing to run. I'd suggest explicitly defining a manager and handlers.
>
> --nate

Sigh. We have both job_manager and job_handlers set to the same server. It seems like our runner app may be getting into some kind of sleeping state. I was unable to upload a file, which had worked before. However, when I restarted the runner, it picked up the upload job and successfully uploaded it AND picked up the previously queued tab2fasta job, and I believe completed it successfully too. (There's an error due to a missing filetype, which I guess makes stderr non-empty and makes Galaxy think it was unsuccessful. But I can confirm that the job was in fact run on our cluster.) Running paster.py ... --status claims that the process is still running. So what would make the runner go to sleep like that, and how do I stop it from happening?
Thanks,
-Amir
Re: [galaxy-dev] Shell script to start Galaxy in multi-server environment
From: Nate Coraor [mailto:n...@bx.psu.edu]
> On Aug 2, 2012, at 2:56 PM, Karger, Amir wrote:
>> We're upgrading to a late June Galaxy from a last-year Galaxy. We noticed that the docs say you no longer need 2 different .ini files. Great! Unfortunately, the multiprocess.sh in contrib/ still assumes you have multiple .ini files.
>
> multiprocess.sh is out of date, so I've removed it from galaxy-central. run.sh can start and stop all of your processes now, as described at:
>
> http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Web%20Application%20Scaling

Thanks. Of course, reading some other people's posts and the wiki, it looks like merging is not *required*, just recommended, which means our existing system of running the different scripts on different hosts should continue to work. We figure we can put off the merge for a bit.

Meanwhile, we're able to restart, and get happy log messages from the job runner and two web servers (two servers running on different ports of a Tomcat host). And I can do an upload, which runs locally. But when I try to do a blast, which is supposed to submit to the cluster (and ran just fine on our old install), it hangs and never starts. I would think the database is working OK, since it shows me new history items when I upload and such. The web Galaxy log shows that I went to the tool page, and then has a ton of loads of root/history_item_updates, but nothing else. The job handler Galaxy log has nothing since the PID messages from when the server started up most recently. A quick search of the archives didn't find anything obvious. (I don't have any obvious words to search for.) Any thoughts about where I should start looking to track this down?

Thanks,
-Amir
[galaxy-dev] Shell script to start Galaxy in multi-server environment
We're upgrading to a late June Galaxy from a last-year Galaxy. We noticed that the docs say you no longer need 2 different .ini files. Great! Unfortunately, the multiprocess.sh in contrib/ still assumes you have multiple .ini files.

So the question is: assuming we correctly set up the different web server, job manager, and job handler servers in universe_wsgi.ini, what command line should we be giving to run Galaxy on each type of server? The wiki Admin/Config pages for Performance and Scaling, Cluster, and that sort of thing had some info on editing the .ini, but I didn't see what my .sh should look like there. Pointers to websites, emails, or existing .sh files appreciated.

Thanks,
-Amir Karger
Senior Research Computing Consultant
Harvard Medical School Research Computing
amir_kar...@hms.harvard.edu
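P.S. In case it helps anyone searching the archives later: from what I can piece together from the wiki, the merged single-file layout looks roughly like the sketch below. The section names, ports, and handler name here are made up for illustration, and I haven't verified every option, so please check the Web Application Scaling wiki page before copying this:

```ini
; universe_wsgi.ini -- one merged file with multiple [server:*] sections.
; Server names (web0, web1, handler0) and ports are illustrative only.

[server:web0]
use = egg:Paste#http
port = 8080

[server:web1]
use = egg:Paste#http
port = 8081

[server:handler0]
use = egg:Paste#http
port = 8090

[app:main]
; Tell every process which named server manages jobs and which handle them,
; so the web processes don't each consider themselves the manager.
job_manager = handler0
job_handlers = handler0
```

Each process would then be started from the same .ini, selecting its section by server name, which I gather is what the updated run.sh does for you.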
Re: [galaxy-dev] How to upload local files in Galaxy
From: Fields, Christopher J [mailto:cjfie...@illinois.edu]
> [SNIP] if you follow the guidelines for FTP import, any method used (not just FTP, but scp, sftp, grid-ftp, etc.) to get data into the 'FTP' import folder works, as long as permissions on the data are set so the galaxy user on the cluster end can read the data.

From our investigations, the galaxy user needs to *own* the file, not just be able to read it. The reason is that Galaxy does a shutil.move, which calls copy2 to give the new file the same permissions as the old, which calls copystat, which calls os.chmod, which requires galaxy to be the owner. This seems to be true at least on our old version of Galaxy, though perhaps it's been fixed in the last few months.

> We had our local cluster admins set up a link to the user's galaxy import folder in their home directory, so users can basically do this:
>
> scp mydata.fastq.gz usern...@biocluster.igb.illinois.edu:galaxy-upload

We wanted to do just that, but we kept getting permission errors. In the end, we needed to add a cron job that runs every minute as root and does a chown galaxy on any files appearing in the upload directories. Please let me know if I'm missing something. I'd be happy to turn off the cron.

Thanks,
-Amir Karger
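P.S. The call chain I described can be sketched in a few lines. This is my reading of the standard library, not Galaxy's actual upload code, and the paths are made up; the demo uses files we own, so the chmod inside copystat succeeds here:

```python
import os
import shutil
import stat
import tempfile


def move_like_galaxy_upload(src, dst):
    """Mimic the cross-filesystem branch of shutil.move: it falls back to
    copy2 (copyfile + copystat), and copystat calls os.chmod on dst.
    os.chmod succeeds only for the file's owner (or root), which is why a
    galaxy user that can merely *read* an uploaded file still hits errors."""
    shutil.copy2(src, dst)  # preserves mode bits and timestamps via copystat
    os.unlink(src)


# Demo: the source's mode bits survive the move.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "upload.fastq")
with open(src, "w") as f:
    f.write("@read1\nACGT\n+\nIIII\n")
os.chmod(src, 0o640)

dst = os.path.join(workdir, "moved.fastq")
move_like_galaxy_upload(src, dst)
print(oct(stat.S_IMODE(os.stat(dst).st_mode)))  # 0o640
print(os.path.exists(src))                      # False
```

If the process running this merely had read access to src (owned by another user), the copy itself would work but the chmod step would fail with EPERM -- which matches the permission errors we saw until the cron chown was in place.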