Re: [galaxy-dev] Unnamed histories proliferating, can't get to my data

2012-11-02 Thread Karger, Amir
OK, we've made some progress. When we try to do uploads (or, I think,
switch histories from the User-Saved Histories page), we get an error
from get_history in
/www/galaxy.hms.harvard.edu/support/galaxy-dist/lib/galaxy/web/framework/__init__.py


The code that produces it is:

    # Perhaps a bot is running a tool without having logged in to get a history
    log.debug( "Error: this request returned None from get_history(): %s" % self.request.browser_url )


So either self.galaxy_session.current_history is failing to return
anything, or get_history is being called with create=False at the wrong
time.
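
For context, the surrounding logic looks roughly like this (a sketch
reconstructed from the snippet above, not verbatim from our tree; the
new_history() helper name is my guess):

    # Sketch of get_history() in lib/galaxy/web/framework/__init__.py
    def get_history( self, create=False ):
        """Return the session's current history, or None (after logging)."""
        history = self.galaxy_session.current_history
        if not history and create:
            history = self.new_history()  # assumed helper that creates and attaches one
        if not history:
            # Perhaps a bot is running a tool without having logged in to get a history
            log.debug( "Error: this request returned None from get_history(): %s" % self.request.browser_url )
        return history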

Does this help narrow down what might be happening? It's not the FTP
upload; this issue can happen with uploading through the browser, too.

Thanks,

-Amir Karger

On 10/23/12 2:42 PM, Karger, Amir amir_kar...@hms.harvard.edu wrote:

I'm using Galaxy from June, 2012. (Sorry if there's already a fix.)

We've got it working in production. We've gotten whole pipelines to
run.
However, we occasionally get situations where we upload a file (using the
FTP mechanism), which seems to be fine, but then I can't get to the data.
I went to Saved Histories and selected Switch, and it outlined the line
in blue and wrote "current history" next to it. But the right pane still
shows an "Unnamed history" with no data in it. Then if I go back to Saved
Histories, I get one or two new Unnamed histories, created within the
last few minutes.






[galaxy-dev] Resend: Unnamed histories proliferating, can't get to my data

2012-10-31 Thread Karger, Amir
Hi. Resending because I got no response. Can anybody suggest anything that
might explain this, or tell me how to troubleshoot it? Where should I look
in the Python code? Has anybody seen anything like this? Our beta
tester can't actually test anything. This occurs whether he does the
FTP-style upload or uploads through the browser.

Thanks,

-Amir Karger

On 10/23/12 2:42 PM, Karger, Amir amir_kar...@hms.harvard.edu wrote:

I'm using Galaxy from June, 2012. (Sorry if there's already a fix.)

We've got it working in production. We've gotten whole pipelines to run.
However, we occasionally get situations where we upload a file (using the
FTP mechanism), which seems to be fine, but then I can't get to the data.
I went to Saved Histories and selected Switch, and it outlined the line
in blue and wrote "current history" next to it. But the right pane still
shows an "Unnamed history" with no data in it. Then if I go back to Saved
Histories, I get one or two new Unnamed histories, created within the last
few minutes.

I just tried to View the history, which worked (in the middle pane), and
clicked "import and start using history". This seemed to work, but I got
three panes inside the middle pane! When I go back (again) to Saved
Histories, there are 3 histories: the imported one with 2 steps, plus two
unnamed histories, all created < 1 minute ago.

We just asked a beta tester to play with things, and he uploaded two
fastqs, but had what sounds like a similar problem.

Any thoughts on what's happening?

Thanks,

-Amir Karger
Research Computing
Harvard Medical School





[galaxy-dev] Track jobs in database should be True? Re: Shell script to start Galaxy in multi-server environment

2012-08-17 Thread Karger, Amir
On 8/16/12 4:18 PM, Karger, Amir amir_kar...@hms.harvard.edu wrote:

On 8/8/12 4:06 PM, Nate Coraor n...@bx.psu.edu wrote:

If you aren't setting job_manager and job_handlers in your config, each
server will consider itself the manager and handler.  If not configured
to run jobs, this may result in jobs failing to run.  I'd suggest
explicitly defining a manager and handlers.

--nate

Sigh. We have both job_manager and job_handlers set to the same server.

It seems like our runner app may be getting into some kind of sleeping
state. I was unable to upload a file, which had worked before. However,
when I restarted the runner, it picked up the upload job and successfully
uploaded it AND picked up the previously queued tab2fasta job, and I
believe completed it successfully too.

Replying to myself.

The reason the runner was in a sleep state is that the logic in
lib/galaxy/web/config.py says:

    if ( len( self.job_handlers ) == 1 ) and ( self.job_handlers[0] == self.server_name ) and ( self.job_manager == self.server_name ):
        self.track_jobs_in_database = False


For our dev instance, we have a single server acting as the job manager
and the job handler, and we have two web servers also running on the dev
box. So Galaxy apparently decides not to track the jobs in the database.
However, this means it never finds any jobs to run. When we explicitly set
self.track_jobs_in_database to be true in config.py, Galaxy correctly
finds and runs jobs.
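
Concretely, our local edit amounts to this (our own hack, not an
upstream change; placement within config.py may differ by revision):

    # Local workaround: keep tracking jobs in the database even when this
    # process is the lone manager/handler, so jobs enqueued by the
    # separate web processes still get picked up.
    self.track_jobs_in_database = True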

I guess the webapps think that Galaxy *is* tracking jobs in the database,
so they put jobs in there that never get pulled out? Or should it actually
work when track_jobs_in_database is false, as long as the job manager and
job handler (and webapps?) are on the same server? In that case, do we know
why it didn't work? I'm happy to be running track_jobs_in_database=True,
because our prod server is going to have separate machines doing web vs.
job handling/managing.

Thanks,

-Amir




Re: [galaxy-dev] Shell script to start Galaxy in multi-server environment

2012-08-16 Thread Karger, Amir


On 8/8/12 4:06 PM, Nate Coraor n...@bx.psu.edu wrote:

On Aug 8, 2012, at 2:30 PM, Karger, Amir wrote:


 
Meanwhile, we're able to restart, and get happy log messages from the
jobrunner and two web servers (two servers running on different ports
of a Tomcat host). And I can do an upload, which runs locally. But when
I try to do a blast, which is supposed to submit to the cluster (and ran
just fine on our old install), it hangs and never starts. I would think
the database is working OK, since it shows me new history items when I
upload and stuff. The web Galaxy log shows that I went to the tool page,
and then has a ton of loads to root/history_item_updates, but nothing
else. The job handler Galaxy log has nothing since the PID messages when
the server started up most recently.

A quick search of the archives didn't find anything obvious. (I don't
have any obvious words to search for.) Any thoughts about where I should
start looking to track this down?

Hi Amir,

If you aren't setting job_manager and job_handlers in your config, each
server will consider itself the manager and handler.  If not configured
to run jobs, this may result in jobs failing to run.  I'd suggest
explicitly defining a manager and handlers.

--nate

Sigh. We have both job_manager and job_handlers set to the same server.

It seems like our runner app may be getting into some kind of sleeping
state. I was unable to upload a file, which had worked before. However,
when I restarted the runner, it picked up the upload job and successfully
uploaded it AND picked up the previously queued tab2fasta job, and I
believe completed it successfully too. (There's an error due to a missing
filetype, which I guess makes stderr non-empty and makes Galaxy think it
was unsuccessful. But I can confirm that the job was in fact run on our
cluster.) Running paster.py ... --status claims that the process is still
running. So what would make the runner go to sleep like that, and how do
I stop it from happening?

Thanks,

-Amir




Re: [galaxy-dev] Shell script to start Galaxy in multi-server environment

2012-08-08 Thread Karger, Amir
 From: Nate Coraor [mailto:n...@bx.psu.edu]
 On Aug 2, 2012, at 2:56 PM, Karger, Amir wrote:

  We're upgrading to a late June Galaxy from a last-year Galaxy. We noticed
  that the docs say you no longer need 2 different .ini files. Great!
  Unfortunately, the multiprocess.sh in contrib/ still assumes you have
  multiple .ini files.

 multiprocess.sh is out of date, so I've removed it from galaxy-central.
 run.sh can start and stop all of your processes now, as described at:

 http://wiki.g2.bx.psu.edu/Admin/Config/Performance/Web%20Application%20Scaling

Thanks. Of course, reading some other people's posts and the wiki, it looks 
like it's not *required* to merge, just recommended. Which means our existing 
system of running the different scripts on different hosts should continue to 
work. We figure we can put off the merge thing for a bit.

Meanwhile, we're able to restart, and get happy log messages from the jobrunner 
and two web servers (two servers running on different ports of a Tomcat 
host). And I can do an upload, which runs locally. But when I try to do a 
blast, which is supposed to submit to the cluster (and ran just fine on our old 
install), it hangs and never starts. I would think the database is working OK, 
since it shows me new history items when I upload and stuff. The web Galaxy log 
shows that I went to the tool page, and then has a ton of loads to 
root/history_item_updates, but nothing else. The job handler Galaxy log has 
nothing since the PID messages when the server started up most recently.

A quick search of the archives didn't find anything obvious. (I don't have any 
obvious words to search for.) Any thoughts about where I should start looking 
to track this down?

Thanks,

-Amir




[galaxy-dev] Shell script to start Galaxy in multi-server environment

2012-08-02 Thread Karger, Amir
We're upgrading to a late June Galaxy from a last-year Galaxy. We noticed
that the docs say you no longer need 2 different .ini files. Great!
Unfortunately, the multiprocess.sh in contrib/ still assumes you have
multiple .ini files.

So the question is, assuming we correctly set up the different web
servers, job managers, and job handler servers in universe_wsgi.ini,
what's the command line we should be giving to run Galaxy on each type of
server? The wiki Admin/Config pages (Performance and Scaling, Cluster, and
that sort of thing) had some info on editing the .ini, but I
didn't see what my .sh should look like there. Pointers to websites,
emails, or existing .sh files appreciated.

Thanks,

-Amir Karger
Senior Research Computing Consultant
Harvard Medical School Research Computing
amir_kar...@hms.harvard.edu




Re: [galaxy-dev] How to upload local files in Galaxy

2012-06-13 Thread Karger, Amir
 From: Fields, Christopher J [mailto:cjfie...@illinois.edu]
 [SNIP] if you follow the guidelines for FTP import, any method used
 (not just FTP, but scp, sftp, grid-ftp, etc.) to get data into the 'FTP'
 import folder works as long as permissions on the data are set so the
 galaxy user on the cluster end can read the data.

From our investigations, the galaxy user needs to *own* the file, not just be 
able to read it. The reason is that galaxy does a shutil.move, which calls 
copy2 to give the new file the same permissions as the old, which calls 
copystat, which calls os.chmod, which requires galaxy to be the owner.
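
To spell out the chain we traced (a sketch from reading the stdlib of
that era; the wrapper function below is hypothetical, just to make the
failure visible):

    import shutil

    # shutil.move(src, dst)
    #   -> os.rename(src, dst)            # fast path, same filesystem
    #   -> else: shutil.copy2(src, dst)   # cross-filesystem fallback
    #        -> copyfile() + copystat(src, dst)
    #             -> os.chmod()/os.utime()   # the chmod is the step that
    #                                        # wants galaxy to own the file
    #      then os.unlink(src)
    def move_as_galaxy(src, dst):
        try:
            shutil.move(src, dst)
        except OSError as e:
            print("move failed; check ownership of %s: %s" % (src, e))
            raise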

This seems to be true at least on our old version of Galaxy, though perhaps 
it's been fixed in the last few months.

 We had our local cluster admins set up a
 link to the user's galaxy import folder in their home directory, so users can
 basically do this:
 
 scp mydata.fastq.gz usern...@biocluster.igb.illinois.edu:galaxy-upload

We wanted to do just that, but we kept getting permission errors. In the end, 
we needed to add a cron job that runs every minute as root and does a chown 
galaxy on any files appearing in the upload directories.
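
For the record, the sweep that cron job runs looks roughly like this
(a sketch; the path and script are ours, not anything Galaxy ships):

    #!/usr/bin/env python
    # Run from root's crontab every minute: chown anything under the FTP
    # upload tree to the galaxy user so Galaxy can move the files.
    import os
    import pwd

    UPLOAD_ROOT = "/path/to/galaxy/ftp"  # site-specific
    galaxy_pw = pwd.getpwnam("galaxy")

    for dirpath, dirnames, filenames in os.walk(UPLOAD_ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_uid != galaxy_pw.pw_uid:
                os.chown(path, galaxy_pw.pw_uid, galaxy_pw.pw_gid)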

Please let me know if I'm missing something. I'd be happy to turn off the cron.

Thanks,

-Amir Karger


