[galaxy-dev] file upload/unzip issues

2012-12-08 Thread Michael Moore
Hello,

I and other members of my lab are encountering issues uploading (some) files to 
the Main public Galaxy server.  We routinely upload from our local server to 
main.g2.bx.psu.edu via ftp using Cyberduck.  In my case, on 12/6/12 I uploaded 
12 zipped fastq files.  Of these, 9 were completely fine.  For the remaining 3 
files, the transfer through Cyberduck appeared to work fine, and the files 
appeared (with the correct file size) on the upload screen under "Get data" as 
usual.  However, once the files were uploaded into a Galaxy history, they were 
empty with a message saying "Problem decompressing gzipped data."  An example 
is entry #45: '1PositiveRFP92112Pool41_ATCACG_L002_R1_001.fastq.gz' in my 
history called "Tcell_120812."  My account uses this email address as the login 
ID (mmo...@rockefeller.edu).

All of these files are pretty large (from 5 to 10 GB once unzipped), but the 
failure/success did not appear to correlate to the size of the file.  I am not 
exceeding my space quota. In addition, I was able to successfully upload the 
file listed above (1PositiveRFP92112Pool41_ATCACG_L002_R1_001.fastq.gz) to our 
local Galaxy installation, so I don't think there is anything wrong with the 
file itself.  However, it has failed to upload to the public galaxy server 
multiple times.  The experience of my lab mates is similar; some of their files 
are uploading correctly, and others are having the same problem I described 
above.  We can't seem to find anything common among the files that are failing 
to upload.  For instance, they span different sequencing runs and were 
deposited on our local server at different times.

Has anyone else been encountering similar issues?  Please let me know if I 
should direct this question elsewhere or if I can provide any further 
information.  Thanks very much for your time.
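
One way to rule out a damaged archive before re-uploading is to decompress
it end to end on the local server.  The following is only a sketch using
Python's standard gzip module; the function name gzip_is_intact is made up
for illustration and is not part of Galaxy.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks so a multi-gigabyte FASTQ never sits in memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # Bad magic bytes, a truncated stream, or a CRC/length mismatch.
        return False
    return True
```

A file that passes locally but still fails after transfer points at the
transfer or the server side rather than at the file itself.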

Michael Moore
Rockefeller University


___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Re: [galaxy-dev] [Errno 32] Broken pipe on uploading file

2012-05-21 Thread Michael Moore
That looks like a local network error.  It is not the other common case
where galaxy is taking over stdout and stderr in a subprocess call to a
local job and that job is a script that pipes output from one program to
another via reassigning stdout and stdin within those programs.
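
For reference, Errno 32 is EPIPE on Linux: the operating system reports it
when a process writes to a pipe or socket whose other end has already been
closed.  A minimal sketch that reproduces it with a plain pipe follows.  It
is written for Python 3, not the Python 2.6 shown in the traceback, and the
function name is made up for illustration.

```python
import errno
import os


def write_to_closed_pipe():
    """Write to a pipe whose read end is closed; return the errno raised."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the "reader" goes away before we write
    try:
        os.write(write_fd, b"data nobody will read")
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None
```

Comparing the return value with errno.EPIPE confirms it is the same error.
In the web-server case the "reader" is the remote client, so a client that
disconnects or times out before the response is flushed gives this result.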

Is this error repeatable?  If so, then there is something to investigate.
Otherwise, we need more graceful recovery, at some point, instead of a
mystical stack trace.

Also, if you post these messages to the list, it is good practice to remove
IP addresses which can be reached via the web.  Not all of us wear white
hats.



On Mon, May 21, 2012 at 3:55 AM, Nikolay wrote:

>   Dear Galaxy Team.
>
> I installed Galaxy on a company server to integrate it with our services,
> and I found a problem uploading files via the API.
> Uploading a file from a local folder works well.  But when I
> tried to upload a file from a remote server, it threw the following exception:
>
> 194.226.177.176 - - [21/May/2012:12:38:51 +0200] "POST /api/user/adm/upload?key=71edcbdb2b91c9b2c1125775804d2093 HTTP/1.1" 500 - "-" "Python-urllib/2.6"
> Debug at: http://cogangs.biobase.de:82/_debug/view/1337592259
>
> Exception happened during processing of request from ('194.226.177.176', 29682)
> Traceback (most recent call last):
>   File "/data/dat0/galaxy-dist-back/eggs/Paste-1.6-py2.6.egg/paste/httpserver.py", line 1053, in process_request_in_thread
>     self.finish_request(request, client_address)
>   File "/usr/lib/python2.6/SocketServer.py", line 322, in finish_request
>     self.RequestHandlerClass(request, client_address, self)
>   File "/usr/lib/python2.6/SocketServer.py", line 618, in __init__
>     self.finish()
>   File "/usr/lib/python2.6/SocketServer.py", line 661, in finish
>     self.wfile.flush()
>   File "/usr/lib/python2.6/socket.py", line 297, in flush
>     self._sock.sendall(buffer(data, write_offset, buffer_size))
> error: [Errno 32] Broken pipe
> 
>
> Browsing libraries from remote server works well.
> Please help me to resolve this problem.
>
> Attached: an archive with the debug message, the controller class that processes
> uploads, buildapp.py, the wsgi config file, and the script that requests the file upload.
> Server A contains: uploading file, upload_file.py and Galaxy api
> Server B contains: Galaxy service, upload.py, buildapp.py, wsgi config file
>
> Best regards,
> Nikolay
>

[galaxy-dev] Spooky behavior

2012-05-04 Thread Michael Moore
OK: a brand-new data library.  I upload one BAM file using galaxy's "Add
Dataset".

I start a new history, titling it, "testing multiple use". I import to
current history.  I run a job that reads the BAM file and produces some
output.

Everything works.  I have 13 output data sets in the history, all showing
green.

Now I create a second history, called "2nd use same dataset same library"
Again I check the box and click "import to current history"
Then I try to run the same job as before...

[bam_header_read] bgzf_check_EOF: Invalid argument
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from
"/.../galaxy/database/files/000/dataset_961.dat"  (Ellipsis is mine, not
Python's)

Oh my, looking in that file space, I discover that my BAM file is at
dataset_962.dat.
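
A quick way to see which of the two .dat files really holds BAM data is to
check the magic bytes.  BAM is gzip-compatible (BGZF), and the decompressed
stream starts with the four bytes "BAM\1".  This is only a sketch, and
looks_like_bam is a made-up helper name, not a Galaxy or samtools function.

```python
import gzip
import zlib

BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path):
    """Return True if `path` is gzip-readable and starts with the BAM magic."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip/BGZF at all, or truncated before the first four bytes.
        return False
```

Running it on both dataset_961.dat and dataset_962.dat shows whether the
second import points at an empty or unrelated file.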

What gives?

Michael Moore

[galaxy-dev] job terminating on warnings from software

2012-05-04 Thread Michael Moore
I have a long string of steps dying on the last step because GNUPLOT sends
a warning to the log which galaxy faithfully records 11 faithful times in
my history files.  The 10 other files have downloadable content, and in
fact, outside of galaxy the plot works.  It is simply changing some
intervals.

How can I tell galaxy to keep going unless I get an actual error?
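
If the step is being marked as failed only because the warning lands on
stderr (an assumption about this setup), one possible workaround is a small
wrapper that lets the exit code decide.  The sketch below is not Galaxy
code; run_quietly is a made-up name, and the wrapped command is whatever
the tool would normally run.

```python
import subprocess


def run_quietly(argv):
    """Run argv; return (exit_code, text_for_stdout, text_for_stderr).

    On success, anything the program wrote to stderr is demoted to stdout,
    so warnings are kept but no longer look like a failure.  On a non-zero
    exit code, stderr is passed through unchanged.
    """
    proc = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    if proc.returncode == 0:
        return 0, proc.stdout + proc.stderr, ""
    return proc.returncode, proc.stdout, proc.stderr
```

A wrapper script would write the two returned strings to its own stdout
and stderr and exit with the returned code.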

Michael

Re: [galaxy-dev] /bin/sh: samtools: not found-->>WORKAROUND

2012-04-30 Thread Michael Moore
Yeah, I was using ['printenv| mail  && samtools'] inside the
subprocess.Popen(['samtools'],... in upload.py.

Now, right before the subprocess was called in upload.py, I had
/usr/bin/ in my PATH, but inside the subprocess the story was different.  I
just used the symbolic link to access samtools from something that remained
in the PATH.  I have no idea how this could happen, but I have noticed that
galaxy does things with input and output and seems to manipulate the
environment heavily--but why should a child process have a different
environment when none was invoked?  I'll figure that out later.  I am still
trying to convert some software to run with galaxy.
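
For what it is worth, a child started through Python's subprocess module
inherits the parent's os.environ unless an env= mapping is passed, so it
can help to ask the child directly what it sees and to pin PATH
explicitly.  This is a sketch only; both helper names and the example
directory are made up, and nothing here is taken from upload.py.

```python
import os
import subprocess
import sys


def env_with_dir(extra_dir):
    """Copy the current environment and put extra_dir first on PATH."""
    env = dict(os.environ)
    env["PATH"] = extra_dir + os.pathsep + env.get("PATH", "")
    return env


def path_seen_by_child(env=None):
    """Return the PATH a child Python process actually sees."""
    out = subprocess.check_output(
        [sys.executable, "-c", "import os; print(os.environ.get('PATH', ''))"],
        env=env,  # None means "inherit the parent's environment"
        text=True,
    )
    return out.strip()
```

Comparing path_seen_by_child() with os.environ["PATH"] in the parent shows
whether something between the two is rewriting the environment.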

On Mon, Apr 30, 2012 at 8:57 AM, Nate Coraor wrote:

> On Apr 24, 2012, at 8:36 PM, Michael Moore wrote:
>
> > There is apparently a persistent problem with samtools which normally
> > lives at /usr/bin/samtools.  I encountered a similar problem in Python when
> > uploading BAM files.
> >
> > I did not resolve the problem.  I hacked for a while on binary.py in a
> > lib/ subdirectory and used os.system to send myself mail describing the
> > effective path at various points, and I added a missing
> >
> > logging.basicConfig()
> >
> > statement and scattered some log.WARNING statements strategically.  All
> > this told me nothing.  So I made a few symlinks to samtools.  The one that
> > got things working was
> >
> > ln -s /usr/bin/samtools /home/galaxy/bin/samtools
> >
> > so--worked around but not resolved.
>
> Hi Michael,
>
> For tools that output BAM, samtools needs to be in your $PATH, or has to
> be set up via the tool dependencies system.  See the following for details:
>
>   http://wiki.g2.bx.psu.edu/Admin/Config/Tool%20Dependencies
>
> For SGE, you can modify the $PATH used on the cluster in ~/.sge_request or
> the file specified in the 'environment_setup_file' galaxy config option.
>
> --nate
>
> >
> > Michael
> >
> > On Tue, Apr 17, 2012 at 12:15 PM, zhengqiu cai wrote:
> > Hi All,
> >
> > I submitted a job to convert sam to bam, and the job was running forever
> > without outputting the result. I then checked the log, and it read:
> > Traceback (most recent call last):
> >   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 336, in finish_job
> >     drm_job_state.job_wrapper.finish( stdout, stderr )
> >   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/__init__.py", line 637, in finish
> >     dataset.set_meta( overwrite = False )
> >   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/model/__init__.py", line 875, in set_meta
> >     return self.datatype.set_meta( self, **kwd )
> >   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/datatypes/binary.py", line 179, in set_meta
> >     raise Exception, "Error Setting BAM Metadata: %s" % stderr
> > Exception: Error Setting BAM Metadata: /bin/sh: samtools: not found
> >
> > It means that samtools is not in the PATH. I tried to set the PATH
> > in a couple of ways according to the Galaxy documentation:
> > 1. put the path in the env.sh in the tool directory and symlink
> > default to the tool directory, e.g. default ->
> > =/mnt/galaxyTools/tools/samtools/0.1.18
> > 2. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in ~/.sge_request
> > 3. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in
> > /path/sge_request
> >
> > None of them worked, and I got the same problem as above.
> >
> > Then I checked the job log file in the job_working_directory, and it
> > read:
> > Samtools Version: 0.1.18 (r982:295)
> > SAM file converted to BAM
> >
> > which shows that SGE knows the PATH of samtools. To double check it, I
> > added samtools index to Galaxy, and it worked well. I am very confused why
> > SGE knows the tool path but cannot run the job correctly.
> >
> > The system I am using is Ubuntu on EC2. I checked out the code from
> > galaxy-dist on bitbucket. Other tools such as bwa and bowtie worked well
> > using the same setting method (put env.sh in the tools directory to set the
> > tool path).
> >
> > Thank you very much for any help or hints.
> >
> > Cai
> >
>
>

[galaxy-dev] library_import_dir -- How is it supposed to work?

2012-04-26 Thread Michael Moore
I set library_import_dir to a path and tried uploading a directory of bam
files.  After fixing the situation so galaxy could find samtools in that
subshell, I was able to upload links to the history.  But moving things to
one directory did not appear to be terribly useful, so I tested what
happened if I had subdirectories existing in library import directory.

Test 1 I used folders u1 and u2, each with data, and some data in the root
library_import_directory.  After clearing the samtools error I was presented
with a drop-down list with choices 'None', 'u1' and 'u2'.  Selecting 'None'
did not result in seeking data but a sharp reminder that I had to pick a
directory.  Selecting those directories led to uploads, with the Non-Copy
correctly sized and even downloadable from the data library, but NOT usable
in the history, because upload.py decided the file did not exist (probably
the 'path' variable in os.path.exists() line 99).

It also became apparent that the upload would look one level down from the
root directory and no further (tested by adding u3 with data and a
subdirectory of u3 called v3, also with data).

So the current state of affairs is that it is a single directory to which
one must move files in order to upload links to a galaxy library or folders
thereof.  OR, alternatively, make a directory of links called directory A,
and then another directory of links to the links in directory A at the
library_import_dir and then ask Galaxy to copy the data.  (Not fully tested
yet).
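
The "directory of links" idea can be sketched roughly as below: walk an
arbitrarily nested tree and create one flat directory of symlinks that
could then serve as the import directory.  This is an untested idea as far
as Galaxy is concerned; flatten_with_symlinks is a made-up helper, and the
sketch assumes flat_dir lives outside source_root.

```python
import os


def flatten_with_symlinks(source_root, flat_dir):
    """Symlink every file under source_root into the single directory flat_dir.

    Returns the sorted list of links that were newly created.
    """
    os.makedirs(flat_dir, exist_ok=True)
    created = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            target = os.path.join(dirpath, name)
            rel = os.path.relpath(target, source_root)
            # Encode the sub-path in the link name so that u1/a.bam and
            # u2/a.bam do not collide in the flat directory.
            link = os.path.join(flat_dir, rel.replace(os.sep, "__"))
            if not os.path.lexists(link):
                os.symlink(os.path.abspath(target), link)
                created.append(link)
    return sorted(created)
```

With the u3/v3 layout described above, a file u3/v3/deep.bam would show up
in the flat directory as u3__v3__deep.bam.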

But it is apparent from the UI that more utility was intended.  If I have
time, I will help with that.

Michael Moore

[galaxy-dev] How does one survive the updates?

2012-04-25 Thread Michael Moore
I was running a single instance of galaxy on my own machine, playing with
library_import and figuring out why BAM files were getting errors on upload.
(It is a path matter, when one drops into the subshell and the workaround
is

ln -s /usr/bin/samtools /home/galaxy/bin/samtools

)

Anyway, I seemed to have it figured out last evening, so I shut down the notebook
where galaxy was running (on RHEL6) and went home.  This morning, when I started
galaxy with sh run.sh, I got some messages about eggs.ini and about replacing
universe_wsgi. files from universe_wsgi.sample. files, and nothing worked.
My registration was gone.  My admin_user was overwritten.  I re-registered,
and restored the file settings and restarted galaxy, but now no login
sticks.  It does not complain about a registered user, but it does not
appear to hold the session--the user tab does not show me as logged in.  Is
there some different way it is handling cookies?

Michael Moore

Re: [galaxy-dev] /bin/sh: samtools: not found-->>WORKAROUND

2012-04-24 Thread Michael Moore
There is apparently a persistent problem with samtools which normally lives
at /usr/bin/samtools.  I encountered a similar problem in Python when
uploading BAM files.

I did not resolve the problem.  I hacked for a while on binary.py in a lib/
subdirectory and used os.system to send myself mail describing the
effective path at various points, and I added a missing

logging.basicConfig()

statement and scattered some log.WARNING statements strategically.  All
this told me nothing.  So I made a few symlinks to samtools.  The one that
got things working was

ln -s /usr/bin/samtools /home/galaxy/bin/samtools

so--worked around but not resolved.

Michael

On Tue, Apr 17, 2012 at 12:15 PM, zhengqiu cai wrote:

> Hi All,
>
> I submitted a job to convert sam to bam, and the job was running forever
> without outputting the result. I then checked the log, and it read:
> Traceback (most recent call last):
>   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 336, in finish_job
>     drm_job_state.job_wrapper.finish( stdout, stderr )
>   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/__init__.py", line 637, in finish
>     dataset.set_meta( overwrite = False )
>   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/model/__init__.py", line 875, in set_meta
>     return self.datatype.set_meta( self, **kwd )
>   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/datatypes/binary.py", line 179, in set_meta
>     raise Exception, "Error Setting BAM Metadata: %s" % stderr
> Exception: Error Setting BAM Metadata: /bin/sh: samtools: not found
>
> It means that samtools is not in the PATH. I tried to set the PATH in
> a couple of ways according to the Galaxy documentation:
> 1. put the path in the env.sh in the tool directory and symlink default
> to the tool directory, e.g. default ->
> =/mnt/galaxyTools/tools/samtools/0.1.18
> 2. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in ~/.sge_request
> 3. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in /path/sge_request
>
> None of them worked, and I got the same problem as above.
>
> Then I checked the job log file in the job_working_directory, and it read:
> Samtools Version: 0.1.18 (r982:295)
> SAM file converted to BAM
>
> which shows that SGE knows the PATH of samtools. To double check it, I
> added samtools index to Galaxy, and it worked well. I am very confused why
> SGE knows the tool path but cannot run the job correctly.
>
> The system I am using is Ubuntu on EC2. I checked out the code from
> galaxy-dist on bitbucket. Other tools such as bwa and bowtie worked well
> using the same setting method (put env.sh in the tools directory to set the
> tool path).
>
> Thank you very much for any help or hints.
>
> Cai
>

Re: [galaxy-dev] Problem with cleaning up galaxy datasets

2012-04-13 Thread Michael Moore
Yes, you do not have a url like http://some_path or ssh://some_path
defined in your config file.  Look at the command-line options for
your routines and figure out how they find your config file.  The script
could be looking in the wrong place, or you could have an error in the
config file (a missing piece, or an undefined variable that defaults
to False, a boolean).
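
To make the failure concrete: the last frame of the quoted traceback calls
url.split(':', 1) on whatever config.database_connection holds, and that
value is evidently False here.  The sketch below reproduces that one
expression and adds a guard; checked_dialect and its message are made up
for illustration and are not part of Galaxy.

```python
def guess_dialect_for_url(url):
    # Same expression as the last frame of the traceback.
    return url.split(":", 1)[0]


def checked_dialect(database_connection):
    """Fail with a readable message when the setting is missing or not a URL."""
    if not isinstance(database_connection, str) or ":" not in database_connection:
        raise ValueError(
            "database_connection is %r; expected something like "
            "'sqlite:///path/to/file'" % (database_connection,)
        )
    return guess_dialect_for_url(database_connection)
```

Calling guess_dialect_for_url(False) raises the same AttributeError, which
suggests the config file that was read never set the connection string.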

On 4/11/12, Klaus Metzeler wrote:
>
> Dear all,
>
> I have a problem with the cleanup scripts on my local galaxy instance.
> I am using the updated cleanup_datasets.py, as per Nate's earlier reply
> here
> http://gmod.827538.n3.nabble.com/Problem-running-purge-datasets-sh-cleanup-scripts-td3688016.html#none.
> 
>
> This is the output I get when running cleanup_datasets.py:
>
>   ~/ngs-bin/galaxy-dist $ scripts/cleanup_datasets/cleanup_datasets.py -d 2 -6 -r
> Traceback (most recent call last):
>   File "scripts/cleanup_datasets/cleanup_datasets.py", line 524, in <module>
>     if __name__ == "__main__": main()
>   File "scripts/cleanup_datasets/cleanup_datasets.py", line 82, in main
>     ini_file = args[0]
> IndexError: list index out of range
>
> ... and this if I call it via the shell script cleanup_datasets.sh
>
> ~/ngs-bin/galaxy-dist $ scripts/cleanup_datasets/delete_datasets.sh
> Traceback (most recent call last):
>   File "./scripts/cleanup_datasets/cleanup_datasets.py", line 524, in <module>
>     if __name__ == "__main__": main()
>   File "./scripts/cleanup_datasets/cleanup_datasets.py", line 101, in main
>     app = CleanupDatasetsApplication( config )
>   File "./scripts/cleanup_datasets/cleanup_datasets.py", line 512, in __init__
>     self.model = galaxy.model.mapping.init( config.file_path, config.database_connection, engine_options={}, create_tables=False, object_store=self.object_store )
>   File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1818, in init
>     load_egg_for_url( url )
>   File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1798, in load_egg_for_url
>     dialect = guess_dialect_for_url( url )
>   File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1794, in guess_dialect_for_url
>     return (url.split(':', 1))[0]
> AttributeError: 'bool' object has no attribute 'split'
>
>
> Any idea what might be wrong?
> Thanks a lot for your support,
> Klaus
>
>
>


Re: [galaxy-dev] Toolshed tribulations--certain types seem unsupported

2012-04-09 Thread Michael Moore
No joy.  I cannot even find acknowledgement that galaxy is loading any of
my tools or of those in the directory that follows.  The tool is called
analyze_reads and "cat paster.log | grep analyze_r | less" produces an end of
file.  The last tool load showing is vcf_tools_extract, but all beyond that
also load and are usable in workflows EXCEPT analyze_reads when any
parameter is specified type="float" or type="integer".  The tools I am
using to edit the files are set for UTF-8 encoding but are not in fact
capable of sending anything but plain old ASCII.  (the very basic vi as my
X isn't working at the moment).
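
To double-check the plain-ASCII claim independently of the editor, the
tool file can be scanned for any byte above 0x7F.  A minimal sketch;
find_non_ascii is a made-up helper name and the tool file's path is
whatever it is locally.

```python
def find_non_ascii(path):
    """Return (line_number, column, byte_value) for every byte above 0x7F."""
    hits = []
    with open(path, "rb") as handle:
        for line_number, line in enumerate(handle, start=1):
            # Iterating over bytes yields integers in Python 3.
            for column, byte in enumerate(line, start=1):
                if byte > 0x7F:
                    hits.append((line_number, column, byte))
    return hits
```

An empty list means the file really is 7-bit clean, which would rule out
the stray 8-bit character theory.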

Still, examining paster.log was an education.  I am pleased that it got by
many reported errors and still picked up my scan_reads tool.

I guess my next act is to run siege and see if all the instances are
actually responding.

Thank you much for the suggestion.  I really needed to see paster.log.

On Mon, Apr 9, 2012 at 3:28 PM, Ross wrote:

> Michael,
>
> First thing I'd do would be to check paster.log after trying to load
> your new tool and I'll bet there's an 8 bit character or something in
> your text somewhere that causes the tool parser to barf - the error in
> paster.log will tell you the line and character where loading stopped?
>
> On Mon, Apr 9, 2012 at 6:24 PM, Michael Moore wrote:
> > I am testing galaxy for wide use, and I have legacy text files that call
> > algorithms, sorts, and displays.  One such has a tool.xml file with 10
> > parameters, one select, three integer, and one float, with the rest text
> > for the moment.  (some will be type="data" later if the runs equate to the
> > runs we do outside galaxy)
> >
> > The tool does not show up.  Firefox, emacs and vim all agree that it is
> > well-formed, and galaxy has been properly bounced.  I experimented with
> > removing parameters and found with 7 parameters, I did not have the
> > problem, then I noticed that all of them had been changed to text or select
> > in my desperation to make it show up for placement on the workflow.  I
> > returned to 10 parameters, but this time all type="select" and type="text",
> > and everything worked.  But slipping in even one integer, even with the
> > (optional) min max and default tags, made the tool disappear on
> > restart.
> >
> > Am I looking at a bug, or is there something I need to be doing to make
> > this tool visible with numeric parameters?
> >
> > MGM
> >
> >
>
>
>
> --
> Ross Lazarus MBBS MPH;
> Associate Professor, Harvard Medical School;
> Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;
>

[galaxy-dev] Toolshed tribulations--certain types seem unsupported

2012-04-09 Thread Michael Moore
I am testing galaxy for wide use, and I have legacy text files that call
algorithms, sorts, and displays.  One such has a tool.xml file with 10
parameters, one select, three integer, and one float, with the rest text
for the moment.  (some will be type="data" later if the runs equate to the
runs we do outside galaxy)

The tool does not show up.  Firefox, emacs and vim all agree that it is
well-formed, and galaxy has been properly bounced.  I experimented with
removing parameters and found with 7 parameters, I did not have the
problem, then I noticed that all of them had been changed to text or select
in my desperation to make it show up for placement on the workflow.  I
returned to 10 parameters, but this time all type="select" and type="text",
and everything worked.  But slipping in even one integer, even with the
(optional) min max and default tags, made the tool disappear on
restart.

Am I looking at a bug, or is there something I need to be doing to make
this tool visible with numeric parameters?
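
One thing that may be worth checking, offered as a guess rather than a
diagnosis: Galaxy tool XML gives a numeric param its default through a
value attribute, whereas the message above mentions a "default" tag.  The
sketch below lists integer and float params that carry no value
attribute.  The helper name is made up, and whether a missing value is
what hides the tool is an assumption.

```python
import xml.etree.ElementTree as ET

NUMERIC_TYPES = {"integer", "float"}


def numeric_params_without_value(tool_xml_text):
    """Return the names of integer/float <param> elements lacking value=."""
    root = ET.fromstring(tool_xml_text)
    return [
        param.get("name")
        for param in root.iter("param")
        if param.get("type") in NUMERIC_TYPES and param.get("value") is None
    ]
```

Feeding it the text of the tool.xml file gives a short list to compare
against whatever error paster.log reports for the tool.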

MGM