Re: [galaxy-dev] [Nbicgalaxy-admin] RPM repository for NGS tools in Galaxy

2011-03-04 Thread David van Enckevort
Hi James,

On 2 March 2011 19:44, James Taylor ja...@jamestaylor.org wrote:

 Hi Leon,

 Thanks for sharing this with the community!

 As far as similar activities, we are actively working on a solution for
 packaging and deploying tools. Enis can share more about that, it is what we
 use already to automatically build our cloud images with all tools and data
 installed.

 Importantly, we are not using an existing package manager like RPM for
 (mostly) two reasons. First, we're trying to avoid focusing specifically on
 redhat et al. But more importantly, we want to avoid installing anything at
 the system level. In particular because it is difficult to have multiple
 versions of the same tool installed and usable at the same time. Instead, we
 are installing everything in isolated directories like:

  $GALAXY_APPS/package/version/

 And adding the appropriate information to the environment at runtime based
 on requirement tags in the tool config.


Thank you for your feedback. We'd be interested in seeing what you do for
your cloud images.

I do think that we can achieve with RPM the same as you do, if we design
our repositories and the RPMs well.

My idea is that we would have two repositories:
1. Repository with the latest versions
2. Repository with versioned RPMs

The repository with the latest versions is used for people who just want to
always have the latest versions, the repository with the versioned RPMs is
used if you want to pin to specific versions. The versioned repository could
install with the same $GALAXY_APPS/package/version/ structure as you use.

The nice thing about having RPM packages is that you know exactly which
version is installed, and that the package management system is already
there to take care of dependencies.

We are considering providing packages for other distributions later, but
our own services come first, and those run on Red Hat.
Actually, since RPM is the packaging format chosen in the Linux Standard
Base, any compliant distribution (and all major ones are) should be able to
install RPM packages if we take care to provide the correct dependencies
(i.e. -compat packages for things that are missing).


With kind regards,


 David van Enckevort


 On Mar 2, 2011, at 12:38 PM, Leon Mei wrote:

  Dear colleagues,
 
  In order to ease administration on our servers running at VIB and
  NBIC, we will set up an RPM Repository for bioinformatics tools, the
  primary focus being NGS tools. The purpose is to come to a stable
  repository of easily installable packages for the common
  bioinformatics tools that can be used in a local Galaxy server.
 
  A list of tools under consideration can be found at
 
 https://wiki.nbic.nl/index.php/NBIC_%26_VIB_Bioinformatics_RPM_Repository
 
  So are there already similar activities going on? If yes, we would
  really love to hear your experience and probably work together on
  this.
 
  If you would like to join this effort and contribute to this
  repository, you are more than welcome to contact us as well!
 
  Thanks,
  Leon
 
  --
  Hailiang (Leon) Mei
  Netherlands Bioinformatics Center (http://www.nbic.nl/)
  Skype: leon_mei
  Mobile: +31 6 41709231
 
  ___
  To manage your subscriptions to this and other Galaxy lists, please use
 the interface at:
 
   http://lists.bx.psu.edu/

 -- jt

 James Taylor, Assistant Professor, Biology / Computer Science, Emory
 University




 ___
 Nbicgalaxy-admin mailing list
 nbicgalaxy-ad...@trac.nbic.nl
 https://trac.nbic.nl/mailman/listinfo/nbicgalaxy-admin




-- 
David van Enckevort
Project Leader biobanking taskforce
Software Integration Engineer
BioAssist
mob: +31 6 543 32 276
tel: +31 24 36 19 500
fax: +31 24 89 01 798
E-mail:  david.van.enckev...@nbic.nl
Skype: enckevort76
Netherlands Bioinformatics Centre

260 NBIC
P.O. Box 9101
6500 HB Nijmegen
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] Galaxy Velvet error: Unknown option -ins_length3

2011-03-04 Thread graham etherington (JIC)
Hi,
I've downloaded the Suite of Velvet assembler tools at 
http://community.g2.bx.psu.edu/
and installed them as detailed in 
http://gmod.827538.n3.nabble.com/attachment/868065/0/README?by-user=t
I've then run velveth in Galaxy using two files - a file each of corresponding 
left and right paired-end reads which gives me the following output:
[0.01] Reading FastQ file /opt/galaxy_dist/database/files/002/dataset_2411.dat
[0.003878] 1538 reads found.
[0.003881] Done
[0.003889] Reading FastQ file /opt/galaxy_dist/database/files/002/dataset_2412.dat
[0.007500] 1538 reads found.
[0.007502] Done
[0.007533] Reading read set file /opt/galaxy_dist/database/files/002/dataset_2456_files/Sequences;
[0.008571] 3076 sequences found
[0.018786] Done
[0.018791] 3076 sequences in total.
[0.018825] Writing into roadmap file /opt/galaxy_dist/database/files/002/dataset_2456_files/Roadmaps...
[0.021716] Inputting sequences...
[0.021719] Inputting sequence 0 / 3076
[0.169007] Done inputting sequences
[0.169017] Destroying splay table
[0.177049] Splay table destroyed

I then go to the velvetg tool and select my velveth output file as the input to 
velvetg, use 'auto' as the -ins_length and -ins_length_sd, and then leave the 
remaining ins_length fields blank. I leave the final values (from -exp_cov to  
Minimum Read-Pair Validation) as the default values.
When I execute the job it runs and finishes almost instantly and the only 
output for 'velvetg on data X' is:
[0.01] Unknown option: -ins_length3

The output files Contigs, Contig Stats and Unused Reads are all empty and the 
LastGraph file has the error:
ERROR: /opt/galaxy_dist/database/files/002/dataset_2457_files/LastGraph not found!

The version of Velvet that I have installed and that is in the path is 
velvet_1.0.18.

I was wondering if anyone could give me some help here or has any suggestions.

Many thanks,
Graham


Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park, 
Norwich NR4 7UH.
UK






Re: [galaxy-dev] [Nbicgalaxy-admin] RPM repository for NGS tools in Galaxy

2011-03-04 Thread James Taylor
This would be great. The 'tool dependency injection' part of Galaxy is  
designed so any directory having this structure will work, and you can  
have as many as you want and they will be searched in order.
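
To make the "searched in order" behaviour concrete, here is a minimal Python sketch. It is not Galaxy's actual code: the function names and the assumption that each package directory has a bin/ subdirectory are hypothetical, chosen only to illustrate the package/version/ layout discussed in this thread.

```python
import os
from typing import Dict, Optional, Sequence


def find_package_dir(base_dirs: Sequence[str], package: str, version: str) -> Optional[str]:
    """Return the first <base>/<package>/<version> directory that exists.

    base_dirs is searched in order, so an earlier directory wins over a later one.
    """
    for base in base_dirs:
        candidate = os.path.join(base, package, version)
        if os.path.isdir(candidate):
            return candidate
    return None


def environment_for(package_dir: str, env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env with the package's bin/ directory first on PATH.

    The bin/ subdirectory is an assumption of this sketch.
    """
    new_env = dict(env)
    bin_dir = os.path.join(package_dir, "bin")
    old_path = env.get("PATH")
    new_env["PATH"] = bin_dir if not old_path else bin_dir + os.pathsep + old_path
    return new_env
```

With this layout, an RPM that installs into one of the base directories and a hand-built tool in another can coexist, and the order of the list decides which copy of a given package/version is used.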


On Mar 4, 2011, at 3:21 AM, David van Enckevort wrote:

The repository with the latest versions is used for people who just
want to always have the latest versions, the repository with the
versioned RPMs is used if you want to pin to specific versions. The
versioned repository could install with the same
$GALAXY_APPS/package/version/ structure as you use.




Re: [galaxy-dev] custom datatypes

2011-03-04 Thread Daniel Blankenberg
Hi Glen,

Sorry for the delay in response.

 And how do I limit selections in an input drop down to just my specific file 
 type?  I'm guessing I need to extend the Tabular class, but I don't need to 
 add any additional functionality at this point, I just want to limit how the 
 tools can be chained together. 


We added the ability to dynamically create subclasses in changeset
5176:34d3fcd8037b; you enable it by adding subclass="True" to the datatype's
entry in the datatypes_conf.xml file. Your guess is correct, though: before
this change you would have needed to create a dummy do-nothing class.
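
As a rough illustration of the "dummy do-nothing class" approach, the Python sketch below uses a stand-in Tabular class so that it runs on its own; in a real installation you would subclass Galaxy's own Tabular datatype instead. The names MyTabular, "mytabular" and accepts() are hypothetical, and the subclass-based matching is an assumption of the sketch rather than a description of Galaxy's internals.

```python
class Tabular:
    """Stand-in for Galaxy's tabular datatype class, so the sketch is self-contained."""
    file_ext = "tabular"


class MyTabular(Tabular):
    """Do-nothing subclass: behaves like Tabular but is a distinct type."""
    file_ext = "mytabular"  # hypothetical extension name


def accepts(declared_input_type: type, dataset_type: type) -> bool:
    """True if a dataset of dataset_type may be offered for the declared input type.

    Assumption: a dataset matches when its type is the declared type or a subclass.
    """
    return issubclass(dataset_type, declared_input_type)
```

Under that assumption, a tool that declares MyTabular as its input would be offered only MyTabular datasets, while tools that accept plain Tabular would still accept both.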

 when I load data with this tool the format is set to tabular by galaxy and 
 not to my custom type.  I have a  feeling Galaxy is ignoring what the tool 
 .xml says the format of the output is and tries to autodetect, and comes up 
 with tabular.  If a data source says the output is a specific type shouldn't 
 galaxy use that as the format rather than autodetect?


There is a lot of legacy magic going on with datatype detection in datasource 
tools. We're currently working on cleaning this up and making it behave more 
sanely (e.g. make better use of the provided format attribute of the output 
dataset).  However, the preferred method of setting the datatype in datasource 
tools is to provide a 'data_type' parameter to Galaxy. If the datasource cannot 
provide this information, you can have Galaxy create it by providing a
<request_param_translation> / <request_param> tag set in the configuration
.xml file for the datasource tool. An example:

<request_param_translation>
    <request_param galaxy_name="data_type" remote_name="data_type" missing="my_datatype" />
</request_param_translation>
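
The effect of the missing attribute can be sketched in a few lines of Python. This is only an illustration of the idea as described above (use the remote value if the datasource sends one, otherwise fall back to the configured default); RequestParam and translate_params are hypothetical names, not Galaxy code.

```python
from typing import Dict, List, NamedTuple


class RequestParam(NamedTuple):
    """One translation rule, mirroring the attributes in the example above."""
    galaxy_name: str
    remote_name: str
    missing: str


def translate_params(remote: Dict[str, str], rules: List[RequestParam]) -> Dict[str, str]:
    """Map parameters sent by the remote site onto the names Galaxy expects.

    If the remote site did not send a parameter, the rule's 'missing' value is used.
    """
    translated = {}
    for rule in rules:
        translated[rule.galaxy_name] = remote.get(rule.remote_name, rule.missing)
    return translated
```

For example, with the single rule RequestParam("data_type", "data_type", "my_datatype"), a request that carries no data_type parameter would end up with data_type set to my_datatype.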

Please let us know if we can provide additional information or help in any 
other way. Sorry again for the delay.


Thanks for using Galaxy,

Dan


On Feb 23, 2011, at 3:00 PM, Glen Beane wrote:

 I've created a custom file type based on the Galaxy Tabular type  (this is so 
 some tools I'm developing can declare this type as input or output and 
 prevent any arbitrary tabular file from being used as input for a tool)
 
 I have a data source tool that declares its output format to be my type (this 
 tool brings the user to a website where they query a database and then sends 
 the data file back to galaxy)
 
 when I load data with this tool the format is set to tabular by galaxy and 
 not to my custom type.  I have a  feeling Galaxy is ignoring what the tool 
 .xml says the format of the output is and tries to autodetect, and comes up 
 with tabular.  If a data source says the output is a specific type shouldn't 
 galaxy use that as the format rather than autodetect?
 
 Also,  I have other tools that declare my type as the input type, yet when I 
 go to select an input file it shows me all tabular files in my history, not 
 just those with my custom type.
 
 
 how do I get the format to be correctly set for a file I load from my 
 data_source tool?  And how do I limit selections in an input drop down to 
 just my specific file type?  I'm guessing I need to extend the Tabular class, 
 but I don't need to add any additional functionality at this point, I just 
 want to limit how the tools can be chained together. This datatype is 
 probably just a place holder for what will probably end up being a binary 
 type, so I'm not going to put much effort in. 
 
 
 --
 Glen L. Beane
 Software Engineer
 The Jackson Laboratory
 Phone (207) 288-6153
 
 
 
 


[galaxy-dev] Map with Bowtie for Illumina - multiple input fastqs

2011-03-04 Thread Nicki Gray

Hi

When running bowtie on the command line I was able to use more than  
one fastq file as input simply by listing them separated by a comma eg:


bowtie -p 3 -q -m 2 --best --strata --sam --chunkmb 256 /databank/indices/bowtie/hg18/hg18 input1.fastq,input2.fastq output.sam


How can I do this within galaxy? The Map with Bowtie for Illumina  
tool only allows for 1 input fastq file as far as I can see.


Thanks, Nicki
-
Nicki Gray
MRC Molecular Haematology Unit
01865 222434


Re: [galaxy-dev] Map with Bowtie for Illumina - multiple input fastqs

2011-03-04 Thread Kelly Vincent

Nicki,

You are right that Galaxy's Bowtie only allows one input fastq. You  
would have to combine your multiple input files into one before  
running it in Galaxy. You can do this with the Concatenate datasets  
tool (under Text Manipulation).
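
If you would rather combine the files yourself before uploading, the same result can be sketched in Python. This is an illustration of plain end-to-end concatenation, not the code behind the Concatenate datasets tool; the function name is hypothetical, and the sketch assumes every input file ends with a newline so that records do not run together.

```python
import shutil
from typing import Iterable


def concatenate_files(input_paths: Iterable[str], output_path: str) -> None:
    """Append the inputs, in the order given, into a single output file."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                # Copy in chunks so large FASTQ files are not read into memory at once.
                shutil.copyfileobj(src, out)
```

Calling concatenate_files(["input1.fastq", "input2.fastq"], "combined.fastq") would produce one file that can then be uploaded and used as the single input to the Bowtie tool.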


Let us know if you have any further questions.

Regards,
Kelly


On Mar 4, 2011, at 11:49 AM, Nicki Gray wrote:


Hi

When running bowtie on the command line I was able to use more than  
one fastq file as input simply by listing them separated by a comma  
eg:


bowtie -p 3 -q -m 2 --best --strata --sam --chunkmb 256 /databank/indices/bowtie/hg18/hg18 input1.fastq,input2.fastq output.sam


How can I do this within galaxy? The Map with Bowtie for Illumina  
tool only allows for 1 input fastq file as far as I can see.


Thanks, Nicki
-
Nicki Gray
MRC Molecular Haematology Unit
01865 222434




Re: [galaxy-dev] upload large data file

2011-03-04 Thread Ry4an Brase
On Fri, Mar 04, 2011 at 04:28:20PM +, Yanji Xu wrote:
 Dear Sir/Madam,
 
 I installed galaxy in my local server, then I tried to upload a 4.7 Gb
 fastq file into galaxy, but failed.  Below is the error message.
 
 OverflowError: signed integer is greater than maximum
 
 How could I upload large data files into galaxy and process the data?

Use either the "Upload from filepath" mechanism available for data
libraries
(https://bitbucket.org/galaxy/galaxy-central/wiki/DataLibraries/UploadingFiles),
which has you copy the file to the server in advance and then import it,
or set up the "Upload via FTP" functionality
(https://bitbucket.org/galaxy/galaxy-central/wiki/UploadViaFTP).

-- 
Ry4an Brase 612-626-6575
Software Developer  Application Development
University of Minnesota Supercomputing Institute  http://www.msi.umn.edu