On Oct 29, 2012, at 9:23 PM, Todd Oakley wrote:

> I changed the name of this thread, to go in a related but new direction:
> 
> I wonder if the Galaxy developers and community have any opinions on what is 
> the best way to organize tools into repositories. We've developed a large 
> number of tools to allow my lab to conduct phylogenetic analyses in Galaxy. 
> Inspired by the mothur package in Galaxy, which is all in one repo, I made 
> the decision to add all our related tools to 1 repo on the tool shed. 
> However, it seems that makes individual tools like raxml difficult to find 
> for other users.  Recently, we started putting these tools on to bitbucket, 
> and organizing them in different categories (alignment, phylogenetics, 
> orthologies, etc), which is a compromise between all-in-one-repo and 
> each-its-own-repo.
> 
> The thing is that many of the tools do not stand alone, and really are 
> designed to function with other tools in the package. Any philosophies or 
> opinions are welcome, as I feel like I have not come to a good solution on 
> this...
> 
> Todd
> 

I've added the following tool shed wiki page to provide discussion points 
related to this question.

http://wiki.galaxyproject.org/AToolOrASuitePerRepository

Here are the current contents of the page:

A single tool or a suite of tools per repository

Many tool developers in the Galaxy community ask about the best way to organize 
tools in their tool shed repositories. Some groups have developed a large 
number of tools to allow their labs to perform analyses in Galaxy and took the 
approach of including all related tools in a single repository in the tool 
shed.  Others have chosen to restrict each repository to a single tool.
What is the "best practice"?

Both approaches are acceptable, but here are some points to consider when making 
this decision.  Note that these points reflect the state of the tool shed at 
the time this page was written; the discussion will evolve as new tool shed 
features and Galaxy-related tool shed features are introduced and mature.

The benefits of a single tool per repository

Restricting a repository to include a single tool provides more flexibility to 
Galaxy administrators to install only those specific tools in which their users 
have interest.  Sometimes installing a suite of tools in order to get only one 
or two of them is not optimal.

Some time in the future, Galaxy workflows may provide the ability to search for 
tools defined in the workflow that are not available in the Galaxy instance.  
Ideally the Galaxy administrator will be able to locate and install only the 
precise list of missing tools in order to enable the workflow to run.  For 
example, assume a user imported a workflow into their local Galaxy instance 
that was developed by someone else in a different Galaxy instance.  The Galaxy 
workflow UI may provide a feature that searches available tool sheds for the 
tools required by the imported workflow that are not available in the Galaxy 
instance.  Restricting repository contents to single tools would enable 
installation of only those missing tools required by the workflow.

Tool shed repositories that include tools have certain Mercurial changeset 
revisions that are installable into a local Galaxy instance.  These revisions 
are defined by the versions of the tools included in the repository.  
A repository restricted to a single tool requires a new installable revision 
only when the version of that tool changes.
Repositories that include multiple tools require a new installable revision 
when the version of any one of the tools changes, possibly resulting in 
multiple versions of the same tool installed into a single Galaxy instance.  Of 
course, Galaxy will load only a single instance of a tool version into the tool 
panel, but the tool and related files will still be installed on disk multiple 
times.
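
The revision point above can be illustrated with a small toy model.  This is 
only a sketch under stated assumptions, not the tool shed's actual logic: it 
assumes a changeset becomes a new installable revision whenever the set of 
tool versions in the repository differs from the previous changeset.  The 
function name, the histories, and the "muscle" tool are made up for 
illustration.

```python
from typing import Dict, List, Optional

# Each changeset maps tool id -> tool version at that point in the
# repository history, oldest first (hypothetical data).
Changeset = Dict[str, str]


def installable_revisions(history: List[Changeset]) -> List[int]:
    """Return the indexes of changesets that start a new installable revision.

    Assumed toy rule: a changeset is a new installable revision when its
    tool versions differ from those of the previous changeset.
    """
    revisions: List[int] = []
    previous: Optional[Changeset] = None
    for index, tools in enumerate(history):
        if tools != previous:
            revisions.append(index)
        previous = tools
    return revisions


# A suite repository with two tools: only raxml changes version, yet each
# new installable revision carries both tools.
suite_history = [
    {"raxml": "1.0", "muscle": "1.0"},
    {"raxml": "1.0", "muscle": "1.0"},  # e.g. a commit that touches no tool version
    {"raxml": "1.1", "muscle": "1.0"},
    {"raxml": "1.2", "muscle": "1.0"},
]

# The same unchanged tool kept in a repository of its own.
muscle_history = [{"muscle": "1.0"} for _ in suite_history]

print(installable_revisions(suite_history))   # [0, 2, 3]
print(installable_revisions(muscle_history))  # [0]
```

In this model the suite gains three installable revisions while the 
unchanged tool, in its own repository, stays at one.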

The weaknesses of a single tool per repository

With current tool shed features, if multiple tools share required third-party 
dependencies and you design your repository to install them when the repository 
is installed into a Galaxy instance (by including a file named 
tool_dependencies.xml in your repository), restricting a repository to a single 
tool will force you to include the same tool_dependencies.xml file in each 
repository whose contained tool requires the same dependency.  This will also 
install and compile the same dependencies separately for each repository when 
it is installed into a Galaxy instance.  In the near future, the tool shed will 
include a new feature that we are calling cross-repository dependencies which 
will eliminate this weakness.  This feature will provide a means of defining a 
repository as a dependency for another repository.  For example, the current 
emboss_datatypes repository in the main Galaxy tool shed will be defined as a 
dependency for the current emboss_5 repository in the same tool shed.  So when 
the emboss_5 repository is installed, the emboss_datatypes repository will be 
automatically installed along with it.

If tools are not intended to provide meaningful analyses on their own, but are 
designed to function with other tools, restricting a repository to a single 
tool will require a Galaxy administrator to install multiple repositories in 
order to provide all necessary tools to their users.
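
To make the duplicated-dependency weakness concrete, here is a minimal 
sketch.  It assumes, as described above for the current tool shed, that a 
dependency is installed and compiled once for every repository that declares 
it.  The repository names, the "libfoo 1.0" dependency, and the function are 
hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: repository name -> third-party dependencies declared
# in that repository's tool_dependencies.xml file.
Layout = Dict[str, List[str]]


def dependency_builds(layout: Layout) -> Dict[str, int]:
    """Count how often each dependency is built, assuming one build per
    repository that declares it (nothing is shared across repositories)."""
    counts: Dict[str, int] = {}
    for dependencies in layout.values():
        for dependency in set(dependencies):
            counts[dependency] = counts.get(dependency, 0) + 1
    return counts


# Three single-tool repositories that all need the same dependency.
one_tool_per_repo: Layout = {
    "tool_a": ["libfoo 1.0"],
    "tool_b": ["libfoo 1.0"],
    "tool_c": ["libfoo 1.0"],
}

# One suite repository holding all three tools.
suite_repo: Layout = {"tool_suite": ["libfoo 1.0"]}

print(dependency_builds(one_tool_per_repo))  # {'libfoo 1.0': 3}
print(dependency_builds(suite_repo))         # {'libfoo 1.0': 1}
```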

The benefits of a suite of tools per repository

With current tool shed features, if multiple tools share required third-party 
dependencies and you design your repository to install them when the repository 
is installed into a Galaxy instance (by including a file named 
tool_dependencies.xml in your repository), then all tools included in the 
repository can share the same third-party dependency, ensuring that the 
dependency only needs to be installed and compiled once for multiple tools.  
This benefit will be eliminated in the near future with the planned 
introduction of the cross-repository dependencies feature described above.

In some cases multiple tools are not intended to provide meaningful analyses on 
their own, but are designed to function with other tools in the suite.  In 
these cases, it makes sense for all tools to be installed into a Galaxy 
instance, and thus, the tools may all be included in a single repository.  In 
the near future, the tool shed will include a new feature that we are calling 
cross-repository dependencies (see above).  This feature will enable a 
repository to be defined as a "tool suite" of sorts, where the repository 
includes only a tool_dependencies.xml file that defines multiple separate 
repositories that should be installed.  Each of these repositories could 
contain a single tool, allowing a Galaxy administrator to install each tool 
separately.  If the administrator chooses to install the "tool suite" 
repository, each separate repository would be automatically installed, 
providing the entire suite of tools with the single installation.  This new 
feature could ultimately eliminate the benefits of including a suite of tools 
in a single repository since, as discussed above, it will eliminate the issue 
of having to install and compile the same version of a tool dependency for each 
dependent tool in separate repositories.
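
Since the cross-repository dependencies feature is not yet available, the 
following is only a sketch of how a "tool suite" repository might resolve to 
the repositories it names.  The emboss_5 and emboss_datatypes relationship is 
the example given above; the phylo_* repository names, the dictionary format, 
and the function are assumptions made up for illustration and are not the 
tool shed's API.

```python
from typing import Dict, List, Set

# Hypothetical declarations: repository -> repositories it depends on.
REPOSITORY_DEPENDENCIES: Dict[str, List[str]] = {
    "emboss_5": ["emboss_datatypes"],
    "emboss_datatypes": [],
    # A "tool suite" repository that contains no tools of its own.
    "phylo_suite": ["phylo_raxml", "phylo_alignment"],
    "phylo_raxml": ["phylo_datatypes"],
    "phylo_alignment": ["phylo_datatypes"],
    "phylo_datatypes": [],
}


def install_order(repository: str,
                  dependencies: Dict[str, List[str]]) -> List[str]:
    """Return the repositories to install, dependencies first, each once."""
    ordered: List[str] = []
    visiting: Set[str] = set()

    def visit(name: str) -> None:
        if name in ordered:
            return  # already scheduled, so a shared dependency is not repeated
        if name in visiting:
            raise ValueError(f"circular dependency involving {name!r}")
        visiting.add(name)
        for required in dependencies.get(name, []):
            visit(required)
        visiting.discard(name)
        ordered.append(name)

    visit(repository)
    return ordered


print(install_order("emboss_5", REPOSITORY_DEPENDENCIES))
# ['emboss_datatypes', 'emboss_5']
print(install_order("phylo_suite", REPOSITORY_DEPENDENCIES))
# ['phylo_datatypes', 'phylo_raxml', 'phylo_alignment', 'phylo_suite']
```

An administrator who wants a single tool would install only that tool's 
repository, while installing the suite repository pulls in every repository 
it names, with the shared dependency scheduled once.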

The weaknesses of a suite of tools per repository

Sometimes installing a suite of tools in order to get only one or two of them 
is not optimal.  Restricting a repository to include a single tool provides 
more flexibility to Galaxy administrators to install only those specific tools 
in which their users have interest.

Including multiple tools in a single repository may make individual tools more 
difficult to find with the current tool shed features.  Although it is 
currently possible to search for specific tools by partial or complete tool 
names and descriptions, the ability to browse for tools in the tool shed 
directly in addition to browsing for repositories is planned for the near 
future.
