Before we go much further down this path with datatypes, I'm wondering if 
some of us should put together a spec of some kind that allows us all to agree 
on the direction.  For example, I'm wondering if datatypes should be versioned 
and have a name-spaced identifier much like the Tool Shed's guid identifier for 
tools.  I haven't thought too much about whether this would pose backward 
compatibility issues or not.  Discussion is welcome on this.
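
As a starting point for that discussion, here is a minimal Python sketch of what a versioned, name-spaced datatype identifier could look like. It is only loosely modelled on the shape of tool guids; the `DatatypeGuid` class, the "datatypes" path segment, the field order and the example host and owner names are all assumptions made for illustration, not an existing Galaxy or Tool Shed API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatatypeGuid:
    """Hypothetical versioned, name-spaced datatype identifier."""

    tool_shed: str   # host name of the Tool Shed serving the repository
    owner: str       # repository owner
    repository: str  # repository name
    extension: str   # datatype extension, e.g. "stockholm"
    version: str     # datatype version, e.g. "1.0"

    def __str__(self) -> str:
        # Assumed layout: <tool_shed>/datatypes/<owner>/<repo>/<ext>/<version>
        return "/".join(
            [self.tool_shed, "datatypes", self.owner,
             self.repository, self.extension, self.version]
        )

    @classmethod
    def parse(cls, guid: str) -> "DatatypeGuid":
        parts = guid.split("/")
        if len(parts) != 6 or parts[1] != "datatypes":
            raise ValueError(f"not a datatype guid: {guid!r}")
        tool_shed, _, owner, repository, extension, version = parts
        return cls(tool_shed, owner, repository, extension, version)


# Placeholder host and owner; repo_x is the example repository from this thread.
guid = DatatypeGuid("toolshed.example.org", "some_owner", "repo_x", "x", "1.0")
print(guid)
```

Two versions of the same extension would then differ only in the last segment, so both could be installed side by side without clashing.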

Greg Von Kuster

On Jul 22, 2014, at 7:19 PM, Greg Von Kuster <> wrote:

> Hi Björn,
> On Jul 22, 2014, at 6:01 PM, Björn Grüning <> wrote:
>> Hi Greg,
>> thanks for the clarification. Please see my comments below.
>>> On Jul 20, 2014, at 3:22 PM, Peter Cock <> wrote:
>>>> On Sun, Jul 20, 2014 at 6:23 PM, Björn Grüning <> wrote:
>>>>> Hi,
>>>>> single datatype definitions only work if you haven’t defined any 
>>>>> converters.
>>>>> Let's assume I have a datatype X and want to ship a X -> Y converter (Y 
>>>>> -> X
>>>>> is also possible), we will end up with a dependency loop, or? The X
>>>>> repository will depend on the Y repository, but Y is depending on X, 
>>>>> because
>>>>> we want to include a Y -> X converter.
>>>>> Any idea how to solve that?
>>> I don't see a problem here, so I'm hoping I'm correctly understanding the 
>>> issue.
>>> If we have:
>>> repo_x contains the single datatype X
>>> repo_y contains the single datatype Y
>>> repo_x_to_y_converter contains a tool that converts datatype X to datatype 
>>> Y (this repository also defines 2 dependency relationships, one to repo_x 
>>> and another to repo_y)
>>> repo_y_to_x_converter contains a tool that converts datatype Y to datatype 
>>> X (this repository also defines 2 dependency relationships, one to repo_x 
>>> and another to repo_y)
>>> Now if we want to install both the repo_x_to_y_converter and the 
>>> repo_y_to_x_converter automatically whenever either one is installed, we 
>>> have 2 options:
>>> 1) define a 3rd dependency relationship for repo_x_to_y_converter to 
>>> depend on repo_y_to_x_converter and, similarly, a 3rd dependency 
>>> relationship for repo_y_to_x_converter on repo_x_to_y_converter.  This 
>>> does indeed create a circular repository dependency relationship, but the 
>>> Tool Shed installation process will handle it correctly, installing all 4 
>>> repositories with proper dependency relationships created between them.
>> Does that mean, circular dependencies will be no problem at all?
> Yes, the Tool Shed handles circular dependency definitions of any variety, so 
> circular dependency definitions pose no problem.
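
To make the circular case concrete, here is a minimal Python sketch of why a cycle need not be a problem for an installer. It is not the Tool Shed's actual code; the `DEPENDS_ON` table and the `repositories_to_install` helper are hypothetical, using the repository names from option 1 in this thread. The idea is simply to remember which repositories have already been visited.

```python
from collections import deque

# Hypothetical dependency declarations for option 1: each converter depends
# on both datatype repositories and on the other converter (a cycle).
DEPENDS_ON = {
    "repo_x": [],
    "repo_y": [],
    "repo_x_to_y_converter": ["repo_x", "repo_y", "repo_y_to_x_converter"],
    "repo_y_to_x_converter": ["repo_x", "repo_y", "repo_x_to_y_converter"],
}


def repositories_to_install(start, depends_on):
    """Return every repository reachable from start, visiting each once."""
    seen = {start}
    queue = deque([start])
    while queue:
        repo = queue.popleft()
        for dep in depends_on.get(repo, []):
            if dep not in seen:  # the seen set is what breaks the cycle
                seen.add(dep)
                queue.append(dep)
    return seen


print(sorted(repositories_to_install("repo_x_to_y_converter", DEPENDS_ON)))
```

Starting from either converter, the walk terminates and collects all 4 repositories, because no repository is queued twice.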
>> Do you consider including the converters into the datatypes as 
>> best-practise? (These converters are implicit-galaxy-converters).
>> I would have only two repositories with circular dependencies.
> Yes, however, there are some current limitations in the framework detailed on 
> this Trello card:
> Tag sets like the following that are defined in a datatypes_conf.xml file 
> contained in a repository should be correctly loaded into the in-memory 
> datatypes registry when the repository is installed into Galaxy.  However, it 
> has been quite a while since I've worked in this area, so let me know if you 
> encounter any issues.  The current best practice is probably that the 
> converters themselves would each individually be in separate repositories 
> (just like all Galaxy tools), but this can certainly be discussed if 
> appropriate.  Community thoughts are welcome here!
>   <datatype extension="bam" type="galaxy.datatypes.binary:Bam" 
> mimetype="application/octet-stream" display_in_upload="true">
>     <converter file="bam_to_bai.xml" target_datatype="bai"/>
>     <converter file="bam_to_bigwig_converter.xml" target_datatype="bigwig"/>
>     <display file="ucsc/bam.xml" />
>     <display file="ensembl/ensembl_bam.xml" />
>     <display file="igv/bam.xml" />
>     <display file="igb/bam.xml" />
>   </datatype>
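
For anyone who wants to check which implicit converters a tag set like the one above declares, here is a small Python sketch using only the standard library. It does not reproduce how Galaxy's datatypes registry loads the file; the `converters_by_target` helper is a hypothetical name, and the XML is a shortened copy of the example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <datatype> tag set quoted in this thread.
DATATYPE_XML = """
<datatype extension="bam" type="galaxy.datatypes.binary:Bam"
          mimetype="application/octet-stream" display_in_upload="true">
    <converter file="bam_to_bai.xml" target_datatype="bai"/>
    <converter file="bam_to_bigwig_converter.xml" target_datatype="bigwig"/>
    <display file="ucsc/bam.xml" />
</datatype>
"""


def converters_by_target(datatype_xml):
    """Map each target datatype to the converter tool file that produces it."""
    elem = ET.fromstring(datatype_xml.strip())
    return {
        conv.get("target_datatype"): conv.get("file")
        for conv in elem.findall("converter")
    }


print(converters_by_target(DATATYPE_XML))
```

Each `<converter>` entry points at an ordinary tool XML file, which is why a converter can also be treated as a tool that lives in its own repository.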
>>> 2) Instead of creating a circular dependency relationship between 
>>> repo_x_to_y_converter and repo_y_to_x_converter, create an additional 
>>> suite_definition_x_y repository (of type "repository_suite_definition") that 
>>> defines relationships to repo_x_to_y_converter and repo_y_to_x_converter, 
>>> ultimately installing all 4 repositories, but without defining any circular 
>>> dependency relationships.
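
Because option 2 defines no cycle, its dependency graph has a plain topological order. The Python sketch below (3.9 or later, for `graphlib`) illustrates that property only; the `SUITE_GRAPH` table is a hypothetical rendering of the relationships described in this thread, and nothing here is taken from the Tool Shed's installer.

```python
from graphlib import TopologicalSorter

# Hypothetical declarations for option 2: each key depends on the
# repositories in its value set. No repository depends on the suite.
SUITE_GRAPH = {
    "suite_definition_x_y": {"repo_x_to_y_converter", "repo_y_to_x_converter"},
    "repo_x_to_y_converter": {"repo_x", "repo_y"},
    "repo_y_to_x_converter": {"repo_x", "repo_y"},
    "repo_x": set(),
    "repo_y": set(),
}

# static_order() yields dependencies before the repositories that need them
# and would raise graphlib.CycleError if the graph contained a cycle.
install_order = list(TopologicalSorter(SUITE_GRAPH).static_order())
print(install_order)
```

The two datatype repositories come first, then the two converters, and the suite definition last.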
>> repo_x_to_y_converter and repo_y_to_x_converter would have dependencies on 
>> datatype X and Y, so I do not see the need for a suite_definition ... or it 
>> is some collection like the emboss_datatypes …
> I agree.
>> My scenario is more that the converters are not tools, they are implicit 
>> converters and should _not_ be displayed in the tool panel.
>> As far as I know they need to be defined inside the datatypes_conf.xml file.
> Yes, they must be defined inside the datatypes_conf.xml file.  However, 
> converters are just special Galaxy Tools (they are "special" in the same way 
> that Data Manager tools are special).  They are loaded into the in-memory 
> Galaxy tools registry, but not displayed in the tool panel.
>> I think if circular dependencies are not a problem I will try to implement a 
>> proof of concept. EMBOSS is now split:
> Sounds good - circular dependencies should pose no problems.
>> Thanks Greg!
>> Bjoern
>>> Either of the above 2 scenarios will correctly install the 4 repositories.
>>> Let me know if I'm missing something here.
>>> Thanks!
>>> Greg
>>>> Excellent example!
>>>>> How to handle versions of datatypes? Extra repositories for stockholm 1.0
>>>>> and 1.1? If so ... the associated python file (sniffing, splitting ...)
>>>>> should also be versioned, or? What happens if I have two 
>>>>> files
>>>>> in my system?
>>>> Potentially you might need/want to define those as two different
>>>> Galaxy datatypes?
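
If the versions were split into two Galaxy datatypes, a sniffer would need to tell them apart. The Python sketch below reads the version from the "# STOCKHOLM 1.0" style header line that Stockholm files start with. It is not Galaxy's sniffer API; `stockholm_version`, the extension names in `EXTENSION_FOR_VERSION`, and the 1.1 entry (taken from the question in this thread) are assumptions for illustration.

```python
import re

# Stockholm alignments begin with a header line such as "# STOCKHOLM 1.0".
HEADER = re.compile(r"^# STOCKHOLM (\d+\.\d+)\s*$")


def stockholm_version(first_line):
    """Return the version string from a Stockholm header line, or None."""
    match = HEADER.match(first_line)
    return match.group(1) if match else None


# Hypothetical mapping from format version to a distinct Galaxy extension.
EXTENSION_FOR_VERSION = {"1.0": "stockholm_1_0", "1.1": "stockholm_1_1"}

version = stockholm_version("# STOCKHOLM 1.0\n")
print(version, EXTENSION_FOR_VERSION.get(version))
```

With a check like this, two files of different versions in the same history would simply be assigned different extensions.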
>>>>> @Peter, can we create a stripped-down, Python-only Biopython egg? All 
>>>>> parsers
>>>>> should be included, Bio.SeqIO should be sufficient I think.
>>>> Right now, yes in principle (and this is fine from the licence point of 
>>>> view),
>>>> but in practise this is a fair chunk of work. However, we are looking at
>>>> this - see
>>>> Peter
>>>> ___________________________________________________________
>>>> Please keep all replies on the list by using "reply all"
>>>> in your mail client.  To manage your subscriptions to this
>>>> and other Galaxy lists, please use the interface at:
>>>> To search Galaxy mailing lists use the unified search at:
>>> _______________________________________________
>>> galaxy-iuc mailing list