Hi Peter,

On May 8, 2013, at 6:45 AM, Peter Cock wrote:

> On Tue, May 7, 2013 at 7:02 PM, Greg Von Kuster <g...@bx.psu.edu> wrote:
>> Hi Peter,
>> 
>> Missing test components implies a tool config that does not define a
>> test (i.e., a missing test definition) or a tool config that defines a test,
>> but the test's input or output files are missing from the repository.
> 
> This seems to be our point of confusion: I don't understand combining
> these two categories - it seems unhelpful to me.

I feel this is just a difference of opinion.  Combining missing tests and 
missing test data into a single category is certainly justifiable.  Any 
repository that falls into this category clearly shows its owner what is 
missing, so the owner knows that work is needed to prepare the repository 
contents for testing, whether that work means adding a missing test or 
adding missing test data.
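
To make the two cases concrete, here is a minimal sketch of the kind of check 
that flags a repository as missing test components.  This is not the actual 
tool shed code - the function name, the file layout, and the assumption that 
every <param> value names a test-data file are all simplifications:

    import os
    from xml.etree import ElementTree

    def missing_test_components(tool_config_path, test_data_dir):
        """Return a list of problems for one tool config, or [] if none."""
        problems = []
        tests = ElementTree.parse(tool_config_path).findall('.//tests/test')
        if not tests:
            # Case 1: the tool config does not define a functional test.
            problems.append('no test defined')
            return problems
        for test in tests:
            # Case 2: a test is defined, but its input or output files are
            # not in the repository's test-data directory.  (Naively treats
            # every param value as a potential file name.)
            names = [p.get('value') for p in test.findall('param')]
            names += [o.get('file') for o in test.findall('output')]
            for name in names:
                if name and not os.path.exists(os.path.join(test_data_dir, name)):
                    problems.append('missing test data file: %s' % name)
        return problems

Either way the result is the same single flag on the repository, together with 
a message telling the owner exactly what needs to be added.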

> 
> Tools missing a test definition clearly can't be tested - but since we'd
> like every tool to have tests, having this as an easily viewed listing is
> useful both for authors and reviewers.

But it is an easily viewed listing.  It is currently very easy to determine if 
a tool is missing a defined test, is missing test data, or both.

> It highlights tools which need
> some work - or in some cases work on the Galaxy test framework itself.
> They are neither passing nor failing tests - and it makes sense to list
> them separately.
> 
> Tools with a test definition should be tested

This is where I disagree.  It currently takes only a few seconds for our check 
repositories for test components script to crawl the entire main tool shed and 
set flags for those repositories missing test components.  However, the 
separate script that crawls the main tool shed and installs and tests 
repositories that are not missing test components currently takes hours to 
run, even though less than 10% of the repositories are currently tested (due 
to missing test components on most of them).
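
The division of labour between the two scripts is roughly this (a sketch only 
- the repository records, names, and flag are illustrative, not the real code):

    # Illustrative only: real repositories, installation and the Galaxy test
    # framework are stubbed out here.
    def flag_repositories(repositories):
        # Fast pass (seconds): parse tool configs, look for test-data files.
        for repo in repositories:
            repo['missing_test_components'] = not repo['has_test_data']

    def install_and_test(repositories):
        # Slow pass (hours): install each repository plus its tool and
        # repository dependencies, then run its functional tests - but only
        # when the fast pass found nothing missing.
        for repo in repositories:
            if repo['missing_test_components']:
                continue
            print('installing and testing %s' % repo['name'])

    repos = [{'name': 'repo_a', 'has_test_data': False},
             {'name': 'repo_b', 'has_test_data': True}]
    flag_repositories(repos)
    install_and_test(repos)   # only repo_b is worth the installation time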

Installing and testing repositories that have tools with defined tests but 
missing test data is potentially costly from a time perspective.  Let's take a 
simple example:

Repo A has 1 tool that includes a defined test, but the required test data is 
missing from the repository.  The tool in repo A defines 2 3rd party tool 
dependencies that must be installed and compiled.  In addition, repo A defines 
a repository dependency whose ultimate chain of repository installations 
results in 4 additional repositories with 16 additional 3rd party tool 
dependencies, with a total installation time of 2 hours.  All of that time 
would be spent testing the tool in repo A when we already know the test cannot 
succeed because the test data is missing.  This is certainly a realistic 
scenario.
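
As a back-of-the-envelope illustration (the repository names and the ~7 
minutes per compiled dependency are invented; only the shape of the cost 
matters):

    dependency_chain = {
        'repo_a': 2,       # the repository whose single tool we want to test
        'repo_dep_1': 4,   # pulled in through repo A's repository dependency...
        'repo_dep_2': 4,
        'repo_dep_3': 4,
        'repo_dep_4': 4,   # ...4 extra repositories, 16 extra tool dependencies
    }

    tool_dependencies = sum(dependency_chain.values())   # 18 to fetch and compile
    minutes = tool_dependencies * 7                       # roughly 2 hours

    print('%d repositories, %d tool dependencies, ~%d minutes of installation'
          % (len(dependency_chain), tool_dependencies, minutes))
    # All spent before the one test in repo A fails for a reason we already
    # knew: the test data file is not in the repository.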


> - if they are missing an
> input or output file this is just a special case of a test failure (and can
> be spotted without actually attempting to run the tool).

Yes, but this is what we are doing now.  We are spotting this scenario without 
installing the repository or running the tool to execute any defined tests.

> This is clearly
> a broken test and the tool author should be able to fix this easily (by
> uploading the missing test data file)

Yes, but the author can already see this clearly without having to install the 
repository or run any tests.

> 
>> I don't see the benefit of the above where you place tools missing tests
>> into a different category than tools with defined tests, but missing test
>> data.  If any of the test components (test definition or required input or
>> output files) are missing, then the test cannot be executed, so defining it
>> as a failing test in either case is a bit misleading.  It is actually a tool
>> that is missing test components that are required for execution which will
>> result in a pass / fail status.
> 
> It is still a failing test (just for the trivial reason of missing a
> test data file).
> 
>> It would be much simpler to change the filter for failing tests to include
>> those that are missing test components so that the list of missing test
>> components is a subset of the list of failing tests.
> 
> What I would like is three lists:
> 
> Latest revision: missing tool tests
> - repositories where at least 1 tool has no test defined
> 
> [The medium term TO-DO list for the Tool Author]
> 
> Latest revision: failing tool tests
> - repositories where at least 1 tool has a failing test (where I include
>  tests missing their input or output test data files)
> 
> [The priority TO-DO list for the Tool Author]
> 
> Latest revision: all tool tests pass
> - repositories where every tool has tests and they all pass
> 
> [The good list, Tool Authors should aim to have everything here]
> 
> Right now http://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus
> would appear under both "missing tool tests" and "failing tool tests",
> but I hope to fix this and have this under "missing tool tests" only
> (until my current roadblocks with the Galaxy Test Framework are
> resolved).
> 
> I hope I've managed a clearer explanation this time,
> 
> Thanks,
> 
> Peter

