Dear Kornel and Scott (and everyone else interested in export tests),

On 2015-11-22, Kornel Benko wrote:


> We apparently don't understand each other. 

Indeed, there is a misunderstanding, but I believe I am a bit wiser now.
Please correct me where I am still wrong.


> For me, ignored test is ignored, that is, the test is not even created.

> The ctest machinery (not our usage of it) runs _all_ created tests with
> 'ctest' (without any parameter).

For me, a "test case" or "potential test" is a combination of

 - document
 - output format
 - scripted changes (like systemF: "set the used font to non-TeX-fonts")

The general rule: 

  Create a "test instance" for
  * every document matching lib/(doc|templates|examples)/*.lyx
  * with every output format (dvi.?|ps|pdf.?|html)
  * with system fonts and TeX fonts (for dvi3|pdf4|pdf5)
 
expands to a list of "potential tests" for each of which we either

 a) expect successful export,
 b) expect an export error, or
 c) don't care (because whether there is an export error does not
    correlate with a (new) problem in LyX)

Case a) is the default. 
Exceptions are defined via regular expressions in

 b) development/autotests/revertedTests
 c) development/autotests/ignoredTests
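
To make this concrete, here is a minimal sketch in Python (not the actual
CMake implementation; the one-regex-per-line format with '#' comments is
an assumption) of how these two files could classify a potential test:

  import re
  from pathlib import Path

  def load_patterns(path):
      """Read one regular expression per line; '#' starts a comment.
      (Assumed file format for revertedTests/ignoredTests.)"""
      patterns = []
      for line in Path(path).read_text().splitlines():
          line = line.strip()
          if line and not line.startswith("#"):
              patterns.append(re.compile(line))
      return patterns

  def classify(name, reverted, ignored):
      """Map a potential test to its category:
      None      -> c) ignored: no test instance is created at all
      'fail'    -> b) reverted: an export error is expected
      'success' -> a) the default: export must succeed"""
      if any(p.search(name) for p in ignored):
          return None
      if any(p.search(name) for p in reverted):
          return "fail"
      return "success"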


The usual test run after a commit should detect regressions and therefore
run all "potential tests" in a) and b) but not c). 
This means it must not "create a test" for the to-be-ignored "potential
tests" in c).



Up to here, I hope we can agree.


Now to the differences. I hope we can get them sorted out -- maybe these are
also just misunderstandings?


On 2015-11-20, Kornel Benko wrote:
> On 20 November 2015 at 11:20:54, Guenter Milde <mi...@users.sf.net> wrote:

...
>> * ignored           # we don't care for the result and hence don't run
>>                     # these test cases
>    Yes

>>   - wrong output    # the output is wrong although export returns success.
>>                     # not LyX's fault, but e.g. incompatible packages.
>    OK

>>   - nofix           # "historic" packages with bugs that prevent working
>>                     # with some export routes.
>    OK

>>   - nonstandard     # requires packages or similar that are not on CTAN
>    Do not ignore them. They _are_ compilable in the end.

>>   - suspended       # - non-LyX bugs that may be resolved (works depending on
>>                     #   TeXLive version).
>                               like XeTeX ... haha

>>                    # - problems that we currently cannot solve but want to.

>    Yes, but not ignore.


Here, we do have two categories where the suggestion to ignore them was met
by the objection:

> To make it clear: Everything ignored cannot be tested. 
> If we want to see, if anything changed (like XeTeX), we should be able
> to retest.

My response that I consider this limitation a fundamental flaw in the test
machinery because

>> We need a category and rule-set for tests where:

>> * we don't care for the result because it does not tell us anything about
>>   the "healthiness" of LyX and hence don't run them normally, but

>> * we may want to run them on special request (because we know the phase of
>>   the moon or have installed a special package or want to check the status
>>   of upstream packages or fixed a nofix bug).

was answered with

> In this case we can just compare the before results with the results
> after clearing the ignoredTests file.

Well, then it should be possible to 

* have regular expressions for "potential tests" that meet the criteria
  for "suspended" or "nonstandard" in development/autotests/ignoredTests
  and 
* clear or override them when there is a special interest in the (normally
  irrelevant) export failure or success.


Maybe a command line option (--force, say) could bypass the check against
development/autotests/ignoredTests.
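
In the sketch above, such an override would be small (the flag name
--force is just the suggestion from this mail; the argparse wiring is
hypothetical):

  import argparse

  parser = argparse.ArgumentParser()
  parser.add_argument("--force", action="store_true",
                      help="also create tests matched by ignoredTests")
  args = parser.parse_args()

  # With --force, the ignoredTests patterns are simply not applied, so
  # the "suspended"/"nonstandard" cases become ordinary a)/b) tests again.
  ignored = ([] if args.force
             else load_patterns("development/autotests/ignoredTests"))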



My vision for the output of a "normal" test run would be

* report all tests that fail our expectations, i.e.

  - in category a) if there is an export error 
  - in category b) if there is no export error
  
  ideally with the error message for a).
  
  These failing tests would be regressions calling for action
  (solve the problem or suspend the test case).
  

* optionally, report "suspended" tests

  This would be a summary of postponed TODO items.
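
As a sketch of such a report, given the expectations computed above and
the per-test results (the result format and the labels are invented):

  def report(expectations, results, show_suspended=False, suspended=()):
      """expectations: name -> 'success' | 'fail' (categories a/b above);
      results: name -> (ok, message) from the actual export runs."""
      for name, expected in sorted(expectations.items()):
          ok, message = results[name]
          if expected == "success" and not ok:
              print(f"REGRESSION {name}: export failed: {message}")      # a)
          elif expected == "fail" and ok:
              print(f"REGRESSION {name}: export succeeded unexpectedly") # b)
      if show_suspended:  # summary of postponed TODO items
          for name in sorted(suspended):
              print(f"SUSPENDED  {name}")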
  


Günter
