On Thursday, 26 November 2015 at 11:23:46, Guenter Milde <mi...@users.sf.net> wrote:

> On 2015-11-23, Kornel Benko wrote:
> > On 23 November 2015 at 16:38:20, Guenter Milde <mi...@users.sf.net> wrote:
> > ...
> > > Maybe you could propose different names?
>
> The following proposal for an export test case categorisation tries to
> avoid the controversial terms "inverted/reverted", "suspended", and
> "ignored".
>
> Instead, the basic distinction is between "good" tests and "known
> problems".
>
> We have to distinguish two modes of working:
>
> a) testing for regressions
>
>    The results of all tests with "known problems" are irrelevant when
>    testing for regressions.
>
> b) maintaining the test suite
>
>    Update the list of known problems. Here, we need to know which test
>    cases with known problems fail or pass.
>
> While the concept of "known problems" roughly matches "inverted", there
> are some differences:
>
> * tests with "known problems" usually fail, but may also pass.
>
> * a line
>
>       KNOWN_PROBLEM.<subtag>.export/...
>
>   is easier to understand than
>
>       INVERTED-SEE-README.export/...
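Just to make the mechanics concrete, a line in that format could be parsed
like this. This is a rough sketch in Python; the regular expression, the
helper, and the test names are my assumptions, not the current test
scripts:

    import re

    # A sketch only: parse one line of a hypothetical pattern file using
    # the proposed "KNOWN_PROBLEM.<subtag>." prefix; the subtag is
    # optional.
    LABEL_RE = re.compile(
        r'^(?P<category>KNOWN_PROBLEM)'   # top-level category
        r'(?:\.(?P<subtag>[A-Za-z_]+))?'  # optional subtag, e.g. "wontfix"
        r'\.(?P<test>export/.+)$'         # the test name proper
    )

    def parse_label(line):
        """Return (category, subtag, test) for a pattern line, or None."""
        m = LABEL_RE.match(line.strip())
        if m is None:
            return None
        return m.group('category'), m.group('subtag'), m.group('test')

    print(parse_label('KNOWN_PROBLEM.wontfix.export/doc/Math_pdf2'))
    # ('KNOWN_PROBLEM', 'wontfix', 'export/doc/Math_pdf2')
    print(parse_label('KNOWN_PROBLEM.export/doc/Intro_pdf2'))
    # ('KNOWN_PROBLEM', None, 'export/doc/Intro_pdf2')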
Hm, yes. But the first entries are not in any subcategory (no subtag).

> * There is no top-level category "ignored".

Sure, and there never will be. Please accept 'ignored' as a synonym for
'discarded', 'non-existent', or 'senseless' in the export test context. If
a test is designed for HTML, the possibility of creating a PDF test is
ignored. There is no sense in testing such a beast.

>   Whether test instances are created or not is a feature of the
>   subcategories, based on practicability.
>
>   "known problems.wontfix" is a rough equivalent, but we could exempt
>   other subcategories, too.

We could add 'wontfix' to inverted, yes.

> * There is no need for a top-level category "unreliable".

I added it to please you ... :(

>   As we allow test cases with known problems to pass, "nonstandard" and
>   "erratic" can be made sub-categories of "known problem".

So essentially you wish to rename the label 'unreliable' to 'knownproblem'.

>   (If it eases maintenance, they could, however, also be sorted under a
>   top-level "unreliable".)

Now I am totally confused.

> Export Test Categorisation
> --------------------------
>
> Export tests are generated by taking sample documents, possibly modifying
> them

In practice they are always modified. For instance, each reference to a
local image has to be changed, the image itself has to be copied to some
temporary directory, included LyX files (or TeX files) have to be copied
and modified, and so on.

> and calling LyX to export them to a given output format. This results in
> $N_{documents} \times N_{modifications} \times N_{output\ formats}$
> possible combinations (test cases), which can be sorted into two main
> categories:
>
> * good            # we expect the export to succeed
>
> * known problems  # export may fail for a known reason
>
> When testing for regressions, test cases with "known problems" can be
> ignored. Creating/running tests with "known problems" is not required.
> We don't need to know whether they fail or not.

We do need to know. The problem may have gone away; from then on those
tests will be 'good'.

> If all "good" tests pass, we have reached a clean state, while "good"
> test cases that fail require action.
>
> OTOH, to find out if any "known problem" is solved, we need to run the
> respective test(s).

That's what I was trying to say.

> This means we have to compromise between resource efficiency (not running
> tests when we are not interested in the result) and ease of use (making
> it easy to re-check (some of) the tests with "known problems").

OK.

> To get a feel for the severity of a known problem, it makes sense to sort
> known problems into sub-categories, e.g.
>
> * TODO          # problems we want to solve but currently cannot
>
> * minor         # problems that may eventually be solved
>
> * wontfix       # LyX problems with cases so special we decided to leave
>                 # them, or LaTeX problems that
>                 # - can't be solved due to systematic limitations, or
>                 # - are bugs in "historic" packages no one works on
>
> * wrong output  # the output is corrupt; LyX should raise an error, but
>                 # export returns success
>
> * LaTeX bug     # problems due to LaTeX packages or other "external"
>                 # reasons (someone else's problem) that may eventually be
>                 # solved (in this case, the case goes to "unreliable"
>                 # until everyone has the version with the fix)
>
> * nonstandard   # requires packages or other resources that are not on
>                 # CTAN (some developers may have them installed)
>
> * erratic       # depends on local configuration, OS, TeX distribution,
>                 # package versions, or the phase of the moon

Feels good, but who shall categorize?
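Whoever ends up doing it, the bookkeeping itself is simple. A rough sketch,
with the category and subtag names taken from the list above; the function
and its messages are illustrative only, not the real test machinery:

    # A sketch only: the proposed taxonomy as data, plus how a runner
    # could interpret one result in the two working modes.
    KNOWN_PROBLEM_SUBTAGS = {
        'TODO', 'minor', 'wontfix', 'wrong_output',
        'latex_bug', 'nonstandard', 'erratic',
    }

    def interpret(category, passed):
        """Classify one result ('good' vs. any known-problem subtag)."""
        if category == 'good':
            # mode (a), regression testing: a failing "good" test needs
            # action
            return 'clean' if passed else 'REGRESSION: needs action'
        assert category in KNOWN_PROBLEM_SUBTAGS
        # mode (b), suite maintenance: a passing known problem may have
        # gone away and is a candidate for re-labelling as "good"
        return 'candidate for "good"' if passed else 'expected failure'

    print(interpret('good', False))     # REGRESSION: needs action
    print(interpret('wontfix', True))   # candidate for "good"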
> The following label is independent and can be given in addition to the
> above categories.
>
> * fragile  # prone to break down

Needs implementing, but doable. ATM, only single sublabels are considered.

> We could use "generic" regular expressions to give hints about documents
> or export routes where simple changes may lead to failure.
>
> This can be
>
> - problematic documents that use heavy ERT or preamble code or
>   many/unstable packages (e.g. Math.lyx, Additional.lyx).
>
> - poorly supported and seldom used export formats
>   (e.g. XeTeX + TeX fonts)
>
> If a "fragile" test case that formerly was OK fails, chances are high
> that this is not a regression but due to an existing problem.

The problem is the huge number of tests which do not fail. They are not
categorized ATM; some work is needed. (A sketch of such generic patterns
follows at the end of this mail.)

> If we want to make sure that no "good fail" is transformed into a "wrong
> output", we would need a category "assert fail" and to report exports
> that do not fail:
>
> * assert fail  # we know the export does not work for a permanent reason
>                # and want to test whether LyX correctly fails
>                # (e.g. pdflatex with a package requiring LuaTeX)

That is for later, to be used in autotests/export. All other LyX files
(except attic) are distributed; normally we expect them to be in good
shape.

> Günter
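As for the "generic" regular expressions above, here is a rough sketch of
how a handful of patterns could mark fragile export routes. The patterns
and test names are made-up examples, not the suite's real entries:

    import re

    # A sketch only: a few "generic" patterns that could mark fragile
    # documents or export routes.
    FRAGILE_PATTERNS = [
        re.compile(r'export/doc/(Math|Additional)_'),  # heavy ERT/preamble
        re.compile(r'_texF$'),                         # e.g. XeTeX + TeX fonts
    ]

    def is_fragile(test_name):
        """True if any generic pattern marks this test as fragile."""
        return any(p.search(test_name) for p in FRAGILE_PATTERNS)

    # A formerly-OK fragile test that fails is probably an existing
    # problem surfacing, not a fresh regression.
    print(is_fragile('export/doc/Math_pdf4_texF'))  # True
    print(is_fragile('export/doc/Intro_pdf2'))      # False

Kornel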