On Sunday, 1 November 2015 at 22:41:39, Scott Kostyshak 
<skost...@lyx.org> wrote:
> Thanks to all of those participating in the discussions about tests. I have
> learned a lot the last couple of weeks. Thank you also to those who have tried
> to run the tests. This to me is a great step forward. I know that the export
> tests are sloppy cheap tests, but I appreciate that some of you agree that 
> they
> do have use, at least until we have unit testing. The question now is, how can
> we get the most use out of them and how can we maximize their signal to noise
> ratio?
> 
> Ideally, I would like for commits that break tests to be on a separate git
> branch. Once the bugs exposed by a commit are fixed or the tests are fixed, or
> the .lyx files are fixed, then the branch could be merged to master. This
> allows us to preserve a "0 failing tests" situation on the master branch so it
> is extremely easy to catch regressions. So my preferred policy would be: if a
> commit is found to have broken a test, either the situation is resolved within
> a day (i.e. the bug is fixed or the test is fixed) or the commit is reverted
> (and perhaps pushed to a separate remote branch). The only way I think there
> will be support for this policy is if the tests are not fragile. If tests
> commonly break even though a commit does not introduce a real bug, then this
> policy would just create a lot of pain without much use. I think that
> non-fragile export tests are a real possibility, but some changes might be
> needed to achieve this.

+1
But as you said, 'ideally'.

> Back from my dream world, the problem with the current situation is that so
> many tests are failing (200 or so) that it makes it difficult to see if there
> are new regressions. Especially as we move forward in the release process,
> catching new regressions is extremely important. I thus propose some temporary
> measures to get a clean test output. I think that the tests that are failing
> because of known bugs should be inverted. For example, the tests that are
> failing because of the change from babel to polyglossia could be inverted and
> marked with the reason why they are inverted, and referred to in a bug report.
> Georg (who did not introduce the regressions but is kindly working on them) is
> aware of the failing tests, and is also working on better tests that are
> specific to testing language nesting, so in some sense the tests have already
> done their job and we should move on for now.

+1
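
Roughly, I imagine the inversion on the CMake side like this (only a sketch; 
the test name, the COMMAND line and the variables ${LYX_EXECUTABLE} and 
${LYX_DOC_DIR} are placeholders, the real macros in our test scripts differ):

        add_test(NAME "export/doc/de/UserGuide_pdf4"
                 COMMAND ${LYX_EXECUTABLE} -e pdf4 "${LYX_DOC_DIR}/de/UserGuide.lyx")
        # Inverted: the test counts as passing while the known bug is open.
        # The reason and the bug report number should be noted here.
        set_tests_properties("export/doc/de/UserGuide_pdf4"
                             PROPERTIES WILL_FAIL TRUE)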

> A source of fragile tests is the XeTeX tests with TeX fonts. Günter has 
> pointed
> out that exporting using XeTeX and TeX fonts could go from succeeding to
> failing because of an added package, addition of one non-ASCII character, 
> babel
> language files, font problems, and other issues. Further, we should not expect
> that many users are using XeTeX with TeX fonts because it is not common (there
> is a separate discussion we are having on discouraging such a combination).
> Thus, a good question is should we even have tests for this combination?

IMHO yes, even if they may be 'suspended'. XeTeX may still evolve, and then the 
tests are ready to wake up.

> It seems to me that the underlying causes of the XeTeX + TeX fonts tests will
> not be fixed anytime soon (and that this is OK) so we have two options if we
> want to clean up the test output. We can either invert the current test
> failures or ignore all XeTeX TeX font tests. Ignoring the tests means that 
> they
> will not be run. Whether we invert or ignore basically depends on how low we
> think the signal to noise ratio is. I am open to both possibilities, depending
> on what others prefer. I do have a preference for inverting the current tests
> (and not ignoring).

I offered a solution already: tests to be inverted go into 'revertedTests', and 
through the file 'suspendedTests' we can select which of them should be 
suspended.

That way we have
        ctest -L export # to run all but suspended tests
        ctest -L suspended # runs only suspended (which are also inverted)
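
Roughly, on the CMake side this could look like the following (just a sketch; 
the format of 'suspendedTests' and the variable _export_tests, a hypothetical 
list of export test names, are assumptions, the real test setup differs):

        # assumption: 'suspendedTests' lists one test name per line
        file(STRINGS "${CMAKE_CURRENT_SOURCE_DIR}/suspendedTests" _suspended)
        foreach(_tst ${_export_tests})
          list(FIND _suspended "${_tst}" _idx)
          if(_idx GREATER -1)
            # suspended tests get their own label and are inverted, so
            # "ctest -L export" skips them and "ctest -L suspended" runs only them
            set_tests_properties(${_tst} PROPERTIES LABELS "suspended" WILL_FAIL TRUE)
          else()
            set_tests_properties(${_tst} PROPERTIES LABELS "export")
          endif()
        endforeach()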

> I'm not convinced that the signal to noise ratio is *so*
> low. If there is a significant chance that these tests will catch bugs that 
> our
> other tests would not, they should be kept. I thus propose to invert the
> current failures (and specify the reason each is failing) and then to treat
> future XeTeX TeX font failures as failures of fragile tests. In some sense,
> fragile tests can be OK if we know they are fragile and treat them as such. We
> could amend our policy to be such that if XeTeX + TeX font tests fail, the
> author of the commit that broke them can simply invert them without having the
> responsibility of fixing them. The advantage of still keeping them (and not
> "ignoring" them) is that the author of the commit might realize "my commit
> should not have broken these tests" and thus find a bug in the commit. For
> example, many commits do not even intend to change LaTeX. You might think that
> other tests would catch most bugs, but sometimes only a couple of tests fail
> because of an obscure bug. For example, a couple of weeks ago a commit caused
> export of our Japanese knitr and Sweave manuals to hang (and only these
> manuals). I don't even know whether the output of those manuals is correct, but
> the tests still helped prevent a regression; if those tests had been ignored,
> the regression would not have been caught.
> 
> So in summary, regarding the XeTeX + TeX fonts, I propose the above policy to
> start with; and if we find that there really is such a low signal to noise
> ratio then we can change our minds and ignore them. But we will never know 
> what
> the signal to noise ratio is if we just ignore them now.
> 
> Any thoughts?
> 
> Scott

        Kornel
