As yet another proof that IT people always solve things in similar ways, see this interesting blog post by one of our "competitors":
http://blog.idrsolutions.com/2013/06/save-time-test/

Tilman

On 04.07.2014 23:05, Petr Slabý wrote:
Hi,
following is a description of what we are doing in our company.

With our software, we run regression tests after each nightly build and sometimes it is a tough fight. If there is a regression, it is not easy to find which commit caused it, because there are potentially many commits between two nightly builds. Then, deciding whether the change is wanted and expected is in some cases also difficult (this part might be easier with PDF, where there is the "golden standard" rendering in Acrobat). If the change is expected and the new rendering is "better", then one has to commit the new reference. This means that the files produced on the nightly build machine must be available somehow - it is almost impossible to produce them locally, as the rendering results differ slightly between versions of Java and for many other reasons. All this has to be done before the next regression test is run, so that new regressions are not hidden by earlier ones. Our complete build with all tests runs several hours...

To improve this workflow, we now use the following scheme in addition:
- there is a smaller set of regression tests which runs relatively fast
- these tests are triggered by each commit in formatting and rendering related projects - before running the test itself, the modified project(s) are compiled locally, without publishing the result to Maven
- the reference rendering files are stored in SVN
- if a test finds a regression, it immediately stores the new result as a new reference in SVN. This makes sure that a) the test renderings do not get lost and b) each regression points exactly to the commit that caused it - the one that triggered the test. The failed test creates a new issue in JIRA with a pointer to the before and after renderings in SVN and a bitmap of the differences. The issue is then processed. If we find the change to be expected, the issue is simply closed; otherwise we take action to fix the problem. The only annoying thing about this scheme is that, after committing the correction, the test runs again and reports a regression because it now compares against the faulty version of the rendering.

Best regards,
Petr.

-----Original Message----- From: John Hewson
Sent: Friday, July 04, 2014 7:39 PM
To: dev@pdfbox.apache.org
Subject: Re: Regression Testing

Hi Tilman

Thanks for your thoughts. I think that your concerns are already covered by my original proposal; I’ll try to explain why and how:

Of course I agree with the need for regression tests, however it isn't easy: besides the problems of the different JDKs (I use JDK7 Windows 64 bit), there is the problem that some enhancements create slight changes in rendering that are not errors, i.e. both the "before" and the "after" files look OK by themselves. This has happened when we changed the text rendering recently, and has happened again when the clipping was improved. The cause is probably slight changes in color or in boundaries.

If a rendering has changed then the regression test should fail. When a failure occurs the developer needs to manually inspect the differences (we could generate a visual diff which highlights what changed to make this easier) and, if they are OK, replace the known-good PNGs with the ones just rendered. Indeed this will be the basic workflow for working with regression tests.
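To make the visual diff idea concrete, here is a minimal sketch of a pixel-by-pixel comparison. It is only an illustration, not existing PDFBox code: the class name RenderingDiff and its methods are hypothetical, it uses exact pixel equality (no tolerance), and it assumes both renderings have already been loaded as BufferedImage objects, e.g. with ImageIO.read.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox.
public final class RenderingDiff {
    private static final int CHANGED = 0xFFFF0000;   // opaque red marks a changed pixel
    private static final int UNCHANGED = 0xFFFFFFFF; // opaque white marks an identical pixel

    private RenderingDiff() {
    }

    // A pixel differs if it lies outside either image or the ARGB values are not equal.
    private static boolean differs(BufferedImage expected, BufferedImage actual, int x, int y) {
        boolean inBoth = x < expected.getWidth() && y < expected.getHeight()
                && x < actual.getWidth() && y < actual.getHeight();
        return !inBoth || expected.getRGB(x, y) != actual.getRGB(x, y);
    }

    // Number of differing pixels; 0 means the rendering is unchanged.
    public static int countDifferences(BufferedImage expected, BufferedImage actual) {
        int width = Math.max(expected.getWidth(), actual.getWidth());
        int height = Math.max(expected.getHeight(), actual.getHeight());
        int count = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (differs(expected, actual, x, y)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Image covering both inputs, with changed pixels in red on a white background.
    public static BufferedImage visualDiff(BufferedImage expected, BufferedImage actual) {
        int width = Math.max(expected.getWidth(), actual.getWidth());
        int height = Math.max(expected.getHeight(), actual.getHeight());
        BufferedImage diff = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                diff.setRGB(x, y, differs(expected, actual, x, y) ? CHANGED : UNCHANGED);
            }
        }
        return diff;
    }
}
```

A test would fail whenever countDifferences returns a non-zero value and write the result of visualDiff next to the two PNGs for manual inspection. Exact equality is the strictest possible choice; whether a tolerance is wanted for the JDK-dependent differences mentioned above is a separate decision.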

Copyright is a problem: I'm testing mostly with JIRA attachments that I've downloaded over the years. While uploading such files to JIRA might count as fair use, I doubt that this would still be true if they are included in a distribution. Instead, they should be stored somewhere on Apache servers where only committers and build software ("Travis", "Jenkins", ...) can access them. The public PDFs that Maruan mentions possibly don't have all the problem cases that we solved before. However I have started working with these files and there are at least 5 recent issues that deal with them.

The PDFs won’t be in a distribution. They will just happen to be stored in an SVN repo but not our source code repo, in the same way that the website is stored in the “cmssite” branch of SVN, or indeed in the same way that they are already on JIRA. The law doesn’t distinguish between JIRA and SVN, both are publicly available via HTTP, so using SVN will simply be a continuation of what we’re already doing with JIRA.

The crucial factor is that we’re only storing publicly available PDFs, because we have the right to do so, just like Google’s cache, and like we currently do with JIRA.

Additionally, the PDFs need to be version controlled otherwise we won’t be able to reliably recreate previous builds, so storing the files on a web server won’t be practical. Also committers will frequently be updating the renderings as bugs are fixed and we’ll need to version-control the rendered PNG files for the same reason. Finally, having committers-only files doesn’t fit well with the Apache goal of open development and would be unnecessary anyway given that all the PDFs are to be taken from public sources only.

In summary, I’m proposing that we just keep doing what we’re currently doing with JIRA but we move it into its own SVN repo along with some pre-rendered PNGs.

Re preflight: the default mode should be to have the Isartor tests on. Individuals could still disable them locally, but the central build software should always use them.

Yes - does anybody know why this isn’t the default?

-- John
