FYI I’ve implemented it locally for all modules of xwiki-commons and did some 
build time measurements:

* With pitest/descartes: 37:16 minutes
* Without pitest/descartes: 5:10 minutes

So that’s a pretty significant hit: the build takes roughly 7 times longer.

So I think one strategy could be to not run pitest/descartes by default in the 
quality profile (i.e. have it off by default with 
<xwiki.pitest.skip>true</xwiki.pitest.skip>) and run it on the CI, from time to 
time, like once per day for example, or once per week.

Small issue: I need to find/test a way to run a crontab type of job in a 
Jenkins pipeline script. I know how to do it in theory but I need to test it 
and verify that it works. I still have some doubts ATM...

WDYT?

Thanks
-Vincent

> On 15 Mar 2018, at 09:30, Vincent Massol <[email protected]> wrote:
> 
> Hi devs,
> 
> As part of the STAMP research project, we’ve developed a new tool (Descartes, 
> based on Pitest) to measure the quality of tests. It generates a mutation 
> score for your tests, defining how good the tests are. Technically, Descartes 
> performs some extreme mutations on the code under test (e.g. remove content 
> of void methods, return true for methods returning a boolean, etc. See 
> https://github.com/STAMP-project/pitest-descartes). If the test continues to 
> pass then it means it’s not killing the mutant and thus its mutation score 
> decreases.
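To make the idea of an extreme mutation concrete, here is a minimal sketch in
Java. Everything in it is hypothetical (Registry, the two mutant classes and
the two "tests" are made up for illustration and are not xwiki-commons code),
and the mutants are written by hand, whereas Descartes generates them
automatically.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class under test (illustration only).
class Registry {
    private final List<String> names = new ArrayList<>();

    // Void method: an extreme mutation empties this body.
    void register(String name) {
        names.add(name);
    }

    // Boolean method: an extreme mutation makes it always return true.
    boolean contains(String name) {
        return names.contains(name);
    }
}

// Hand-written stand-in for the "remove content of void methods" mutant.
class EmptyRegisterMutant extends Registry {
    @Override
    void register(String name) {
        // body removed by the mutation
    }
}

// Hand-written stand-in for the "return true" mutant.
class AlwaysTrueMutant extends Registry {
    @Override
    boolean contains(String name) {
        return true;
    }
}

class MutationDemo {
    // Weak test: executes both methods (so they count as covered)
    // but verifies nothing about the results.
    static boolean weakTest(Registry registry) {
        registry.register("a");
        registry.contains("a");
        return true;
    }

    // Stronger test: checks a positive and a negative case.
    static boolean strongTest(Registry registry) {
        registry.register("a");
        return registry.contains("a") && !registry.contains("b");
    }

    public static void main(String[] args) {
        Registry[] subjects = {
            new Registry(), new EmptyRegisterMutant(), new AlwaysTrueMutant()
        };
        for (Registry subject : subjects) {
            String name = subject.getClass().getSimpleName();
            System.out.println(name + ": weak test passes = " + weakTest(subject));
        }
        // Fresh instances, since the tests above modified the old ones.
        System.out.println("strong/original: " + strongTest(new Registry()));
        System.out.println("strong/empty:    " + strongTest(new EmptyRegisterMutant()));
        System.out.println("strong/true:     " + strongTest(new AlwaysTrueMutant()));
    }
}
```

The weak test still passes against both mutants, so they survive even though
the code is fully covered. The stronger test fails against both, i.e. it kills
them.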
> 
> So in short:
> * Jacoco/Clover: measure how much of the code is tested
> * Pitest/Descartes: measure how good the tests are
> 
> Both provide a percentage value.
> 
> I’m proposing to compute the current mutation scores for xwiki-commons and 
> xwiki-rendering and fail the build when new code is added that reduces the 
> mutation score below the threshold (exactly the same as our jacoco threshold 
> and strategy).
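As a rough sketch of that check, assuming the mutation score is the percentage
of generated mutants that the tests kill: the MutationGate class and its
method names below are hypothetical, and in practice the build plugin would
perform this comparison rather than hand-written code.

```java
// Hypothetical helper illustrating a threshold check on the mutation score.
class MutationGate {
    // Mutation score as a percentage: killed mutants over generated mutants.
    static double score(int killed, int total) {
        if (total <= 0) {
            // Assumed convention: no mutants generated means nothing to fail on.
            return 100.0;
        }
        return 100.0 * killed / total;
    }

    // The build passes only if the score has not dropped below the threshold.
    static boolean passes(int killed, int total, double thresholdPercent) {
        return score(killed, total) >= thresholdPercent;
    }

    public static void main(String[] args) {
        double threshold = 80.0; // made-up value for illustration
        System.out.println("8 of 10 killed passes: " + passes(8, 10, threshold));
        System.out.println("7 of 10 killed passes: " + passes(7, 10, threshold));
    }
}
```

With a threshold of 80, a module where 8 of 10 mutants are killed passes, and
new code that brings it down to 7 of 10 fails the build.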
> 
> I consider this an experiment to push the limits of software engineering a 
> bit further. I don’t know how well it’ll work. I propose to do the work and 
> test this for 2-3 months. At that time we can then decide whether to keep it 
> (i.e. whether the gains it brings are more important than the problems it 
> causes).
> 
> Here’s my +1 to try this out.
> 
> Some links:
> * pitest: http://pitest.org/
> * descartes: https://github.com/STAMP-project/pitest-descartes
> * http://massol.myxwiki.org/xwiki/bin/view/Blog/ControllingTestQuality
> * http://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
> 
> If you’re curious, you can see a screenshot of a mutation score report at 
> http://massol.myxwiki.org/xwiki/bin/download/Blog/MutationTestingDescartes/report.png
> 
> Please cast your votes.
> 
> Thanks
> -Vincent
