FYI I’ve implemented it locally for all modules of xwiki-commons and did some build time measurements:
* With pitest/descartes: 37:16 minutes
* Without pitest/descartes: 5:10 minutes

So that’s a pretty significant hit.

So I think one strategy could be to not run pitest/descartes by default in the quality profile (i.e. have it off by default with <xwiki.pitest.skip>true</xwiki.pitest.skip>) and run it on the CI from time to time, for example once per day or once per week.

Small issue: I need to find/test a way to run a crontab type of job in a Jenkins pipeline script. I know how to do it in theory but I need to test it and verify it works. I still have some doubts ATM...

WDYT?

Thanks
-Vincent

> On 15 Mar 2018, at 09:30, Vincent Massol <[email protected]> wrote:
>
> Hi devs,
>
> As part of the STAMP research project, we’ve developed a new tool (Descartes,
> based on Pitest) to measure the quality of tests. It generates a mutation
> score for your tests, defining how good the tests are. Technically, Descartes
> performs some extreme mutations on the code under test (e.g. remove content
> of void methods, return true for methods returning a boolean, etc - See
> https://github.com/STAMP-project/pitest-descartes). If the test continues to
> pass then it means it’s not killing the mutant and thus its mutation score
> decreases.
>
> So in short:
> * Jacoco/Clover: measure how much of the code is tested
> * Pitest/Descartes: measure how good the tests are
>
> Both provide a percentage value.
>
> I’m proposing to compute the current mutation scores for xwiki-commons and
> xwiki-rendering and fail the build when new code is added that reduces the
> mutation score below the threshold (exactly the same as our jacoco threshold
> and strategy).
>
> I consider this an experiment to push the limit of software engineering a
> bit further. I don’t know how well it’ll work. I propose to do the work and
> test this for 2-3 months. At that time we can then decide whether it works
> or not (i.e. whether the gains it brings are more important than the
> problems it causes).
>
> Here’s my +1 to try this out.
>
> Some links:
> * pitest: http://pitest.org/
> * descartes: https://github.com/STAMP-project/pitest-descartes
> * http://massol.myxwiki.org/xwiki/bin/view/Blog/ControllingTestQuality
> * http://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
>
> If you’re curious, you can see a screenshot of a mutation score report at
> http://massol.myxwiki.org/xwiki/bin/download/Blog/MutationTestingDescartes/report.png
>
> Please cast your votes.
>
> Thanks
> -Vincent
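
For the periodic CI run discussed in the reply at the top of this message, a scripted Jenkins pipeline with a crontab-style trigger could look roughly like the sketch below. It is untested and not the actual XWiki build setup: the schedule, the stage name, the `-Pquality` profile name and the assumption that passing `-Dxwiki.pitest.skip=false` on the command line turns pitest/descartes back on are all illustrative guesses.

```groovy
// Sketch only: a scripted Jenkinsfile that registers a daily timer trigger.
properties([
  pipelineTriggers([
    // 'H H * * *' = once per day, at an hour and minute that Jenkins derives
    // from a hash of the job name (spreads load across jobs).
    cron('H H * * *')
  ])
])

node {
  stage('Mutation testing') {
    checkout scm
    // Assumption: the quality profile keeps pitest/descartes off by default
    // (xwiki.pitest.skip=true) and this override re-enables it for this run.
    sh 'mvn clean install -Pquality -Dxwiki.pitest.skip=false'
  }
}
```

Two caveats with this sketch: the trigger is only registered once the job has run at least once, and as written every build of the job runs the mutation tests. So it would have to live in a job separate from the regular commit build, or the script would have to check what triggered the build before enabling pitest/descartes.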

