> On 22 May 2018, at 14:10, Vincent Massol <vinc...@massol.net> wrote:
>
> Note: Now that this is done for xwiki-commons, I’d like to also do that for
> xwiki-rendering when I get the time.
It’s now done for xwiki-rendering too.
Thanks
-Vincent
>
> Thanks
> -Vincent
>
>> On 4 Apr 2018, at 16:14, Vincent Massol <vinc...@massol.net> wrote:
>>
>> FYI I’ve now committed this on issue
>> https://jira.xwiki.org/browse/XCOMMONS-1385 for xwiki-commons.
>>
>> And I’ve created an ad hoc job at
>> http://ci.xwiki.org/job/xwiki-commons_pitest/ which executes PIT/Descartes
>> (to be moved to our Jenkins pipeline later on).
>>
>> Let’s now make sure we fix the build when this job breaks and verify if the
>> strategy works or not!
>>
>> Thanks
>> -Vincent
>>
>>
>>
>>> On 15 Mar 2018, at 09:30, Vincent Massol <vinc...@massol.net> wrote:
>>>
>>> Hi devs,
>>>
>>> As part of the STAMP research project, we’ve developed a new tool
>>> (Descartes, based on Pitest) to measure the quality of tests. It generates
>>> a mutation score for your tests, indicating how good the tests are.
>>> Technically, Descartes performs some extreme mutations on the code under
>>> test (e.g. remove the content of void methods, return true for methods
>>> returning a boolean, etc. See
>>> https://github.com/STAMP-project/pitest-descartes). If the tests continue
>>> to pass, the mutant is not killed and the mutation score decreases.
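To illustrate the extreme mutation idea quoted above, here is a minimal hand-written sketch. It does not use PIT or Descartes at all: the class, the method names and the "tests" are hypothetical, and the mutant is written by hand, whereas the real tool generates it for you. It only shows why a weak test lets a "return true" mutant survive while a stronger test kills it.

```java
import java.util.function.IntPredicate;

public class ExtremeMutationDemo {
    // Original code under test (hypothetical example method).
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Hand-written stand-in for an extreme mutant: the whole body of a
    // boolean method is replaced by "return true".
    static boolean isAdultMutant(int age) {
        return true;
    }

    // Weak test: only checks a case where the expected answer is true,
    // so it cannot tell the original from the mutant.
    static boolean weakTestPasses(IntPredicate impl) {
        return impl.test(30);
    }

    // Stronger test: also checks a case where the expected answer is false.
    static boolean strongTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(10);
    }

    public static void main(String[] args) {
        // A mutant "survives" when the test still passes against it.
        System.out.println("weak test vs original:   "
            + weakTestPasses(ExtremeMutationDemo::isAdult));
        System.out.println("weak test vs mutant:     "
            + weakTestPasses(ExtremeMutationDemo::isAdultMutant));
        System.out.println("strong test vs original: "
            + strongTestPasses(ExtremeMutationDemo::isAdult));
        System.out.println("strong test vs mutant:   "
            + strongTestPasses(ExtremeMutationDemo::isAdultMutant));
    }
}
```

The weak test passes against both versions, so the mutant survives and would lower the mutation score. The strong test fails against the mutant, which is what "killing the mutant" means.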
>>>
>>> So in short:
>>> * Jacoco/Clover: measure how much of the code is tested
>>> * Pitest/Descartes: measure how good the tests are
>>>
>>> Both provide a percentage value.
>>>
>>> I’m proposing to compute the current mutation scores for xwiki-commons and
>>> xwiki-rendering and fail the build when new code is added that brings the
>>> mutation score below that threshold (exactly the same strategy as for our
>>> jacoco threshold).
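The threshold strategy quoted above can be sketched in a few lines. This is only an assumption-laden illustration: the class and method names are hypothetical, the numbers are made up, and the way the percentage is rounded here (integer division, rounding down) is my assumption, not necessarily what PIT does internally.

```java
public class MutationThresholdCheck {
    // Mutation score as a percentage: killed mutants over all generated mutants.
    // Rounding down via integer division is an assumption made for this sketch.
    static int mutationScore(int killedMutants, int totalMutants) {
        if (totalMutants <= 0 || killedMutants < 0 || killedMutants > totalMutants) {
            throw new IllegalArgumentException("invalid mutant counts");
        }
        return 100 * killedMutants / totalMutants;
    }

    // The build passes only if the score has not dropped below the recorded threshold.
    static boolean buildPasses(int score, int threshold) {
        return score >= threshold;
    }

    public static void main(String[] args) {
        int threshold = 71; // hypothetical score recorded for a module today

        // New code with well-tested behavior: 72 of 100 mutants killed.
        int before = mutationScore(72, 100);
        // New code with weak tests adds surviving mutants: 72 of 105 killed.
        int after = mutationScore(72, 105);

        System.out.println("score " + before + " passes: " + buildPasses(before, threshold));
        System.out.println("score " + after + " passes: " + buildPasses(after, threshold));
    }
}
```

The point of the design is the same as with the coverage threshold: the recorded value acts as a ratchet, so adding code whose tests do not kill its mutants fails the build until the tests are improved.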
>>>
>>> I consider this an experiment to push the limits of software engineering
>>> a bit further. I don’t know how well it’ll work. I propose to do the work
>>> and test this for 2-3 months. At that time we can then decide whether to
>>> keep it (i.e. whether the gains it brings are more important than the
>>> problems it causes).
>>>
>>> Here’s my +1 to try this out.
>>>
>>> Some links:
>>> * pitest: http://pitest.org/
>>> * descartes: https://github.com/STAMP-project/pitest-descartes
>>> * http://massol.myxwiki.org/xwiki/bin/view/Blog/ControllingTestQuality
>>> * http://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
>>>
>>> If you’re curious, you can see a screenshot of a mutation score report at
>>> http://massol.myxwiki.org/xwiki/bin/download/Blog/MutationTestingDescartes/report.png
>>>
>>> Please cast your votes.
>>>
>>> Thanks
>>> -Vincent
>>
>