On Thursday, 4 August 2016 at 10:24:39 UTC, Walter Bright wrote:
On 8/4/2016 1:13 AM, Atila Neves wrote:
On Thursday, 28 July 2016 at 23:14:42 UTC, Walter Bright wrote:
On 7/28/2016 3:15 AM, Johannes Pfau wrote:
And as a philosophical question: Is code coverage in unittests even a
meaningful measurement?

Yes. I've read all the arguments against code coverage testing. But in my usage of it for 30 years, it has been a dramatic and unqualified success in
improving the reliability of shipping code.

Have you read this?

http://www.linozemtseva.com/research/2014/icse/coverage/

I've seen the reddit discussion of it. I don't really understand from reading the paper how they arrived at their test suites, but I suspect that may have a lot to do with the poor correlations they produced.

I think I read the paper around a year ago, so my memory is fuzzy, but from what I remember they analysed existing test suites. What I do remember is coming away with the impression that the work was done well.

Unittests have uncovered lots of bugs for me, and code that was unittested had far, far fewer bugs showing up after release. <snip>

No argument there, as far as I'm concerned, unit tests = good thing (TM).

I think measuring unit test code coverage is a good idea, but only so the report can be inspected for lines that really should have been covered but weren't. What I take issue with is two things:

1. Code coverage metric targets (especially if the target is 100%). This leads to inane behaviours such as "testing" a print function (which itself was only used in testing) just to meet the target (see the sketch after this list). It's busywork that accomplishes nothing.

2. Using the code coverage numbers as a measure of unit test quality. This was always obviously wrong to me, I was glad that the research I linked to confirmed my opinion, and as far as I know (I'd be glad to be proven wrong), nobody else has published anything to convince me otherwise.
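As an illustration of point 1, here is a sketch of the kind of assert-free test that exists only to push the coverage number up (the helper name is made up for illustration):

// hypothetical debug helper, itself only ever called from tests
void printReport(int[] data)
{
    import std.stdio : writefln;
    writefln("report: %s", data);
}

unittest
{
    printReport([1, 2, 3]); // "covers" the function, asserts nothing
}

The line is covered, the target is met, and nothing has actually been verified.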

Code coverage, as a measure of test quality, is fundamentally broken. It measures coupling between the production code and the tests, which is never a good idea. Consider:

int div(int i, int j) { return i + j; }
unittest { div(3, 2); }

100% coverage, utterly wrong. Fine, no asserts is "cheating":

int div(int i, int j) { return i / j; }
unittest { assert(div(4, 2) == 2); }

100% coverage. No check for division by 0. Oops.
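For contrast, a sketch of a test that would exercise the missing case, assuming div is changed to signal a zero divisor with an exception (plain integer division by zero in D aborts the program rather than throwing, so the edge case has to be made observable first):

import std.exception : assertThrown;

int div(int i, int j)
{
    if (j == 0)
        throw new Exception("division by zero"); // make the edge case testable
    return i / j;
}

unittest
{
    assert(div(4, 2) == 2);
    assert(div(7, 2) == 3);            // truncating integer division
    assertThrown!Exception(div(1, 0)); // the case the covered-but-weak test missed
}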

This is obviously a silly example, but the main idea is: coverage doesn't measure the quality of the assertions (the sentinel values being checked). Bad tests serve only as sanity checks, and the only way I've seen so far to make sure the tests themselves are good is mutation testing.
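By mutation testing I mean deliberately introducing bugs ("mutants") into the production code and checking whether the existing tests fail; a test suite that kills no mutants isn't checking anything. A minimal hand-rolled sketch of the idea (real tools automate the mutation step; the helper names here are made up):

// the original function from above
int div(int i, int j) { return i / j; }

// one hand-written mutant: the operator is deliberately wrong
int divMutant(int i, int j) { return i + j; }

// the same checks, parameterised over the implementation under test
bool suitePasses(int function(int, int) impl)
{
    return impl(4, 2) == 2;
}

unittest
{
    assert(suitePasses(&div));        // the real code passes, as expected
    assert(!suitePasses(&divMutant)); // a good suite kills the mutant;
                                      // an assert-free suite would let it survive
}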



Atila
