(sorry, sent an empty reply by accident)

Unit testing is one of the easiest ways to isolate problems in an internal class, the things you can get wrong. But: time spent writing unit tests is time *not* spent writing integration tests. Which biases me towards the integration tests.
What I do find is good is writing unit tests to debug things: if something is playing up and you can write a unit test to replicate it, then not only can you isolate the problem, you can verify it is fixed and stays fixed. And as unit tests are fast & often runnable in parallel, that's easy to do repeatedly. (Tiny example of what I mean in the PS below.)

But: tests have a maintenance cost, especially if the tests go into the internals, which makes them very brittle to change. Mocking is the real trouble spot here. It's good to be able to simulate failures, but given the choice between "integration test against real code" and "something using mocks which produces 'impossible' stack traces and, after a code rework, fails so badly you can't tell if it's a regression or just the tests being obsolete", I'd go for production, even if it runs up some bills.

I really liked Lars' slides; they gave me some ideas. One thing I've been exploring is using system metrics in testing, adding more metrics to help note what is happening: https://steveloughran.blogspot.co.uk/2016/04/distributed-testing-making-use-of.html

Strengths: it encourages me to write metrics, and they can be used in in-VM tests and collected from a distributed SUT in integration tests, both for asserts and for logging. Weaknesses: (1) it exposes internal state, which, again, can be brittle; (2) in integration tests the results can vary a lot, so you can't really make assertions on them; better there to collect things and use them in test reports. (Sketch of the in-VM case in the PS.)

Which brings me to a real issue with integration tests, which isn't a fault of the apps or the tests, but of today's test runners: log capture and reporting date from the era when we were running unit tests, so they only think about the reporting problems there: standard out and error for a single process; no standard log format, so naive stream capture rather than structured log entries; test runners which don't report much on a failure beyond the stack trace, or, with scalatest, half the stack trace (*), missing out on those of the remote systems. Systems which, if you are playing with cloud infra, may not be there when you get to analyse the test results. You are left trying to compare 9 logs across 3 destroyed VMs to work out why the test runner threw an assertion failure.

This is tractable, and indeed the Kafka people have been advocating "use Kafka as the collector of test results" to address it: take the logs, metrics, events raised by the SUT, etc., and then somehow correlate them into test reports, or at least provide the ordering of events and state across parts of the system so that you can work back from a test failure. (Rough sketch of the publishing side in the PS.)

Yes, that means moving way beyond the usual ant-JUnit XML report everything creates, but like I said: that was written for a different era. It's time to move on, generating the XML report as one of the outputs if you want, but not the one you use for diagnosing why a test fails. I'd love to see what people have been up to in that area. If anyone has insights there, it'd be a topic for a hangout.

-Steve

(*) Scalatest opinions: https://steveloughran.blogspot.co.uk/2016/09/scalatest-thoughts-and-ideas.html
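PS: the sketches below are purely illustrative, not from Lars' slides or anyone's real code; all class, metric and topic names are made up, and I'm assuming ScalaTest 3.x, Dropwizard metrics and the plain kafka-clients producer. First, the "turn the bug report into a unit test" idea:

import org.scalatest.funsuite.AnyFunSuite

// Hypothetical code under test: the empty-path and trailing-slash cases
// are the ones that played up in the field.
object PathUtils {
  def normalize(path: String): String =
    if (path.isEmpty || path == "/") "/" else path.stripSuffix("/")
}

// The bug report becomes a test: replicate the failure, verify the fix,
// and keep it fixed in every future run.
class PathUtilsRegressionSuite extends AnyFunSuite {

  test("empty path normalizes to root instead of throwing") {
    assert(PathUtils.normalize("") == "/")
  }

  test("trailing slash is stripped") {
    assert(PathUtils.normalize("/tmp/data/") == "/tmp/data")
  }
}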
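Second, the metrics idea in its deterministic, in-VM form: the component takes a metric registry from the caller, so a test can read the counters back. Dropwizard's MetricRegistry here is just one example; use whatever metrics library your codebase already has.

import com.codahale.metrics.MetricRegistry
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical component: counts the retries it performs against some
// (possibly remote) store, via a caller-supplied registry.
class RetryingClient(metrics: MetricRegistry) {
  private val retries = metrics.counter("client.retries")

  def readWithRetry(attempts: Int): Unit = {
    // real code would do IO and retry on failure; here we only record
    // the retries, so the assertion below is deterministic
    (1 until attempts).foreach(_ => retries.inc())
  }
}

class RetryMetricsSuite extends AnyFunSuite {

  test("in-VM: the retry counter is wired up and visible to assertions") {
    val registry = new MetricRegistry
    new RetryingClient(registry).readWithRetry(attempts = 3)
    assert(registry.counter("client.retries").getCount == 2)
  }
}

In the distributed integration-test version I wouldn't assert at all, just snapshot registry.getCounters() at the end of the run and attach it to the test report.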
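And finally, a very rough sketch of the publishing half of "Kafka as the collector of test results": every process in the SUT, plus the test runner itself, pushes timestamped events to one shared topic, so the ordering survives even if the VMs don't. Broker address, topic name and event format are all invented; the hard part, correlating that stream back into a report, is exactly the bit I'd like to hear about.

import java.util.Properties

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Hypothetical "test event" publisher for one process in the SUT.
object TestEventPublisher {

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092") // assumption: a reachable broker
    props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // key = host/process id; the value would really be JSON/Avro with a
      // timestamp, host, test name and event type, not a bare string
      val event = s"${System.currentTimeMillis()} host-1 suite=MyIntegrationSuite event=assert-failed"
      producer.send(new ProducerRecord("test-events", "host-1", event)).get()
    } finally {
      producer.close()
    }
  }
}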