>> We have a cron job that runs overnight to clean up anything that was
>> missed in Jenkins runs.
> 
> No offense, but that scares me. If this strategy was so successful,
> why do you even need to clean anything up? You can accumulate cruft
> forever, right?

Ha. Like any database, smaller ones perform better.
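
To give a flavor of it (this is a sketch, not our actual job, and the
table and column names are made up), the nightly sweep is little more
than:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use DBI;

    # Hypothetical DSN and schema; the real job knows our real tables.
    my $dbh = DBI->connect(
        'dbi:Pg:dbname=app_test', 'tester', '',
        { RaiseError => 1, AutoCommit => 1 },
    );

    # Drop test fixtures more than a day old that the Jenkins
    # runs didn't clean up after themselves.
    $dbh->do(q{
        DELETE FROM users
        WHERE  is_test_fixture
        AND    created_at < now() - interval '1 day'
    });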

>> We expect our tests to generally work in the face of a "dirty"
>> database.  If they don't, that's considered a flaw in the test.
> 
> Which implies that you might be unknowingly relying on something a
> previous test did, a problem I've repeatedly encountered in poorly
> designed test suites.

I just ran across a ticket now where our design was helpful, so I
thought I would share it. The story goes like this:

I was tracking down a test that sometimes failed. I found that the test
was expecting a "no results" state on the page, but sometimes other test
data created a "some results" state. The test failed at that point
because there was actually a bug in the code. This bug was not found by
the site's users, our client, our development team, or directly by the
automated test itself, but only because the environment the test ran in
had some randomness in it.

Often I do intentionally create isolation among my tests, and sometimes
we have cases like this one, where the lack of it adds value for us.
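
When we do want the isolation, it's cheap to get. Here's roughly what
it looks like; this is a sketch rather than our real code, and
create_user/search_users are hypothetical helpers:

    use strict;
    use warnings;
    use Test::More;
    use Time::HiRes ();

    # A marker unique to this run; rows created by other tests
    # (or left over from earlier runs) can never match it.
    my $marker = sprintf 'test-%d-%d', $$, Time::HiRes::time() * 1000;

    my $user  = create_user( name => "$marker-alice" );    # hypothetical fixture helper
    my @found = search_users( name_like => "$marker-%" );  # hypothetical query helper

    is( scalar @found, 1, 'we see only our own row, dirty database or not' );

    done_testing;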

> Your tests run against a different test database per pid.
> 
> Or you run them against multiple remote databases with
> TAP::Harness::Remote or TAP::Harness::Remote::EC2.
> 
> Or you run them single-threaded in a single process instead of
> multiple processes.
> 
> Or maybe profiling exposes issues that weren't previously apparent.
> 
> Or you fall back on a truncating strategy instead of rebuilding
> (http://www.slideshare.net/Ovid/turbo-charged-test-suites-presentation).
> That's often a lot faster.
> 
> There are so many ways of attacking this problem which don't involve
> trying to debug an unknown, non-deterministic state.

And if I'm running 4 test files in parallel, would you expect me to be
setting up and tearing down a database with 50 tables and a significant
amount of data for each and every test file, just so they don't
interfere with each other?

That seems rather wasteful when we already have easy-to-implement
solutions that allow multiple tests to share the same database while
still achieving isolation between them when needed.
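
For example, something as simple as keying test data on the pid lets
parallel files share one schema safely. Again a sketch with made-up
table names, not our actual harness:

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect( 'dbi:Pg:dbname=app_test', 'tester', '',
        { RaiseError => 1, AutoCommit => 1 } );

    # Each test file is its own process, so the pid gives every
    # file a free namespace inside the one shared database.
    my $ns = "t$$";

    $dbh->do( 'INSERT INTO orders (label) VALUES (?)', undef, "$ns-pending" );

    my ($count) = $dbh->selectrow_array(
        'SELECT count(*) FROM orders WHERE label LIKE ?',
        undef, "$ns-%",
    );

    END {
        # Delete only our own rows; anything we miss, the
        # overnight cron job sweeps up.
        $dbh->do( 'DELETE FROM orders WHERE label LIKE ?', undef, "$ns-%" )
            if $dbh;
    }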

> I'll be honest, I've been doing testing for a long, long time and this
> is the first time that I can recall anyone arguing for an approach
> like this. I'm not saying you're wrong, but you'll have to do a lot of
> work to convince people that starting out with an effectively random
> environment is a good way to test code.

I've been doing Perl testing for about 15 years myself. Many of our
tests have isolation designed in, because there's value in that.

There's also value in running some tests against ever-changing
datasets, which are much more like the data that actually shows up in
production.

    Mark
