Hi folks,

A while ago I started writing a unit testing library called Fact, a
cross between Haskell's QuickCheck and Ruby's RSpec. I've done a lot
of work with it, and some people might be interested in how I got on,
so this post charts what I found whilst playing around with Fact.

I had two basic ideas for Fact. The first was to allow tests to be
labelled with a string, rather than with a symbol:

(fact "One plus one equals two" []
  (= (+ 1 1) 2))

This seemed like a good idea, but it turned out to have some problems.
Plain-English descriptions tend to change more often than symbols, so
they aren't very good at uniquely identifying a test. To get around
this, I used gensym to create a unique symbol to use as an identifier,
but this broke namespace reloading: each time the file was loaded,
it generated completely new symbols.
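
As a rough sketch of the reloading problem (label->id is a made-up
name for illustration, not part of Fact), calling gensym twice for the
same description yields two different identifiers, so nothing ties the
reloaded test back to the old one:

```clojure
;; Hypothetical helper: derive a test identifier from a description.
;; gensym ignores the label's content and returns a fresh symbol each time.
(defn label->id [label]
  (gensym "fact-"))

;; Simulate loading the same file twice.
(def first-load  (label->id "One plus one equals two"))
(def second-load (label->id "One plus one equals two"))

;; The two identifiers differ even though the description is unchanged.
(println first-load second-load (= first-load second-load))
```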

So in retrospect, labelling tests with strings turned out to be a bit
more trouble than it was worth.

The second major idea for Fact was to make it simple to generate large
amounts of random test data, and to apply this data to a single
predicate or 'fact':

(fact "x + y > x if y > 0"
  [x (random-int)
   y (random-int)]
  (if (> y 0) (> (+ x y) x)))

The above syntax generates random values for x and y and checks the
predicate in the body against each set of values. This makes it pretty
simple to test predicates with large amounts of data.
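
The general idea can be sketched in plain Clojure. This is only an
illustration, not Fact's implementation: random-int here is a stand-in
with an assumed range, and holds-for-samples? is a hypothetical helper.

```clojure
;; Hypothetical stand-in for Fact's random-int: an integer in [-1000, 1000).
(defn random-int []
  (- (rand-int 2000) 1000))

;; Evaluate a two-argument predicate against n freshly generated pairs
;; and return true only if it returned true for every pair.
(defn holds-for-samples? [pred n]
  (every? true?
          (repeatedly n #(pred (random-int) (random-int)))))

;; Addition is commutative, so this property should hold for any sample.
(println (holds-for-samples? (fn [x y] (= (+ x y) (+ y x))) 100))
```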

This seemed like an interesting idea, so I used Fact to unit test my
web framework, to see how it would work in practice. It turned out
that it worked quite well for functions that manipulated simple data
structures, like numbers or strings, and less well for functions that
manipulated more complex arrangements of maps and vectors.

The problem with this is that functions that manipulate simple data
tend to be quite easy to test, anyway. A few carefully chosen points
of test data can vastly reduce the probability the function is
incorrect. And when your input data becomes more complex, and thus
more prone to error, creating a function to randomly generate this
complex data becomes difficult enough that I started to find myself
avoiding writing tests for these complex cases.
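
To give a feel for the extra work, here is a minimal sketch of a
generator for a nested map. The request shape and all the function
names are assumptions made up for this example, not the data my
framework actually uses:

```clojure
;; Random lowercase string of length 0..max-len.
(defn random-string [max-len]
  (apply str (repeatedly (rand-int (inc max-len))
                         #(char (+ (int \a) (rand-int 26))))))

;; Map of up to three random string keys to random string values.
(defn random-params []
  (into {} (repeatedly (rand-int 4)
                       (fn [] [(random-string 5) (random-string 8)]))))

;; Hypothetical request-like map built from the generators above.
(defn random-request []
  {:request-method (rand-nth [:get :post])
   :uri            (str "/" (random-string 10))
   :params         (random-params)})

(println (random-request))
```

Even this small case needs three generators, and it still says nothing
about which combinations of values are actually meaningful.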

So I decided to rewrite my unit tests using test-is. Symbols aren't as
descriptive as strings, but they can be reloaded easily, so I could
run a Nailgun server for seriously quick unit tests. Since I wasn't
generating all that random data any more, my tests ran about two
orders of magnitude faster - down from half a minute to under half a
second. I also found that my tests were easier to write, with only a
few exceptions.
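
For comparison, the first example might look something like this as a
symbol-named test. The sketch uses clojure.test, the namespace that
test-is was later folded into, and the test name is just one possible
choice:

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; The test is named by a symbol, so reloading the file redefines the
;; same var instead of creating a new identifier. The description can
;; still be attached as an optional message on the assertion.
(deftest one-plus-one-equals-two
  (is (= (+ 1 1) 2) "One plus one equals two"))

;; Run every test defined in the current namespace.
(run-tests)
```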

My conclusion is that automatically generated test data is probably
more trouble than it's worth in most cases. I'm going to keep Fact
around in case anyone wants to play with it, but I think I'll wind up
the experiment and mark it down as a lesson learned.

- James