On Sun, Apr 12, 2009 at 6:47 AM, Stephen Eley <sfe...@gmail.com> wrote:

> On Sat, Apr 11, 2009 at 2:02 PM, Ashley Moran
> <ashley.mo...@patchspace.co.uk> wrote:
> >
> > I was just idly thinking, could a code-coverage-based system be
> > combined with some sort of failure (fragility) history to balance the
> > time cost of heavy feature runs with the benefits of having something
> > run end-to-end?  We've had reverse-modification-time spec ordering for
> > ages, which is a useful start.
>
> I've had it in my head for a while now that someday (yes, that
> mythical 'someday') I want to write a better autotest.  Maybe this is
> heresy, but I am a huge fan of the _idea_ behind autotest and totally
> annoyed by its implementation.  It's extensible only in strange ways
> (hence wrappers like autospec), and its fundamental strategy is too
> static.  I once lost most of a day trying to fix merb_cucumber so the
> features would run when they should, and was ready to hurl cats when I
> realized autotest's idea of context chaining was to make you list them
> all in the classname in alphabetical order.  Look at the files in the
> Cucumber gem's 'lib/autotest' directory and you'll see what I mean.
>
> A proper design would let you plug in your own file-change discovery
> strategy, plug in multiple runners (RSpec, Cucumber, yadda yadda) with
> true modularity, specify lists of observers on directories or files,
> and allow different output views.  An _ideal_ design would also let
> you set priority rules like you're describing here, so you get instant
> feedback only on the stuff you're working with, and do end-to-end runs
> in the background.
>

A couple of years ago I was on a project that had fallen into the trap of
too many integration tests (exactly the horror scenario that J.B.
Rainsberger describes: http://www.jbrains.ca/permalink/239). The whole suite
had hundreds of slow Watir tests and took several hours to run. When there
was a failure, it was usually in just a couple of them.

We ended up improving this a lot with a homegrown distributed RSpec runner
(based on DRb) and by applying techniques from pairwise testing
(http://www.pairwise.org/).
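
To give a feel for the pairwise part, here is a rough sketch of the idea in
plain Ruby - made-up parameters and a naive greedy reducer, just to
illustrate what "cover every pair of values" buys you (this is not the tool
we actually used):

params = {
  "browser" => %w[firefox ie safari],
  "role"    => %w[admin member guest],
  "payment" => %w[visa mastercard paypal]
}

keys      = params.keys
all_cases = params.values.inject { |acc, vals| acc.product(vals).map { |c| c.flatten } }

# Every (parameter, value) pair across two different parameters that must
# show up in at least one selected case.
required_pairs = (0...keys.size).to_a.combination(2).flat_map do |i, j|
  params[keys[i]].product(params[keys[j]]).map { |v1, v2| [[i, v1], [j, v2]] }
end

def pairs_in(kase)
  (0...kase.size).to_a.combination(2).map { |i, j| [[i, kase[i]], [j, kase[j]]] }
end

uncovered = required_pairs.dup
suite     = []
until uncovered.empty?
  # Greedily pick the case that covers the most still-uncovered pairs.
  best = all_cases.max_by { |c| (pairs_in(c) & uncovered).size }
  suite << best
  uncovered -= pairs_in(best)
end

puts "#{all_cases.size} exhaustive cases reduced to #{suite.size} pairwise cases"
suite.each { |c| p Hash[keys.zip(c)] }

Greedy selection isn't optimal, but it gets the combinatorial explosion down
to a small suite while still exercising every pair of values.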

At the time I also explored a third technique that we never put into use:
heuristics.

If we could establish relationships between arbitrary files and failing
tests over time, then we would be able to calculate the probability that a
certain commit would break certain tests. We could then choose to run the
tests with a high probability of breakage and skip the others. A neural
network could potentially be used to implement this: the input neurons
would be the files in the codebase (on if a file changed, off if not) and
the output neurons would be the tests to run.
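
As a first cut this wouldn't even need a neural net - a plain frequency
table would get the idea across. Something like this rough sketch (every
class, method and file name here is invented):

class BreakagePredictor
  def initialize
    @changed = Hash.new(0)                              # file => times it changed
    @broke   = Hash.new { |h, k| h[k] = Hash.new(0) }   # file => { test => co-failures }
  end

  # Call this after every run with the files the commit touched and the
  # tests that failed.
  def record(changed_files, failed_tests)
    changed_files.each do |file|
      @changed[file] += 1
      failed_tests.each { |test| @broke[file][test] += 1 }
    end
  end

  # Estimated probability that a test fails when a given file changes.
  def probability(file, test)
    return 0.0 if @changed[file].zero?
    @broke[file][test].to_f / @changed[file]
  end

  # Tests worth running for a commit that touches changed_files.
  def tests_to_run(changed_files, threshold = 0.2)
    candidates = changed_files.map { |f| @broke[f].keys }.flatten.uniq
    candidates.select do |test|
      changed_files.any? { |f| probability(f, test) >= threshold }
    end
  end
end

predictor = BreakagePredictor.new
predictor.record(%w[app/models/order.rb],
                 %w[spec/models/order_spec.rb features/checkout.feature])
predictor.record(%w[app/models/order.rb lib/tax.rb],
                 %w[spec/models/order_spec.rb])
p predictor.tests_to_run(%w[app/models/order.rb])
# => ["spec/models/order_spec.rb", "features/checkout.feature"]

The neural net would essentially be a smarter replacement for that frequency
table, able to pick up less obvious file/test relationships.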

So if someone develops a better AutoTest with a plugin architecture - one
that doesn't have to run as a long-lived process - then I'd be very
interested in writing the neural network part, possibly backed by FANN
(http://leenissen.dk/fann/).
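
And to make the plugin seams concrete, the kind of contract I'm imagining is
roughly this small (all of these class names are invented, and the watcher
and runner objects are just placeholders):

# The file-change watcher, the test-selection strategy and the runners are
# all injected rather than hard-wired.
class ContinuousRunner
  def initialize(watcher, predictor, runners)
    @watcher   = watcher     # e.g. polling, inotify, a git hook...
    @predictor = predictor   # e.g. the BreakagePredictor above
    @runners   = runners     # e.g. an RSpec runner and a Cucumber runner
  end

  def run
    @watcher.on_change do |changed_files|
      tests   = @predictor.tests_to_run(changed_files)
      results = @runners.map { |runner| runner.run(tests) }.flatten
      failed  = results.reject { |r| r.passed? }.map { |r| r.name }
      @predictor.record(changed_files, failed)  # feed the history back in
    end
  end
end

The point is just that change discovery, the what-to-run decision and the
actual running are separate, swappable objects.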

It's so crazy it has to be tried!

Aslak


>
> Right now this is just a pipe dream, but I don't think it would be
> _hard._  It's just finding the time to do it vs. actual public-facing
> applications that's the challenge.  If anybody wants to have a
> conversation about this, maybe get some collaboration going, feel free
> to drop me a line.
>
>
> > On a more ranty note - I have very little time for these "XXX
> > BDD/development technique is always bad, don't do it" articles.  (But
> > hey, maybe I was guilty of this myself and have forgotten since...)
>
> "Declaring absolutes is always bad, don't do it?"  >8->
>
> Oh -- one other thought I had from reflecting upon your e-mail.  This
> is totally unrelated to the above, but since we're being Big Thinkers
> I might as well write it down before I forget.  You mentioned
> fragility/failure history, in relation to coverage, and I started
> thinking...  I wonder if everyone's going about test coverage from the
> wrong direction, simply trying to _anticipate_ failure?  What if we
> extended something like exception_notifier or Hoptoad as well, and
> brought real exceptions from the application's staging and production
> environments into our test tools?  We know from the stack traces where
> failures occur, so it'd be pretty straightforward to write an
> RCov-like utility that nagged you: "You dingbat, your specs totally
> fail to cover line 119 of hamburgers.rb.  It threw a MeatNotFound
> exception last Tuesday.  Gonna test for that?  Ever?"
>
> What do you think?  Decent idea?  Or does something like this already
> exist and I don't know about it?
>
>
> --
> Have Fun,
>   Steve Eley (sfe...@gmail.com)
>   ESCAPE POD - The Science Fiction Podcast Magazine
>   http://www.escapepod.org
_______________________________________________
rspec-users mailing list
rspec-users@rubyforge.org
http://rubyforge.org/mailman/listinfo/rspec-users
