On 6/16/07, Chris Dolan <[EMAIL PROTECTED]> wrote:
> Josh, can you explain to us in a little more depth what this means?
> Are you showing that certain input values follow the same path
> through the code?

Yes.

> It looks like the full path through the code is
> the key to your hash of runs. If I've understood that part
> correctly, then I'm still having trouble understanding where you go
> from there. How is that a measure of code coverage? Are you
> planning to then compare those paths to the full tree of opcodes?

Ah, sorry. I have a problem at work where there's a pile of nasty code
which processes a really gigantic amount of input per day. Much of the
data is similar and, I assume, triggers identical paths through the
code. If I wanted to get "full" coverage I could just feed it a
suitably gigantic pile of input, because that would probably be
enough to cover all or most of the potential code paths.
It is impractical to write tests using those monstrously large piles
of input. I want to filter out input which triggers redundant code
paths and retain only those data that cause a unique behaviour.
My example was an evenness detector. Suppose I don't understand the
function, so I figure "several thousand" numbers ought to be enough to
cover all the possible code paths. Maybe I really have hundreds of
millions of numbers being tested, and what I want to know is whether
trying 4 gets me any additional behaviour once I've already
tried 2.
What I learn from using this is that inputs 2 and 4 give equivalent
coverage. I can now opt to write my tests using just
the number 2 because I don't learn anything new by using any
additional numbers like 4, 6, or 8.
So I can test for which inputs will add to my code coverage and which will not.
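
To make the idea concrete, here is a minimal sketch in Python rather than the code from my earlier mail. It records paths at line granularity using sys.settrace, which is coarser than the opcode-level paths discussed above. The names trace_path, unique_inputs and is_even are made up for the illustration.

```python
import sys


def trace_path(func, *args):
    """Call func(*args); return (result, path).

    path is a tuple of line offsets (relative to the 'def' line) executed
    inside func itself. Calls made by func are not traced.
    """
    path = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # skip frames belonging to other functions
        if event == "line":
            path.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever was installed before
    return result, tuple(path)


def unique_inputs(func, inputs):
    """Keep the first input seen for each distinct path through func."""
    first_seen = {}  # path -> first input that produced it
    for value in inputs:
        _, path = trace_path(func, value)
        first_seen.setdefault(path, value)
    return list(first_seen.values())


def is_even(n):
    # Hypothetical stand-in for the "nasty code" under test.
    if n % 2 == 0:
        return True
    return False


if __name__ == "__main__":
    print(unique_inputs(is_even, [2, 4, 6, 8, 3, 5]))  # prints [2, 3]
```

The recorded path is the hash key, so 4, 6 and 8 collapse onto 2, and 5 collapses onto 3. One caveat with using the exact path as the key: a loop that runs a different number of times counts as a different path, so real code may want a coarser key, such as the set of lines visited.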
Josh