While the basic assert-based tests are good enough for me, I do wish the 
test framework were more flexible. Some of this is historical: we started 
out not wanting separate unit and comprehensive test suites. The goal with 
the unit tests was to have something that could be run rapidly during 
development and catch regressions in the basic system. It has since grown 
into something more than it was intended to be; we even added some very 
basic perf tests to this framework.

I find myself wanting a few more things from it as I have worked on the ARM 
port on and off. Some thoughts follow.

I'd love to be able to run the entire test suite and know how many tests 
there are in all, how many pass, and how many fail. Over time, it is nice 
to see the total number of tests grow along with the code in Base. 
Currently, on ARM, tons of stuff fails; I run all the tests by looping over 
the test files, and each file gives up after its first failure.
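
Roughly what I have in mind is a harness that records each result and keeps 
going instead of aborting; a minimal sketch (the macro name and counters are 
made up for illustration):

    # A stand-in for @test that records the result instead of aborting
    # the file on the first failure.
    const results = Dict{Symbol,Int}(:pass => 0, :fail => 0)

    macro checktest(ex)
        quote
            ok = try $(esc(ex)) catch; false end
            if ok
                results[:pass] += 1
            else
                results[:fail] += 1
                println("FAILED: ", $(string(ex)))
            end
        end
    end

    @checktest 1 + 1 == 2
    @checktest sqrt(-1.0) > 0   # throws DomainError; counted as a failure
    println(results[:pass], " passed, ", results[:fail], " failed")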

If I had, say, the serial numbers of the failing cases, I could repeatedly 
re-run just those while trying to fix a particular issue. Currently, the 
level of granularity is a whole test file.
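
With numbered cases, re-running a subset while debugging could be as simple 
as this sketch (the case list is hypothetical):

    # Give each case a serial number and allow re-running only the ones
    # listed in `only` (empty means run everything).
    function run_cases(cases; only = Int[])
        for (i, case) in enumerate(cases)
            isempty(only) || i in only || continue
            ok = try case() catch; false end
            println(ok ? "pass" : "FAIL", "  #$i")
        end
    end

    cases = [() -> 1 + 1 == 2,
             () -> sin(0.0) == 0.0,
             () -> div(1, 0) == 0]   # throws; reported as a FAIL

    run_cases(cases)                 # full run
    run_cases(cases, only = [3])     # retry just case #3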

Documentation of the test framework has been on my mind for a while. We 
have it in the standard library documentation, but not in the manual.

Code coverage is essential - but that has already been discussed in detail 
in this thread, and some good work has already started.

Beyond basic correctness testing, numerical code also needs tests for 
ill-conditioned inputs. For the most part, we depend on our libraries 
(LAPACK, FFTW, etc.) to be well-tested, but increasingly we are writing our 
own, and package authors are certainly pushing boundaries here.
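
For example, a solver test can check the residual on a classically hard 
input instead of demanding exact answers; a rough sketch (the tolerance is 
arbitrary):

    using LinearAlgebra, Test

    # Hilbert matrices are notoriously ill-conditioned, so check the
    # residual (small for a backward-stable solver) rather than the
    # forward error in x itself.
    hilbert(n) = [1 / (i + j - 1) for i in 1:n, j in 1:n]

    A = hilbert(12)        # cond(A) is enormous
    b = A * ones(12)
    x = A \ b

    @test norm(A * x - b) / norm(b) < 1e-10
    # norm(x - ones(12)) can be O(1) here, so x == ones(12) would be
    # the wrong thing to test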

A better perf test framework would also be great to have. Ideally, perf 
tests would cover everything and be able to compare against past 
performance. Elliot's Codespeed was meant to do this, but somehow it hasn't 
worked out yet. I am quite hopeful that we will figure it out.
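
In the meantime, even something crude would help; a sketch, with a made-up 
baseline file format (one "name time" pair per line) and an arbitrary 20% 
tolerance:

    # Load "name time" pairs from a baseline file, if one exists.
    function load_baselines(file)
        isfile(file) || return Dict{String,Float64}()
        Dict(String(split(l)[1]) => parse(Float64, split(l)[2])
             for l in eachline(file))
    end

    # Time a kernel (best of five runs) and flag regressions against
    # the stored baseline.
    function perf_check(name, f; tol = 1.2, file = "perf_baselines.txt")
        f()                                      # warm up / compile
        t = minimum(@elapsed(f()) for _ in 1:5)
        old = get(load_baselines(file), name, nothing)
        if old !== nothing && t > tol * old
            println("PERF REGRESSION: $name took $(t)s vs baseline $(old)s")
        end
        return t
    end

    perf_check("sort_1e6", () -> sort(rand(10^6)))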

Tools like QuickCheck that generate random test cases are useful, but I am 
not convinced they should be in Base.
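
The basic machinery is small enough to live in a package; a toy sketch just 
to illustrate (the property and generator are arbitrary):

    # A tiny QuickCheck-flavored runner: generate random inputs and
    # check that a property holds for all of them.
    function quickcheck(prop, gen; ntrials = 100)
        for _ in 1:ntrials
            x = gen()
            prop(x) || return println("property failed for: ", repr(x))
        end
        println("OK, passed $ntrials trials")
    end

    # Property: sorting is idempotent.
    quickcheck(v -> sort(sort(v)) == sort(v), () -> rand(Int, rand(0:20)))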

-viral

On Tuesday, December 30, 2014 3:35:27 AM UTC+5:30, Jameson wrote:
>
> I imagine there are advantages to frameworks in that you can have 
> expected failures and continue through the test suite after one fails, to 
> give a better % success/failure metric than Julia's simplistic go/no-go 
> approach.
>
> I used JUnit many years ago for a high school class, and found that, 
> relative to `@assert` statements, it had more options for asserting various 
> approximate and conditional statements that would otherwise have been very 
> verbose to write in Java. Browsing back through its website now 
> (http://junit.org/ under Usage and Idioms), it apparently now has some 
> more features for testing, such as rules, theories, timeouts, and 
> concurrency. Those features would likely help improve testing coverage by 
> making tests easier to describe.
>
> On Mon Dec 29 2014 at 4:45:53 PM Steven G. Johnson <stevenj....@gmail.com> 
> wrote:
>
>> On Monday, December 29, 2014 4:12:36 PM UTC-5, Stefan Karpinski wrote: 
>>
>>> I didn't read through the broken builds post in detail – thanks for the 
>>> clarification. Julia basically uses master as a branch for merging and 
>>> simmering experimental work. It seems like many (most?) projects don't do 
>>> this, and instead use master for stable work.
>>>
>>
>> Yeah, a lot of projects use the Gitflow model, in which a develop branch 
>> is used for experimental work and master is used for (nearly) release 
>> candidates. 
>>
>> I can understand where Dan is coming from in terms of finding issues 
>> continually when using Julia, but in my case it's more commonly "this 
>> behavior is annoying / could be improved" than "this behavior is wrong".  
>> It's rare for me to code for a few hours in Julia without filing issues in 
>> the former category, but out of the 300 issues I've filed since 2012, it 
>> looks like fewer than two dozen are in the latter "definite bug" category.
>>
>> I don't understand his perspective on "modern test frameworks" in which 
>> FactCheck is light-years better than a big file full of asserts.  Maybe my 
>> age is showing, but from my perspective FactCheck (and its Midje 
>> antecedent) just gives you a slightly more verbose assert syntax and a way 
>> of grouping asserts into blocks (which doesn't seem much better than just 
>> adding a comment at the top of a group of asserts).   Tastes vary, of 
>> course, but Dan seems to be referring to some dramatic advantage that isn't 
>> a matter of mere spelling.  What am I missing?
>>