TLDR: I'd like to propose adding a dependency on a modern unit testing 
framework to make it easier to write unit tests within GCC. Before I spend much 
more time on it, what sort of buy-in should I get? Are there any people in 
particular I should work more closely with as I make this change?
 
Terminology: Within GCC, there are two types of tests in place: unit tests and 
regression tests. The unit tests have been written with a home-grown selftest 
framework and run as part of the build process. Any unit test failure results 
in no compiler being produced. The regression tests, on the other hand, 
run after build, and use the separate DejaGnu framework. In this email, I am 
only concerning myself with the unit tests, and throughout the remainder of the 
email, any mention of tests refers to these.
 
While working on GCC, I wanted to add new unit tests for my feature as I went, 
but I noticed that there is a good deal of friction involved. Right now, adding 
a new unit test requires writing the test function, then modifying a second 
place in the code to call it, and repeating that up the call chain until 
reaching either the selftest.c file or the target hook. There is also no way 
to do test setup/teardown automatically. Everything is manual.
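
To make the friction concrete, here is a minimal standalone sketch of that 
manual chain. All names below are hypothetical stand-ins chosen for 
illustration, not GCC's actual functions; the point is only that a test which 
is written but never added to the chain is silently skipped.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Records which tests actually ran (illustration only).
std::vector<std::string> ran;

// Step 1: write the test functions.
void test_set_range ()   { ran.push_back ("test_set_range"); }
void test_clear_range () { ran.push_back ("test_clear_range"); }

// Step 2: a per-file entry point has to list every test by hand.
// test_clear_range was written but never added here, so it never runs.
void bitmap_tests ()
{
  test_set_range ();
}

// Step 3: a top-level runner has to list every per-file entry point.
void run_all_tests ()
{
  bitmap_tests ();
}

} // namespace sketch
```

Calling sketch::run_all_tests () here records only test_set_range, and nothing 
reports that test_clear_range was left out.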
 
I'd like to propose adding a dependency on a modern open-source unit testing 
framework as an enhancement to the current self test system. I have used Catch2 
(https://github.com/catchorg/Catch2, Boost Software License 1.0) with great 
success in the past. I experimented with adding it to GCC and converting a 
handful of tests to use Catch2. Although I only converted a small number of 
tests, I didn't see any performance impact during selftest. As a bonus, while 
doing so, I found that one test I had written previously wasn't being run at 
all, because I had forgotten to call it manually.
 
Some nice things that Catch2 provides are better error reporting (see below for 
a comparison), ease of adding new tests (just include the header and write a 
TEST_CASE(), as opposed to the manual plumbing required right now), extension 
points for adding custom comparisons (I could see this being very useful to 
expand on the current rtl test macros), and the ability to run a subset of the 
tests without recompiling. It is also easy to integrate Catch2 with the 
existing self-test framework.
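
For readers who haven't used a framework like this, the "just write a test" 
convenience usually comes from static self-registration. The following is a 
rough standalone sketch of that general technique, not Catch2's actual 
implementation; the SKETCH_TEST macro, the registry, and the substring filter 
are all names I made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

struct test_entry
{
  std::string name;
  std::function<void ()> body;
};

// A function-local static avoids static-initialization-order problems.
inline std::vector<test_entry> &registry ()
{
  static std::vector<test_entry> tests;
  return tests;
}

// Constructing one of these at namespace scope registers a test.
struct registrar
{
  registrar (std::string name, std::function<void ()> body)
  {
    registry ().push_back ({std::move (name), std::move (body)});
  }
};

// Runs every registered test whose name contains FILTER and returns the
// names that ran.  An empty FILTER matches every test.
inline std::vector<std::string> run_matching (const std::string &filter)
{
  std::vector<std::string> ran;
  for (const test_entry &t : registry ())
    if (t.name.find (filter) != std::string::npos)
      {
        t.body ();
        ran.push_back (t.name);
      }
  return ran;
}

} // namespace sketch

// Hypothetical macro: declares the test function and registers it in one step.
#define SKETCH_TEST(fn)                                 \
  static void fn ();                                    \
  static sketch::registrar fn##_registrar (#fn, fn);    \
  static void fn ()

SKETCH_TEST (test_set_range)   { assert (1 + 1 == 2); }
SKETCH_TEST (test_clear_range) { assert (2 - 2 == 0); }
```

With this shape, adding a test is a single SKETCH_TEST block with no second 
place to update, and sketch::run_matching ("clear") picks a subset at run time 
without recompiling.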
 
If this path seems useful to others, I'm happy to pursue it further. The work 
items I see are:
 
1. Convert more tests to verify the claim that build performance is not degraded
2. Update the docs to list Catch2 as the new recommended way to write unit tests
3. If all of the target self-tests are converted, then we can remove the target 
test hook. The same applies to the lang test hook.
 
One thing that would make Catch2 an even stronger case would be enabling 
exceptions for the check builds. Then, running the unit tests 
could report multiple failures at the same time instead of just aborting at the 
first one. That said, even without enabling exceptions, Catch2 is on par with 
the current selftest in terms of terminating at the first failure.
 
Another option is to use a test framework that doesn't use exceptions, such as 
Google Test (https://github.com/google/googletest, BSD 3-Clause "New" or 
"Revised" License). I personally think Catch2 is more flexible, or I would lead 
with Google Test. For example, in Catch2, shared setup is done in place with 
the tests themselves, with each subtest written as a nested SECTION, whereas in 
GTest, you have to write a test class that derives from ::testing::Test and 
overrides SetUp(). In addition, the sections in Catch2 can be nested further, 
allowing several related tests to build on each other.
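
To illustrate the SECTION behaviour without pulling in the framework, here is a 
small standalone model of the idea: the test body is re-run from the top once 
per section, so the setup code at the top is fresh each time. This is only my 
simplified sketch (flat sections, no nesting), not how Catch2 is implemented, 
and section_tracker, run_test_case and vector_test are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Tracks which section to enter on the current pass over the test body.
struct section_tracker
{
  int target = 0; // index of the section to enter on this pass
  int seen = 0;   // number of sections encountered so far on this pass

  // True only for the one section selected for this pass.
  bool enter () { return seen++ == target; }
};

// Re-runs BODY from the top once per section (or once if it has none)
// and returns the number of passes made.
template <typename Body>
int run_test_case (Body body)
{
  section_tracker tracker;
  int passes = 0;
  do
    {
      tracker.seen = 0;
      body (tracker);
      ++passes;
      ++tracker.target;
    }
  while (tracker.target < tracker.seen);
  return passes;
}

// Example test body: the vector is the shared setup, rebuilt on every pass.
inline void vector_test (section_tracker &s, std::vector<std::string> &log)
{
  std::vector<int> v = {1, 2, 3};

  if (s.enter ()) // first "section"
    {
      v.push_back (4);
      log.push_back ("push:" + std::to_string (v.size ()));
    }
  if (s.enter ()) // second "section": sees a fresh v, not the pushed one
    {
      v.clear ();
      log.push_back ("clear:" + std::to_string (v.size ()));
    }
}

} // namespace sketch
```

The second section starts from the original three-element vector rather than 
the four-element one left by the first, which is what a fixture class with 
SetUp() gives you in GTest, but here the setup sits next to the tests.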
 
Here is some sample output for the case where all the tests are passing:
===============================================================================
All tests passed (25 assertions in 5 test cases)
 
And here is the output when a test fails:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
is a Catch v2.9.2 host application.
Run with -? for options
 
-------------------------------------------------------------------------------
test_set_range
-------------------------------------------------------------------------------
../../gcc/bitmap.c:2661
...............................................................................
../../gcc/bitmap.c:2668: FAILED:
  REQUIRE( 6 == bitmap_count_bits (b) )
with expansion:
  6 == 5
 
Catch will terminate because it needed to throw an exception.
The message was: Test failure requires aborting test!
terminate called without an active exception
../../gcc/bitmap.c:2668: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal
===============================================================================
test cases: 2 | 1 passed | 1 failed
assertions: 5 | 3 passed | 2 failed
cc1: internal compiler error: Aborted
<long callstack>
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
 
(Note that at the moment it doesn't know the name of our application or it 
would have prefixed "is a Catch..." with our app name).
 
Compare that to the output of the current test framework:
../../gcc/bitmap.c:2669: test_set_range: FAIL: ASSERT_EQ ((6), (bitmap_count_bits (b)))
cc1: internal compiler error: in fail, at selftest.c:47
/bin/bash ../../gcc/../move-if-change tmp-macro_list macro_list
echo timestamp > s-macro_list
<long callstack>
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
 
Thanks,
 
Andrew
