This patch kit introduces an RTL frontend, for the purpose of unit
testing: primarily for unit testing of RTL passes, and possibly for unit
testing of .md files.
It's very much a work-in-progress; I'm posting it now to get feedback.

I've successfully bootstrapped and regrtested patches 1-3 of the kit on
x86_64-pc-linux-gnu, but patch 4 (which is the heart of the
implementation) doesn't survive bootstrap yet (dependency issues in
the Makefile).

The rest of this post is from gcc/rtl/notes.rst from patch 4; I'm
adding a duplicate copy up-front here to make it easier to get an
overview.

RTL frontend
============

Purpose
*******

Historically GCC testing has been done by providing source files
to be built with various command-line options (via DejaGnu
directives), dumping state at pertinent places, and verifying
properties of the state via these dumps.  A strength of this approach
is that we have excellent integration testing, as every test case
exercises the toolchain as a whole, but it has the drawback that when
testing a specific pass, we have little control of the input to that
specific pass.  We provide input, and the various passes transform
the state of the internal representation::

  INPUT -> PASS-1 -> STATE-1 -> PASS-2 -> STATE-2 -> ... -> etc ->
  -> ... -> PASS-n-1 -> STATE-n-1 -> PASS-n -> STATE-n
                          ^            ^         ^
                          |            |         Output from the pass
                          |            The pass we care about
                          The actual input to the pass

so the intervening passes before "PASS-n" could make changes to the
IR that affect the input seen by our pass ("STATE-n-1" above).  This
can break our test cases, sometimes in a form that's visible,
sometimes invisibly (e.g. where a test case silently stops providing
coverage).

The aim of the RTL frontend is to provide a convenient way to test
individual passes in the backend, by loading dumps of specific RTL
state (possibly edited by hand), and then running just one specific
pass on them, so that we effectively have this::

  INPUT -> PASS-n -> OUTPUT

thus fixing the problem above.

My hope is that this makes it easy to write more fine-grained and
robust test coverage for the RTL phase of GCC.
However I see this as *complementary* to the existing "integrated
testing" approach: patches should include both RTL frontend tests *and*
integrated tests, to avoid regressing the great integration testing we
currently have.

The idea is to use the existing dump format as an input format, since
presumably existing GCC developers are very familiar with the dump
format.

One other potential benefit of this approach is to allow unit-testing
of machine descriptions - we could provide specific RTL fragments, and
have the rtl.dg testsuite directly verify that we recognize all
instructions and addressing modes that a given target ought to support.

Structure
*********

The RTL frontend is similar to a regular frontend: a gcc/rtl
subdirectory within the source tree contains frontend-specific hooks.
These provide a new "rtl" frontend, which can be optionally enabled at
configuration time within --enable-languages.

If enabled, it builds an rtl1 binary, which is invoked by the gcc
driver on files with a .rtl extension.

The testsuite is below gcc/testsuite/rtl.dg.  There's also a
"roundtrip" subdirectory below this, in which every .rtl file is loaded
and then dumped; roundtrip.exp verifies that the dump is identical to
the original file, thus ensuring that the RTL loaders faithfully
rebuild the input dump.

Limitations
***********

* It's a work-in-progress.  There will be bugs.

* The existing RTL code is structured around a single function being
  optimized, so, as a simplification, the RTL frontend can only handle
  one function per input file.  Also, the dump format currently uses
  comments to separate functions::

    ;; Function test_1 (test_1, funcdef_no=0, decl_uid=1758, cgraph_uid=0, symbol_order=0)

    ... various pass-specific things, sometimes expressed as comments,
    sometimes not

    ;;
    ;; Full RTL generated for this function:
    ;;
    (note 1 0 6 NOTE_INSN_DELETED)
    ;; etc, insns for function "test_1" go here
    (insn 27 26 0 6 (use (reg/i:SI 0 ax)) ../../src/gcc/testsuite/rtl.dg/test.c:7 -1
         (nil))

    ;; Function test_2 (test_2, funcdef_no=1, decl_uid=1765, cgraph_uid=1, symbol_order=1)

    ... various pass-specific things, sometimes expressed as comments,
    sometimes not

    ;;
    ;; Full RTL generated for this function:
    ;;
    (note 1 0 5 NOTE_INSN_DELETED)
    ;; etc, insns for function "test_2" go here
    (insn 59 58 0 8 (use (reg/i:SF 21 xmm0)) ../../src/gcc/testsuite/rtl.dg/test.c:31 -1
         (nil))

  so that there's no clear separation of the instructions between the
  two functions (and no metadata e.g. function names).

  This could be fixed by adding a new clause to the dump e.g.::

    (function "test_1" [
      (note 1 0 6 NOTE_INSN_DELETED)
      ;; etc, insns for function "test_1" go here
      (insn 27 26 0 6 (use (reg/i:SI 0 ax)) ../../src/gcc/testsuite/rtl.dg/test.c:7 -1
           (nil))
    ])

    (function "test_2" [
      (note 1 0 5 NOTE_INSN_DELETED)
      ;; etc, insns for function "test_2" go here
      (insn 59 58 0 8 (use (reg/i:SF 21 xmm0)) ../../src/gcc/testsuite/rtl.dg/test.c:31 -1
           (nil))
    ])

  or somesuch (this wouldn't be an rtx code, just something in
  rtl-frontend.c).  The RTL frontend could then compile each function
  in turn after parsing each one (probably the easiest way to deal with
  the global state in the RTL parts of the compiler).

* The RTL frontend doesn't have any knowledge of the name of the
  function, of parameters, types, locals, globals, etc.  It creates a
  single function.  The function is currently hardcoded to have this
  signature:

     int test_1 (int, int, int);

  since there's no syntax for specifying otherwise, and we need to
  provide a FUNCTION_DECL tree when building a function object (by
  calling allocate_struct_function).

* Similarly, there are no types beyond the built-in ones; all
  expressions are treated as being of type int.  I suspect that this
  approach will be too simplistic when it comes to e.g. aliasing.

* There's no support for running more than one pass; fixing this would
  require being able to run passes from a certain point onwards.

* Roundtripping of recognized instructions may be an issue (i.e. those
  with INSN_CODE != -1), such as the "667 {jump}" in the following::

    (jump_insn 50 49 51 10 (set (pc)
            (label_ref:DI 59)) ../../src/test-switch.c:18 667 {jump}
         (nil)
     -> 59)

  since the integer ID can change when the .md files are changed (and
  the associated pattern name is very much target-specific).  It may be
  best to reset them to -1 in the input files (and delete the operation
  name), giving::

    (jump_insn 50 49 51 10 (set (pc)
            (label_ref:DI 59)) ../../src/test-switch.c:18 -1
         (nil)
     -> 59)

* Currently there's no explicit CFG edge information in the dumps.  The
  rtl1 frontend reconstructs the edges based on jump instructions.  As
  I understand the distinction between cfgrtl and cfglayout modes
  https://gcc.gnu.org/wiki/cfglayout_mode , this is OK for "cfgrtl"
  mode, but isn't going to work for "cfglayout" mode - in the latter,
  unconditional jumps are represented purely by edges in the CFG, and
  this information isn't currently present in the dumps (perhaps we
  could add it if it's an issue).

Open Questions
**************

* Register numbering: consider this fragment of RTL emitted during
  expansion::

    (reg/f:DI 82 virtual-stack-vars)

  At the time of emission, register 82 is the
  VIRTUAL_STACK_VARS_REGNUM, and this value is effectively hardcoded
  into the dump.  Presumably this is baking in assumptions about the
  target into the test.  Also, how likely is this value to change?
  When we reload the dump, should we notice that this is tagged with
  virtual-stack-vars and override the specific register number to use
  the current value of VIRTUAL_STACK_VARS_REGNUM on the target rtl1 was
  built for?

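Both dump clean-ups discussed above (resetting INSN_CODE to -1, and
re-deriving a register number from its name tag) can be prototyped
outside the compiler as a text filter over single-line insns.  The
following is only a sketch under stated assumptions, not part of the
patch kit: the function names are made up for illustration, and the
value 90 for virtual-stack-vars is a hypothetical placeholder, since the
real number depends on the target rtl1 was built for.

```python
import re

# HYPOTHETICAL register numbers for the target rtl1 was built for.
# The real value of VIRTUAL_STACK_VARS_REGNUM depends on the target.
CURRENT_REGNUMS = {"virtual-stack-vars": 90}

# Matches a recognized insn code plus its pattern name, e.g. "667 {jump}".
INSN_CODE_RE = re.compile(r"\b\d+ \{[^}]*\}")

# Matches a named register such as "(reg/f:DI 82 virtual-stack-vars)":
# flags and mode, then the number, then the name tag.
NAMED_REG_RE = re.compile(r"\((reg[^\s()]*) (\d+) ([A-Za-z][\w-]*)\)")


def reset_insn_codes(line):
    """Replace "<code> {<name>}" with -1, dropping the pattern name."""
    return INSN_CODE_RE.sub("-1", line)


def renumber_named_regs(line, regnums=CURRENT_REGNUMS):
    """Override the number of any register whose name tag is in regnums."""
    def fix(match):
        head, number, name = match.groups()
        # Names we know nothing about (e.g. hard registers) are left alone.
        number = str(regnums.get(name, number))
        return f"({head} {number} {name})"
    return NAMED_REG_RE.sub(fix, line)


if __name__ == "__main__":
    insn = ("(jump_insn 50 49 51 10 (set (pc) (label_ref:DI 59)) "
            "../../src/test-switch.c:18 667 {jump} (nil) -> 59)")
    print(reset_insn_codes(insn))
    print(renumber_named_regs("(reg/f:DI 82 virtual-stack-vars)"))
```

Doing the renumbering by name keeps hard registers such as
"(reg/i:SI 0 ax)" untouched, since only names present in the mapping are
rewritten.  Whether rtl1 itself should do this at load time remains the
open question above.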
TODO items
**********

* test with other architectures

* roundtrip.exp: strip out comments in source when comparing roundtrip

* example with "-g"

* implement a fuzzer (or use AFL on the existing test cases)


Thoughts?  Hope this looks useful
Dave

David Malcolm (4):
  Make argv const char ** in read_md_files etc
  Move name_to_pass_map into class pass_manager
  Extract deferred-location handling from jit
  Initial version of RTL frontend

 gcc/Makefile.in                                   |    1 +
 gcc/cfgexpand.c                                   |    7 +-
 gcc/deferred-locations.c                          |  240 ++++
 gcc/deferred-locations.h                          |  139 +++
 gcc/emit-rtl.c                                    |   15 +-
 gcc/emit-rtl.h                                    |    2 +
 gcc/errors.c                                      |    4 +-
 gcc/errors.h                                      |   13 +
 gcc/function.c                                    |   41 +-
 gcc/function.h                                    |    2 +-
 gcc/gcc.c                                         |    1 +
 gcc/genattr-common.c                              |    2 +-
 gcc/genattr.c                                     |    2 +-
 gcc/genattrtab.c                                  |    2 +-
 gcc/genautomata.c                                 |    4 +-
 gcc/gencodes.c                                    |    2 +-
 gcc/genconditions.c                               |    2 +-
 gcc/genconfig.c                                   |    2 +-
 gcc/genconstants.c                                |    5 +-
 gcc/genemit.c                                     |    2 +-
 gcc/genenums.c                                    |    5 +-
 gcc/genextract.c                                  |    2 +-
 gcc/genflags.c                                    |    2 +-
 gcc/genmddeps.c                                   |    5 +-
 gcc/genopinit.c                                   |    2 +-
 gcc/genoutput.c                                   |    4 +-
 gcc/genpeep.c                                     |    4 +-
 gcc/genpreds.c                                    |   11 +-
 gcc/genrecog.c                                    |    2 +-
 gcc/gensupport.c                                  |   33 +-
 gcc/gensupport.h                                  |    5 +-
 gcc/gentarget-def.c                               |    2 +-
 gcc/jit/jit-common.h                              |    5 +-
 gcc/jit/jit-playback.c                            |  194 +---
 gcc/jit/jit-playback.h                            |   73 +-
 gcc/pass_manager.h                                |    6 +
 gcc/passes.c                                      |   34 +-
 gcc/print-rtl.c                                   |    4 +-
 gcc/read-md.c                                     |  338 ++++--
 gcc/read-md.h                                     |  158 ++-
 gcc/read-rtl.c                                    |  670 ++++++++++-
 gcc/rtl.c                                         |    2 +
 gcc/rtl.h                                         |    4 +
 gcc/rtl/Make-lang.in                              |  148 +++
 gcc/rtl/config-lang.in                            |   36 +
 gcc/rtl/lang-specs.h                              |   25 +
 gcc/rtl/lang.opt                                  |   38 +
 gcc/rtl/notes.rst                                 |  199 ++++
 gcc/rtl/rtl-errors.c                              |   35 +
 gcc/rtl/rtl-frontend.c                            | 1219 ++++++++++++++++++++
 gcc/testsuite/lib/rtl-dg.exp                      |   64 +
 gcc/testsuite/rtl.dg/dfinit.rtl                   |   90 ++
 gcc/testsuite/rtl.dg/final.rtl                    |   51 +
 gcc/testsuite/rtl.dg/good-include.rtl             |    6 +
 gcc/testsuite/rtl.dg/good-includee.md             |    1 +
 gcc/testsuite/rtl.dg/into-cfglayout.rtl           |   85 ++
 gcc/testsuite/rtl.dg/ira.rtl                      |   84 ++
 gcc/testsuite/rtl.dg/missing-include.rtl          |    1 +
 gcc/testsuite/rtl.dg/pro_and_epilogue.rtl         |   38 +
 gcc/testsuite/rtl.dg/roundtrip/code-labels.rtl    |    2 +
 gcc/testsuite/rtl.dg/roundtrip/frame-pointer.rtl  |    4 +
 gcc/testsuite/rtl.dg/roundtrip/insn-with-mode.rtl |    4 +
 gcc/testsuite/rtl.dg/roundtrip/jump-to-label.rtl  |    6 +
 gcc/testsuite/rtl.dg/roundtrip/jump-to-return.rtl |    6 +
 .../rtl.dg/roundtrip/jump-to-simple-return.rtl    |    6 +
 .../rtl.dg/roundtrip/note-insn-basic-block.rtl    |    1 +
 .../rtl.dg/roundtrip/note-insn-deleted.rtl        |    1 +
 .../rtl.dg/roundtrip/reg-with-orig-regno.rtl      |    4 +
 gcc/testsuite/rtl.dg/roundtrip/roundtrip.exp      |   99 ++
 .../rtl.dg/roundtrip/test-loop.cleaned.rtl        |   75 ++
 .../rtl.dg/roundtrip/test-switch-after-expand.rtl |  202 ++++
 gcc/testsuite/rtl.dg/rtl.exp                      |   43 +
 gcc/testsuite/rtl.dg/test.c                       |   31 +
 gcc/testsuite/rtl.dg/unknown-insn-uid.rtl         |    3 +
 gcc/testsuite/rtl.dg/unknown-rtx-code.rtl         |    1 +
 gcc/testsuite/rtl.dg/vregs.rtl                    |   80 ++
 gcc/toplev.c                                      |    7 +
 gcc/tree-dfa.c                                    |    5 +
 78 files changed, 4229 insertions(+), 524 deletions(-)
 create mode 100644 gcc/deferred-locations.c
 create mode 100644 gcc/deferred-locations.h
 create mode 100644 gcc/rtl/Make-lang.in
 create mode 100644 gcc/rtl/config-lang.in
 create mode 100644 gcc/rtl/lang-specs.h
 create mode 100644 gcc/rtl/lang.opt
 create mode 100644 gcc/rtl/notes.rst
 create mode 100644 gcc/rtl/rtl-errors.c
 create mode 100644 gcc/rtl/rtl-frontend.c
 create mode 100644 gcc/testsuite/lib/rtl-dg.exp
 create mode 100644 gcc/testsuite/rtl.dg/dfinit.rtl
 create mode 100644 gcc/testsuite/rtl.dg/final.rtl
 create mode 100644 gcc/testsuite/rtl.dg/good-include.rtl
 create mode 100644 gcc/testsuite/rtl.dg/good-includee.md
 create mode 100644 gcc/testsuite/rtl.dg/into-cfglayout.rtl
 create mode 100644 gcc/testsuite/rtl.dg/ira.rtl
 create mode 100644 gcc/testsuite/rtl.dg/missing-include.rtl
 create mode 100644 gcc/testsuite/rtl.dg/pro_and_epilogue.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/code-labels.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/frame-pointer.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/insn-with-mode.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/jump-to-label.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/jump-to-return.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/jump-to-simple-return.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/note-insn-basic-block.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/note-insn-deleted.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/reg-with-orig-regno.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/roundtrip.exp
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/test-loop.cleaned.rtl
 create mode 100644 gcc/testsuite/rtl.dg/roundtrip/test-switch-after-expand.rtl
 create mode 100644 gcc/testsuite/rtl.dg/rtl.exp
 create mode 100644 gcc/testsuite/rtl.dg/test.c
 create mode 100644 gcc/testsuite/rtl.dg/unknown-insn-uid.rtl
 create mode 100644 gcc/testsuite/rtl.dg/unknown-rtx-code.rtl
 create mode 100644 gcc/testsuite/rtl.dg/vregs.rtl

-- 
1.8.5.3