On Tue, 2013-11-05 at 14:18 -0700, Jeff Law wrote:
> On 10/31/13 10:26, David Malcolm wrote:
> > The gimple statement types are currently implemented using a hand-coded
> > C inheritance scheme, with a "union gimple_statement_d" holding the
> > various possible structs for a statement.
> >
> > The following series of patches convert it to a C++ hierarchy, using the
> > existing structs, eliminating the union. The "gimple" typedef changes
> > from being a
> >   (union gimple_statement_d *)
> > to being a:
> >   (struct gimple_statement_base *)
> >
> > There are no virtual functions in the new code: the sizes of the various
> > structs are unchanged.
> >
> > It makes use of "is-a.h", using the as_a <T> template function to
> > perform downcasts, which are checked (via gcc_checking_assert) in an
> > ENABLE_CHECKING build, and are simple casts in an unchecked build,
> > albeit in an inlined function rather than a macro.
> >
> > For example, one can write:
> >
> >   gimple_statement_phi *phi =
> >     as_a <gimple_statement_phi> (gsi_stmt (gsi));
> >
> > and then directly access the fields of the phi, as a phi. The existing
> > accessor functions in gimple.h become somewhat redundant in this
> > scheme, but are preserved.
> >
> > The earlier versions of the patches made all of the types GTY((user))
> > and provided hand-written implementations of the gc and pch marker
> > routines. In this new version we rely on the support for simple
> > inheritance that I recently added to gengtype, by adding a "desc"
> > to the GTY marking for the base class, and a "tag" to the marking
> > for all of the concrete subclasses. (I say "class", but all the types
> > remain structs since their fields are all publicly accessible).
> >
> > As noted in the earlier patch, I believe this is a superior scheme to
> > the C implementation:
> >
> >   * We can get closer to compile-time type-safety, checking the gimple
> >     code once and downcasting with an as_a, then directly accessing
> >     fields, rather than going through accessor functions that check
> >     each time. In some places we may want to replace a "gimple" with
> >     a subclass e.g. phis are always of the phi subclass, to get full
> >     compile-time type-safety.
> >
> >   * This scheme is likely to be easier for newbies to understand.
> >
> >   * Currently in gdb, dereferencing a gimple leads to screenfuls of text,
> >     showing all the various union values. With this, you get just the base
> >     class, and can cast it to the appropriate subclass.
> >
> >   * With this, we're working directly with the language constructs,
> >     rather than rolling our own, and thus other tools can better
> >     understand the code. (e.g. doxygen).
> >
> > Again, as noted in the earlier patch series, the names of the structs
> > are rather verbose. I would prefer to also rename them all to eliminate
> > the "_statement" component:
> >   "gimple_statement_base" -> "gimple_base"
> >   "gimple_statement_phi" -> "gimple_phi"
> >   "gimple_statement_omp" -> "gimple_omp"
> > etc, but I didn't do this to minimize the patch size. But if the core
> > maintainers are up for that, I can redo the patch series with that
> > change also, or do that as a followup.
> >
> > The patch is in 6 parts; all of them are needed together.
> And that's part of the problem. There's understandable resistance to
> (for example) the as_a casting.
>
> There's a bit of natural tension between the desire to keep patches
> small and self-contained and the size/scope of the changes necessary to
> do any serious reorganization work. This set swings too far in the
> latter direction :-)
>
> Is there any way to go forward without the is_a/as_a stuff? ie, is
> there a simpler step towards where we're trying to go that allows
> most of this to go forward now rather than waiting?
>
> >
> >   * Patch 1 of 6: This patch adds inheritance to the various gimple
> >     types, eliminating the initial baseclass fields, and eliminating the
> >     union gimple_statement_d. All the types remain structs. They
> >     become marked with GTY(()), gaining GSS_ tag values.
> >
> >   * Patch 2 of 6: This patch ports various accessor functions within
> >     gimple.h to the new scheme.
> >
> >   * Patch 3 of 6: This patch is autogenerated by "refactor_gimple.py"
> >     from https://github.com/davidmalcolm/gcc-refactoring-scripts
> >     There is a test suite "test_refactor_gimple.py" which may give a
> >     clearer idea of the changes that the script makes (and add
> >     confidence that it's doing the right thing).
> >     The patch converts code of the form:
> >       {
> >         GIMPLE_CHECK (gs, SOME_CODE);
> >         gimple_subclass_get/set_some_field (gs, value);
> >       }
> >     to code of this form:
> >       {
> >         some_subclass *stmt = as_a <some_subclass> (gs);
> >         stmt->some_field = value;
> >       }
> >     It also autogenerates specializations of
> >         is_a_helper <T>::test
> >     equivalent to a GIMPLE_CHECK() for use by is_a and as_a.
> Conceptually I'm fine with #1-#3.
>
> >
> >   * Patch 4 of 6: This patch implements further specializations of
> >     is_a_helper <T>::test, for gimple_has_ops and gimple_has_mem_ops.
> Here's where I start to get more concerned.

Thanks for looking through this.

Both you and Andrew objected to my use of the is-a.h stuff. Is this due
to the use of C++ templates in that code? If I were to rewrite things
in a more C idiom, would that be acceptable?

For instance, rather than, say:

  p = as_a <gimple_statement_asm> (
        gimple_build_with_ops (GIMPLE_ASM, ERROR_MARK,
                               ninputs + noutputs + nclobbers + nlabels));

we could have an inlined as_a equivalent in C syntax:

  p = gimple_as_a_gimple_asm (
        gimple_build_with_ops (GIMPLE_ASM, ERROR_MARK,
                               ninputs + noutputs + nclobbers + nlabels));

where there could be, say, a pair of functions like this (to handle
const vs non-const):

  inline gimple_asm
  gimple_as_a_gimple_asm (gimple gs)
  {
    GIMPLE_CHECK (gs, GIMPLE_ASM);
    return (gimple_asm)gs;
  }

  inline const_gimple_asm
  gimple_as_a_gimple_asm (const_gimple gs)
  {
    GIMPLE_CHECK (gs, GIMPLE_ASM);
    return (const_gimple_asm)gs;
  }

(where typedef gimple_statement_asm *gimple_asm)

That would avoid template usage within the patch, leaving the use of
C++ inheritance as the only overtly C++ish aspect.

We could do the above using preprocessor magic, but I'd prefer to have
actual code to do it.

Similarly, instead of:

  const gimple_statement_with_ops *ops_stmt =
    dyn_cast <const gimple_statement_with_ops> (g);
  if (!ops_stmt)
    return NULL;

we could have:

  const_gimple_with_ops ops_stmt = gimple_dyn_cast_gimple_with_ops (g);
  if (!ops_stmt)
    return NULL;

> >   * Patch 5 of 6: This patch does the rest of porting from union access
> >     to subclass access (all the fiddly places that the script in patch 3
> >     couldn't handle).
> >
> >   * Patch 6 of 6: This patch updates the gdb python pretty-printing
> >     hook.
> Conceptually #5 and #6 shouldn't be terribly controversial.
(...though they're implicitly using the template specializations from #3
and #4)

> The question is can we move forward without patch #4, even if that means
> we aren't getting the static typechecking we want?
Maybe. If the above idea is still too far, we could keep the
GIMPLE_CHECK checking, and cast by hand. I suspect the results would be
more ugly (though it's clear that beauty is in the eye of the beholder
here :))

BTW, how do you feel about static_cast<> vs C-style casts?

Thanks
Dave
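
For illustration only, the template-free wrappers discussed above could be sketched roughly as follows. This is a toy model, not the GCC code: every sketch_* name is a hypothetical stand-in for the real gimple types, and a plain assert stands in for GIMPLE_CHECK.

```cpp
// Toy model of a tagged statement hierarchy with template-free cast helpers.
// All sketch_* names are hypothetical stand-ins, not real GCC declarations.
#include <cassert>
#include <cstddef>

enum sketch_code { SKETCH_ASM, SKETCH_PHI };

// Base "statement": carries the code tag used for the runtime check.
struct sketch_stmt_base
{
  sketch_code code;
};

// One subclass, using plain inheritance and no virtual functions.
struct sketch_stmt_asm : sketch_stmt_base
{
  unsigned ninputs;
};

typedef sketch_stmt_base *sketch_stmt;
typedef const sketch_stmt_base *const_sketch_stmt;
typedef sketch_stmt_asm *sketch_asm;
typedef const sketch_stmt_asm *const_sketch_asm;

// as_a-style helper: check the code once, then downcast.
// assert stands in for GIMPLE_CHECK here.
inline sketch_asm
sketch_as_a_asm (sketch_stmt gs)
{
  assert (gs->code == SKETCH_ASM);
  return static_cast <sketch_asm> (gs);
}

// Const overload of the same helper.
inline const_sketch_asm
sketch_as_a_asm (const_sketch_stmt gs)
{
  assert (gs->code == SKETCH_ASM);
  return static_cast <const_sketch_asm> (gs);
}

// dyn_cast-style helper: returns NULL instead of asserting on a mismatch.
inline const_sketch_asm
sketch_dyn_cast_asm (const_sketch_stmt gs)
{
  return gs->code == SKETCH_ASM ? static_cast <const_sketch_asm> (gs) : NULL;
}
```

In this sketch a C-style cast in place of static_cast <> would behave the same way, since the hierarchy uses single, non-virtual inheritance; the choice is purely stylistic here.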