Hi,

On Tue, 7 Aug 2007, Ollie Wild wrote:

> In response to a suggestion from Mark Mitchell, I've been attempting to
> migrate pointers to members to the GCC middle end.  The goal of this is
> twofold: (a) to enable conversion of pointer to member dereferences to
> direct function calls and member accesses when analysis determines this
> is unambiguous and (b) to obsolete the need for the expand_constant
> language hook.
>
> Under my current approach, I've added the following new nodes to
> gcc/tree.def:
>
>   DEFTREECODE (PTRMEM_TYPE, "ptrmem_type", tcc_type, 0)
>   DEFTREECODE (PTRMEM_CST, "ptrmem_cst", tcc_constant, 0)
>   DEFTREECODE (PTRMEM_PLUS_EXPR, "ptrmem_plus_expr", tcc_binary, 2)
>   DEFTREECODE (PTRMEM_REF, "ptrmem_ref", tcc_reference, 2)
>
> I then modify the C++ front end to instantiate the new nodes, expand
> them inside expand_expr_real_1 and output_constant, and perform
> folding in the various fold-const functions.

So those tree expressions would live throughout the middle-end and only
then become lowered to RTL directly?  I'm not sure that's worthwhile.
E.g. I'm not sure why there's a need to really get rid of the
expand_constant langhook.  It's only important that it isn't called too
late, i.e. ideally during gimplification.  It seems it only makes use of
type information which should be available at that time, so if it
currently is called too late (interfering with LTO in the future) it
should be possible to move it earlier.

I have a conceptual problem with moving pointers to members into the
middle-end: my mental model of what the middle-end should be concerned
about is complete expressions/constants/types, like adding two numbers,
or accessing an integer two words away from that address (i.e. you see I
already sort of decompose structures in my mental model).  Pointers to
members are a very different beast: they can't be accessed without a
real object, yet they can be stored into objects themselves (sort of an
incomplete memory reference).
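
To make the "incomplete memory reference" point concrete, here is a
minimal C++ sketch.  All names in it (Point, Selector, load_via_ptrmem,
load_via_offset, demo_ptrmem) are made up for illustration and are not
GCC code.  It stores a pointer to a data member inside another object,
and then shows that applying it to a real object loads the same value as
a plain byte offset from the object's address, at least for a simple
standard-layout struct:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type; any standard-layout struct would do.
struct Point {
    int x;
    int y;
};

// A pointer to member can be stored in an object although it does not
// refer to any particular Point yet.
struct Selector {
    int Point::*field;
};

// Dereference through the language-level pointer to member.
int load_via_ptrmem(const Point &obj, const Selector &sel) {
    return obj.*(sel.field);
}

// Hand-lowered form: object address plus a byte offset, then a load.
int load_via_offset(const Point &obj, std::size_t offset) {
    const char *base = reinterpret_cast<const char *>(&obj);
    return *reinterpret_cast<const int *>(base + offset);
}

// Both forms read the same field once a real object is supplied.
void demo_ptrmem() {
    Point p{3, 4};
    Selector sel{&Point::y};
    assert(load_via_ptrmem(p, sel) ==
           load_via_offset(p, offsetof(Point, y)));
}
```

Nothing here covers pointers to member functions or conversions across
base classes; it only shows the data member case.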
If anything they simply resemble offsets (perhaps variable ones), so
you might perhaps model them as such.  Conversions between them
sometimes require adjustments to 'this', resulting in real operations
(the delta field of the struct, which is how pointer to member values
are currently modelled).  IMHO it would be wrong if we didn't make those
adjustments explicit in the middle end.

So, why do you think you need the PTRMEM_TYPE in the middle end?  And
why the PTRMEM_CST (i.e. why couldn't it be lowered to some explicit
constant during gimplification)?  Same for PTRMEM_PLUS_EXPR: why is
(PTR_)PLUS_EXPR not enough, if the semantics are only to add the integer
argument to the pointer argument (is that even an operation which can be
done to pointers to members?)?  Also PTRMEM_REF seems to be equivalent
to a normal COMPONENT_REF, just that the second operand is a funny
"offset" specification instead of a simple field decl.

> However, pointers to virtual functions are turning out to be
> problematic.  As far as I can tell, the middle end has no concept of
> virtual functions and virtual function tables: they appear to be
> implemented solely in the C++ front end.  This suggests that a migration
> of the virtual function machinery is a necessary precondition to pointer
> to member migration.

Ugh, I wouldn't like that either.  I have the feeling that it would drag
too many specifics of C++ into the middle end.  After all, e.g. the
virtual tables have to follow a certain layout according to the C++ ABI,
which needn't be the right one for other languages.

I think you need only one feature, namely: given a definite class type
and an offset into the vtable, which definite FUNCTION_DECL does that
correspond to.  I can't think of many places where you'd like to have
this information, as the most interesting user of it would be the
inliner.  There aren't that many transformations which make a formerly
indefinite class type definite, and most of them can be done when the
C++ frontend is still around to ask.
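
As a rough sketch of that one query, in C++ rather than in terms of
trees: assume a class of definite type carries a plain table of function
pointers, and a slot index is looked up in it.  The names DefiniteClass,
resolve_slot, base_area, derived_area and demo_vtable are hypothetical
illustrations, not anything that exists in GCC, and the table layout is
an assumption rather than the C++ ABI one:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for "a FUNCTION_DECL": here just an ordinary function pointer.
using Fn = int (*)(int);

int base_area(int w) { return w; }
int derived_area(int w) { return w * w; }

// Hypothetical description of a class whose dynamic type is known exactly.
struct DefiniteClass {
    const char *name;
    std::vector<Fn> vtable;  // slot index -> function
};

// The single query: which concrete function sits in this slot?
// Returns nullptr if the slot does not exist.
Fn resolve_slot(const DefiniteClass &cls, std::size_t slot) {
    return slot < cls.vtable.size() ? cls.vtable[slot] : nullptr;
}

// Once the target is known, an indirect call can become a direct one.
void demo_vtable() {
    DefiniteClass derived{"Derived", {&derived_area}};
    Fn target = resolve_slot(derived, 0);
    assert(target == &derived_area);
    assert(target(3) == 9);
}
```

A user such as the inliner would only ever ask resolve_slot-style
questions; it would not need to know how the table came to be laid out.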
If you were to implement something like virtual functions in the middle
end, it should be expressed in a fairly low level way IMHO.  E.g. a
virtual table simply being a vector of pointers to function decls (which
we can express already just fine).  That way they could also be written
out for LTO and read back in, and the question of which function decl is
connected to which slot can also be answered trivially.  Then definite
class types merely have the characteristic that they can point to such a
function table, whereas indefinite class types (i.e. those whose runtime
type can be any derived one) cannot.  E.g. I wouldn't try to model the
inheritance relationship.

But even with that I don't really see the need for new tree nodes.
Pointers to members are a fancy offset, so why not model them as such?

It's obviously possible I'm missing something; in that case, please
educate me where the problems are ... :-)

Ciao,
Michael.