Re: [RFC] Migrate pointers to members to the middle end

Michael Matz Wed, 08 Aug 2007 12:17:40 -0700

Hi,

On Tue, 7 Aug 2007, Ollie Wild wrote:


> In response to a suggestion from Mark Mitchell, I've been attempting to 
> migrate pointers to members to the GCC middle end.  The goal of this is 
> twofold: (a) to enable conversion of pointer to member dereferences to 
> direct function calls and member accesses when analysis determines this 
> is unambiguous and (b) to obsolete the need for the expand_constant 
> language hook.
> 
> Under my current approach, I've added the following new nodes to 
> gcc/tree.def:
> 
>   DEFTREECODE (PTRMEM_TYPE, "ptrmem_type", tcc_type, 0)
>   DEFTREECODE (PTRMEM_CST, "ptrmem_cst", tcc_constant, 0)
>   DEFTREECODE (PTRMEM_PLUS_EXPR, "ptrmem_plus_expr", tcc_binary, 2)
>   DEFTREECODE (PTRMEM_REF, "ptrmem_ref", tcc_reference, 2)
> 
> I then modify the C++ front end to instantiate the new nodes, expand
> them inside expand_expr_real_1 and output_constant, and perform
> folding in the various fold-const functions.

So those tree expressions would live throughout the middle-end and only 
then become lowered to RTL directly?  I'm not sure that's worthwhile.  
E.g. I'm not sure why there's a need to really get rid of the 
expand_constant langhook.  It's only important that it isn't called too 
late, i.e. ideally during gimplification.  It seems it only makes use of 
type information which should be available at that time, so if it 
currently is called too late (interfering with LTO in the future) it 
should be possible to move it earlier.

I have a conceptual problem with moving pointer to members into the 
middle-end: my mental model of what the middle-end should be concerned 
about is complete expressions/constants/types, like adding two numbers, 
accessing an integer two words away from that address (i.e. you see I 
already sort of decompose structures in my mental model).  Pointers to 
members is a very different beast: they can't be accessed without a real 
object, yet they themselves can be stored into objects (sort of an 
incomplete memory reference).  If anything they simply resemble offsets 
(perhaps variable ones), so you might perhaps model them as such.  
Conversions between them sometimes require adjustments to 'this', 
resulting in real operations (the delta field of the struct, how pointer 
to member values are currently modelled).  IMHO it would be wrong if we 
didn't make those adjustments explicit in the middle end.
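
To make the "explicit adjustment" point concrete, here is a minimal sketch 
of a hand-lowered pointer to a non-virtual member function.  LoweredPmf, 
B_get_b_thunk, convert_B_to_D and call are hypothetical names invented for 
this illustration; this is not what GCC emits, only a model of the 
function-plus-delta pair described above:

```cpp
#include <cstddef>

struct A { int a = 1; };
struct B { int b = 2; int get_b() { return b; } };
struct D : A, B { int d = 3; };

// Hypothetical lowered form: a plain function taking an explicit 'this',
// plus the delta that must be added to the object address before the call.
struct LoweredPmf {
    int (*fn)(void* self);
    std::ptrdiff_t delta;
};

// Explicit-'this' wrapper around B::get_b.
int B_get_b_thunk(void* self) { return static_cast<B*>(self)->get_b(); }

// Byte offset of the B subobject inside a D object.
std::ptrdiff_t offset_of_B_in_D(D& d) {
    return reinterpret_cast<char*>(static_cast<B*>(&d)) -
           reinterpret_cast<char*>(&d);
}

// Converting "pointer to member of B" into "pointer to member of D" is a
// real operation: the delta grows by the offset of B within D.
LoweredPmf convert_B_to_D(LoweredPmf p, D& d) {
    p.delta += offset_of_B_in_D(d);
    return p;
}

// The call site performs the 'this' adjustment explicitly.
int call(void* obj, LoweredPmf p) {
    return p.fn(static_cast<char*>(obj) + p.delta);
}
```

With a D object d, call(&d, convert_B_to_D({B_get_b_thunk, 0}, d)) should 
yield the same value as calling get_b through a real int (D::*)() pointer, 
and the addition in convert_B_to_D is exactly the kind of operation that 
would be visible to the middle end.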

So, why do you think you need the PTRMEM_TYPE in the middle end?  And why 
the PTRMEM_CST (i.e. why couldn't it be lowered to some explicit constant 
during gimplificaton)?  Same for PTRMEM_PLUS_EXPR, why is (PTR_)PLUS_EXPR 
not enough, if the semantic is only to add the integer argument to the 
pointer argument (is that even an operation which can be done to pointers 
to members?)?  Also PTRMEM_REF seems to equivalent to a normal 
COMPONENT_REF, just that the second operand is a funny "offset" 
specification instead of a simple field decl.
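
As a rough illustration of the "funny offset" view for data members, the 
sketch below recovers a byte offset from a pointer to data member and then 
performs the access with nothing but pointer arithmetic.  member_offset and 
load_int_at are hypothetical helpers written for this example and are not 
GCC interfaces:

```cpp
#include <cstddef>
#include <cstring>

struct Point { int x; int y; };

// Recover the byte offset that a pointer to data member stands for,
// measured on a concrete object of class C.
template <class C>
std::ptrdiff_t member_offset(const C& obj, int C::* pm) {
    return reinterpret_cast<const char*>(&(obj.*pm)) -
           reinterpret_cast<const char*>(&obj);
}

// The access itself then needs only "base address plus offset".
int load_int_at(const void* obj, std::ptrdiff_t off) {
    int v;
    std::memcpy(&v, static_cast<const char*>(obj) + off, sizeof v);
    return v;
}
```

For a Point p, load_int_at(&p, member_offset(p, &Point::y)) reads the same 
value as p.*(&Point::y), which is the sense in which the dereference looks 
like a component access with a computed offset.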

> However, pointers to virtual functions are turning out to be 
> problematic.  As far as I can tell, the middle end has no concept of 
> virtual functions and virtual function tables: they appear to be 
> implemented solely in the C++ front end.  This suggests that a migration 
> of the virtual function machinery is a necessary precondition to pointer 
> to member migration.

Ugh, I wouldn't like that either.  I have the feeling that it would drag 
too many specifics of C++ into the middle end.  After all e.g. the virtual 
tables have to follow a certain layout according to the C++ ABI, which 
needn't be the right one for other languages.  I think you need only one 
feature, namely given a definite class type and an offset into the 
vtable, which definite FUNCTION_DECL that corresponds to.  I can't think 
of many places where you'd like to have this information, as the most 
interesting user of it would be the inliner.  There aren't that many 
transformations which make a former indefinite class type definite, and 
most of them can be done when the C++ frontend is still around to ask.

If you were to implement something like virtual functions into the middle 
end, it should be expressed in a fairly low level way IMHO.  E.g. a 
virtual table simply being a vector of pointers to function decls (which 
we can express already just fine).  That way they could also be written 
out for LTO and read back in, and the question what function decl is 
connected to what slot can also be answered trivially.  Then definite 
class types merely have the characteristic that they can point to such a 
function table, whereas indefinite class types (i.e. those whose runtime 
type can be any derived one) can not.  E.g. I wouldn't try to model the 
inheritance relationship.
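
A minimal sketch of that low-level view, written as ordinary C++ rather 
than as tree nodes.  All names here (VTable, Slot, Object, resolve, 
call_virtual) are hypothetical and the layout is deliberately not the one 
mandated by the C++ ABI; the point is only that a table is a plain vector 
of function pointers and that the slot-to-function question becomes an 
array index:

```cpp
#include <cstddef>

// Hypothetical, hand-lowered object model; none of these names come from GCC.
struct Object;
using Slot = int (*)(Object*);

struct VTable {
    const Slot* slots;    // plain vector of pointers to functions
    std::size_t count;    // number of slots in the table
};

struct Object {
    const VTable* vptr;   // the table this object points to
    int value;
};

int base_get(Object* o)    { return o->value; }
int derived_get(Object* o) { return o->value * 2; }

const Slot base_slots[]    = { base_get };
const Slot derived_slots[] = { derived_get };
const VTable base_vtable    = { base_slots, 1 };
const VTable derived_vtable = { derived_slots, 1 };

// Definite class type: the table is known statically, so "which function
// sits in slot N" is answered by indexing the table.
Slot resolve(const VTable& table, std::size_t slot) {
    return slot < table.count ? table.slots[slot] : nullptr;
}

// Indefinite class type: only the object is known, so the table has to be
// loaded from it at run time before indexing.
int call_virtual(Object* o, std::size_t slot) {
    return o->vptr->slots[slot](o);
}
```

Here resolve(derived_vtable, 0) names derived_get without any run-time 
lookup, which is the information an inliner would want, while call_virtual 
models the case where the dynamic type is unknown.  No inheritance 
relationship is recorded anywhere.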

But even with that I don't really see the need for new tree nodes.  
Pointers to members are a fancy offset, so why not model them as such?  
It's obviously possible I'm missing something; in that case, please 
educate me where the problems are ... :-)


Ciao,
Michael.
