Re: Cgraph Modification Plan

Jan Hubicka Wed, 05 Sep 2012 22:14:27 -0700

> On Wed, Sep 5, 2012 at 5:41 PM, Lawrence Crowl <[email protected]> wrote:
> > On 9/5/12, Xinliang David Li <[email protected]> wrote:
> >> On Sep 5, 2012 Jan Hubicka <[email protected]> wrote:
> >> > OK, the basic idea is that symtab_node is basetype of
> >> > cgraph_node and varpool_node.  We may want to drop the historica
> >> > cgraph/varpool names here, since function_node/variable_node
> >> > would sound better. Cgraph still exists, but varpool is more
> >> > or less historical relic.
> >>
> >> The cgraph redesign probably deserves more discussion.
> >
> > Hence, the post!
> >
> >> 1) It may be worthwhile to abstract the graph manipulation code
> >> into a utility class which is templatized.
> >>
> >> graph<T>, node<T> with node inheriting from T.
> >
> > While I won't dispute the statement, getting from here to there
> > will likely involve some intermediate steps.
> 
> Yes. My point is that the intermediate steps should be driven by the
> final goal. It is important to first have  the design of final class
> hierarchy and interfaces nailed and then the intermediate steps
> defined instead of the other way around.


Yes, it seems resonable to me. We just need to keep in mind the specifics
callgraph has (i.e. many types of edges that are relevant in different
contextes and have good reason for different kind of in memory reppresentation)
There are also different types of nodes - functions, variables, aliases. I plan
to add labels in 4.8 timeframe, too).
> 
> >
> >> 2) Introduce a global symbol table containing a function table
> >> and a global variable table. The function table should replace
> >> the current cgraph node link list, and the variable table replaces
> >> the varpool.  The symbol table should provide basic interfaces to
> >> do named based lookup, traversal, alias handling etc.  I noticed
> >> trunk already has some of that -- but it seems more abstraction
> >> is needed.
> >
> > Do you mean moving away from a pointer-based approach?
> 
> See above. I mean it is important to have well defined symtab
> interfaces that hide implementation as much as possible. It will make
> the interfaces more stable. It is currently quite difficult for
> cgraph/varpool related changes in gcc branches to keep up with trunk
> without stable core APIs.

Well, the APIs did not changed much over years (LTO brought in some more busy
changes, but many were just hacks that needed redesign anyway), in 4.8 we
changed quite a lot because of introduction of the symbol table. It would be
really nice to get as much as C++ conversion into 4.8, too, since we broke most
of existing patches anyway, to get more stability frm 4.9 up.

> >> 3) it seems natural to drop 'node' in the naming scheme
> >>
> >> symbol (symtab_entry) --> base class;
> >> function --> derived from symbol;  (It conflicts with the existing
> >> struct function though);
> >> variable --> derived from symbol;
> >>
> >> typedef node<function> cnode;
> >
> > A node<function> is not a derived class of node<symbol> even when
> > function is derived from symbol.  That property is helpful in
> > ensuring usable type safety.
> 
> 
> We don't need node<symbol> -- only node<function> is needed, and it is
> derived from function and function is derived from symbol.

Yes, this is where I expected to land.  I expect to see more complex inheritance
tree.  Currently we have:

symbol
  |
  +-function (cgraph)
  |
  +-variable (varpool)

While function can have five types - just external declarations, actual
definitions with body (analyzed flag is set), thunk, alias and virtual clone

Similarly variables are declarations, ones with body (analyzed flag is set),
and aliases.

We do not represent labels and const_decl references. We ought: those are also
symbols and they are important for partitioning.

We probably want rither structure here in longer term:

symbol
  |
  +-function declaration
  | |
  | +-function definition
  | |
  | +-function alternative entry point (thunk)
  | |
  | +-function alias
  | |
  | +-virtual clone
  |
  +-variable declaration
  | |
  | +-variable definition (possibly constant pool references via const_decl 
here, too)
  | |
  | +-variable alias
  |
  +-function label definition

In particular we do not want to store all the stuff we store for function
definitions for declarations.

variable/function aliases do have a lot in common (similarly for
function/variable definitions), too. The reason why I did not introduce
separate alias_node type is that function aliases are different from variable
aliases in several ways. For example, they can not be destinations of function
calls.
> 
> >
> >> 4) coding convention is needed for functions that do 'casting'
> >> and 'trial casting'.
> >
> > The functions I have suggestion are ptr_... and try_... respectively.
> > Are you suggesting amending the coding conventions?
> >
> 
> Is this something agreed upon in previous discussions or something
> newly brought up? How about using  template function in base to do
> casting:
> 
> trial cast:
> 
> template<typename T> T* symbol::cast_to(symbol* p) {
>    if (p->is<T>())
>       return static_cast<T*>(p);
>    return 0;
>  }
> 
> cast:
> 
> template<typename T> T& symbol:as(symbol* p) {
>    assert(p->is<T>())
>    return static_cast<T&>(*p);
> 
>  }

This also seems fine to me (shorter API names are preferable for me, even if I
still conisder <...> in code disturbing ;)

This is not the only case, it really depends on conding standards we employ for
GCC datastructures in general.  I suppose cgraph/varpool is just firt of several
conversions we plan to do in foreseeable future.

Another thing that calls for C++ API is the way we annotate cgraph/varpool with
additional information in the IPA passes.  I.e. the whole area using node/edge
creation/duplicatoin/deletion hooks.

Honza

Re: Cgraph Modification Plan

Reply via email to