Re: [PATCH] New IPA-CP with real function cloning - updated version

2011-07-15 Thread Jan Hubicka
> Thanks, however, I'm not confident committing this on Friday when I'm
> going to be offline until Monday noon :-) Moreover, I already have a

I will be flying back to Europe, so you even can't push responsibility to me :)

> newer version that should handle aliases to thunks.  The changes since
> the yesterday's version are:
> 
>   - initialize_node_lattices asserting node has body
>   - propagate_constants_accross_call more sensible with regard to
> thunks (and aliases).
>   - gather_caller_stats and has_undead_caller_from_outside_scc_p
> (hooks to cgraph_for_node_and_aliases) are careful about aliases
> to thunks too.
>   - all cgraph_function_or_thunk_node turned into cgraph_function_node
> because we should be propagating across these calls fine now.
> 
> All these changes are only in ipa-cp.c so only that file is verbatim
> below now (and a traditional context diff in an attachment).  I have
> bootstrapped and profiledbootstrapped and tested a very similar patch
> on x86_64-linux, bootstrap and profiledbootstrap of exactly this patch
> underway.  I hope this will be OK for trunk on Monday too if both
> pass.

These changes seems fine.  I declare the patch OK on Monday then ;))

Honza


Re: [PATCH] New IPA-CP with real function cloning

2011-07-15 Thread Martin Jambor
Hi,

On Thu, Jul 14, 2011 at 11:28:32PM +0200, Jan Hubicka wrote:
> > 
> > Well, technically they survive until after inlining (because of
> > indirect inlining which also derives information from the lattices
> > corresponding to node->inlined_to node.  Results of arithmetic
> > functions are not going to be accessed during inlining when compiling
> > any reasonable program but...
> 
> Hmm, this sounds bad.  We should move it to GTY then incrementally.  I however
> though that indirect inlining looks only at jump functions, not at lattices?

You are right, that is however basically an omission of mine - it was
supposed to handle the case below but I do not look into the
lattices.  I will fix that later.

Nevertheless, the calls to ipa_cst_from_jfunc from
ipa-inline-analyzis.c do access the values of lattices.  Potentially
all of them.  So I will GTY lattices too.

Thanks,

Martin


/* Verify that simple indirect calls are inlined even without early
   inlining..  */
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining"  } */

extern void non_existent(int);

int __attribute__ ((noinline,noclone)) get_input(void)
{
  return 1;
}

static void hooray ()
{
  non_existent (1);
}

void hip2 (void (*g)())
{
  non_existent (2);
  g ();
}

static void
__attribute__ ((noinline))
hip1 (void (*f)(void (*)()), void (*g)())
{
  non_existent (2);
  f (g);
}

int main (int argc, int *argv[])
{
  int i;

  for (i = 0; i < get_input (); i++)
hip1 (hip2, hooray);
  return 0;
}

/* { dg-final { scan-ipa-dump "hooray\[^\\n\]*inline copy in main" "inline"  } 
} */
/* { dg-final { scan-ipa-dump "hip2\[^\\n\]*inline copy in main" "inline"  } } 
*/
/* { dg-final { cleanup-ipa-dump "inline" } } */


Re: [PATCH] New IPA-CP with real function cloning - updated version

2011-07-14 Thread Jan Hubicka
> 2011-07-14  Martin Jambor  
> 
>   * ipa-prop.h: Include alloc-pool.h, all sorts of updates to general
>   comments.
>   (ipcp_values_pool): Declare.
>   (ipcp_sources_pool): Likewise.
>   (ipcp_lattice): Changed to forward declaration.
>   (ipa_param_descriptor): Removed fields ipcp_lattice, types and
>   cannot_devirtualize.
>   (ipa_node_params): New fields descriptors, lattices, known_vals,
>   clone_for_all_contexts and node dead, removed fields params and
>   count_scale.
>   (ipa_set_param_count): Removed.
>   (ipa_get_param_count): Made to work with descriptors vector.
>   (ipa_get_param): Updated.
>   (ipa_param_cannot_devirtualize_p): Removed.
>   (ipa_param_types_vec_empty): Likewise.
>   (ipa_set_param_used): New function.
>   (ipa_get_param_used): Updated to use descriptors vector.
>   (ipa_func_list): Removed.
>   (ipa_init_func_list): Removed declaration.
>   (ipa_push_func_to_list_1): Likewise.
>   (ipa_pop_func_from_list): Likewise.
>   (ipa_push_func_to_list): Removed.
>   (ipa_lattice_from_jfunc): Remove declaration.
>   (ipa_get_jf_pass_through_result): Declare.
>   (ipa_get_jf_ancestor_result): Likewise.
>   (ipa_value_from_jfunc): Likewise.
>   (ipa_get_lattice): Update.
>   (ipa_lat_is_single_const): New function.
>   * ipa-prop.c (ipa_push_func_to_list_1): Removed.
>   (ipa_init_func_list): Likewise.
>   (ipa_pop_func_from_list): Likewise.
>   (ipa_get_param_decl_index): Fix coding style.
>   (count_formal_params): Removed.
>   (count_formal_params_1): Renamed to count_formal_params.
>   (ipa_populate_param_decls): Update to use descriptors vector.
>   (ipa_initialize_node_params): Likewise.
>   (visit_ref_for_mod_analysis): Use ipa_set_param_used.
>   (ipa_analyze_params_uses): Likewise.
>   (ipa_free_node_params_substructures): Likewise and free also lattices
>   and known values.
>   (duplicate_array): Removed.
>   (ipa_edge_duplication_hook): Add the new edge to the list of edge
>   clones.
>   (ipa_node_duplication_hook): Update to use new lattices.
>   (ipa_free_all_structures_after_ipa_cp): Free alloc pools.
>   (ipa_free_all_structures_after_iinln): Likewise.
>   (ipa_write_node_info): Update to use new lattices.
>   (ipa_read_node_info): Likewise.
>   (ipa_get_jf_pass_through_result): New function.
>   (ipa_get_jf_ancestor_result): Likewise.
>   (ipa_value_from_jfunc): Likewise.
>   (ipa_cst_from_jfunc): Reimplemented using ipa_value_from_jfunc.
>   * ipa-cp.c: Reimplemented.
>   * params.def (PARAM_DEVIRT_TYPE_LIST_SIZE): Removed.
>   (PARAM_IPA_CP_VALUE_LIST_SIZE): New parameter.
>   (PARAM_IPA_CP_EVAL_THRESHOLD): Likewise.
>   * Makefile.in (IPA_PROP_H): Added alloc-pool.h to dependencies.
> 
>   * doc/invoke.texi (devirt-type-list-size): Removed description.
>   (ipa-cp-value-list-size): Added description.
> 
>   * testsuite/gcc.dg/ipa/ipa-1.c: Updated testcase dump scan.
>   * testsuite/gcc.dg/ipa/ipa-2.c: Likewise.
>   * testsuite/gcc.dg/ipa/ipa-3.c: Likewise and made functions static.
>   * testsuite/gcc.dg/ipa/ipa-4.c: Updated testcase dump scan.
>   * testsuite/gcc.dg/ipa/ipa-5.c: Likewise.
>   * testsuite/gcc.dg/ipa/ipa-7.c: Likewise.
>   * testsuite/gcc.dg/ipa/ipa-8.c: Updated testcase dump scan.
>   * testsuite/gcc.dg/ipa/ipacost-1.c: Likewise.
>   * testsuite/gcc.dg/ipa/ipacost-2.c: Likewise and increased sizes
>   of some functions.
>   * testsuite/gcc.dg/ipa/ipcp-1.c: New test.
>   * testsuite/gcc.dg/ipa/ipcp-2.c: Likewise.
>   * testsuite/gcc.dg/tree-ssa/ipa-cp-1.c: Updated testcase.
> 
> 
> /* ipa_node_params stores information related to formal parameters of 
> functions
>and some other information for interprocedural passes that operate on
>parameters (such as ipa-cp).  */
> 
> struct ipa_node_params
> {
>   /* Information about individual formal parameters that are gathered when
>  summaries are generated. */
>   VEC (ipa_param_descriptor_t, heap) *descriptors;
>   /* Pointer to an array of structures describing individual formal
>  parameters.  */
>   struct ipcp_lattice *lattices;

Hmm, I was under impression that the plan was to break away the stuff used by
ipa-cp internally during the propagatoin stage (i.e. ipcp_orig_node/known_vals
and probably lattices from the stuff used to represent parameters and jump
functions, i.e. descriptors.

But this can be handled incrementally.

>   /* FIXME: Can we clobber only the first argument of thunks?  */

Well, we should know how to propagate through it.  But it is not too important 
side case I guess
untill we can devirtualize to them effectively.
>   if (node->alias || node->thunk.thunk_p
>   || ipa_is_called_with_var_arguments (info))
> disable = true;

The patch is OK, thanks!
It would be nice to add

Re: [PATCH] New IPA-CP with real function cloning

2011-07-14 Thread Jan Hubicka
> 
> Well, technically they survive until after inlining (because of
> indirect inlining which also derives information from the lattices
> corresponding to node->inlined_to node.  Results of arithmetic
> functions are not going to be accessed during inlining when compiling
> any reasonable program but...

Hmm, this sounds bad.  We should move it to GTY then incrementally.  I however
though that indirect inlining looks only at jump functions, not at lattices?
> > 
> > __attribute__ ((really_bad_attribute))
> > function (int param)
> > {
> >   use param
> > }
> > 
> > and then let ipa-cp to invent param is a constant.
> 
> what would be such a "really_bad_attribute" ? 

Well, we need to go through the attribute list and prepare list of "bad guys"
instead of forbidding any attribute.
Obviously we should not give up on "pure" attribute, for example.

Another class of attributes are those referring to function arguments that needs
updating if they are still in use after clonning. I think this was original 
reason
for adding the check.

I am not sure if we have attributes that should prevent clonning completely.
> > > > 
> > > > can_change_sigunature will also handle apply_args.
> > > > Add VA_START code there, too.  For the other use of this flag (in i386) 
> > > > VA_START
> 
> The last one already is VA_START... or do you mean a different one?

I meant the code in ipa-inline-analysis.c doing the same checks but skiiping 
va_start since
i386 backend tests it by itself.
> > BTW currently the edges from thunks miss any profile info.
> > (i.e. it will be 0). I guess we ought to make ipa-profile pass to estimate 
> > those
> > (it is difficult ot measure count of an alias).
> > 
> 
> I'm not really looking at the edges from thunks to the actual
> function.  OTOH, I assume that edges to a thunk do have a count and
> look at that.

They do have (unless they are thunk to thunk edges), but in any case we ought to
regularize things here, sooner or later someone will get confused with counts
missing in the callgraph.
> 
> > If you walk only hot edges, then you need to make your function descent into
> > both alias refs and thunks calls, or the aliases of thunks will not be seen
> > then.
> 
> Well, looking at bits of the patch again now, aliases to thunks might
> indeed be a problem for a few pieces of it.  I'll send the patch
> nevertheless and ponder about this problem later.

Hmm, please do :)
I will look at the updated patch.

Honza
> 
> Thanks,
> 
> Martin


Re: [PATCH] New IPA-CP with real function cloning

2011-07-14 Thread Jan Hubicka
> > >   if (dec < cs->count)
> > >   cs->count -= dec;
> > >   else
> > >   cs->count = 0;
> > > }
> > > 
> > >   if (dump_file)
> > > dump_profile_updates (orig_node, new_node);
> > > }
> > > 
> > >   if (node->local.can_change_signature)
> > > {
> > >   args_to_skip = BITMAP_GGC_ALLOC ();
> > >   for (i = 0; i < count; i++)
> > >   {
> > > tree t = VEC_index (tree, known_vals, i);
> > > 
> > > if ((t && TREE_CODE (t) != TREE_BINFO)
> > > || !ipa_is_param_used (info, i))
> > >   bitmap_set_bit (args_to_skip, i);
> > >   }
> > > }
> > >   else
> > > args_to_skip = NULL;
> > When we can't change signature, still we can set is_parm_unused flag for 
> > the callee
> > to aid later optimizers.
> 
> I assume I can re-use the node->local.can_change_signature flag?  Is
> that supposed to be set at any given place or can IPA-CP do it on its own?

can_change_signature is currently used by i386 backend and it is set by inliner.
I plan to move it to visibility pass at the time local functions are dentified.
So yes, you can assume it is set and up to date at the time IPA-CP is run.

Honza
> 
> > 
> > Rest of patch looks OK. It definitely reads better than previous ipa-cp.c ;)
> > I suppose we will need to get some experience with the logic deciding 
> > whether to clone..
> 
> 
> Thanks, I'll post the current version in a moment.
> 
> Martin


Re: [PATCH] New IPA-CP with real function cloning

2011-07-14 Thread Martin Jambor
Hi,

like with the previous mail, I'll reply to the comments here and then
post a new version of the patch in a separate thread.

On Sun, Jul 10, 2011 at 07:04:21PM +0200, Jan Hubicka wrote:
> > 
> > /* If checking is enabled, verify that no lattice is in the TOP state, i.e. 
> > not
> >bottom, not containing a variable component and without any known value 
> > at
> >the same time.  */
> > 
> > static void
> > verify_propagated_values (void)
> > {
> > #ifdef ENABLE_CHECKING
> 
> Hmm, would not be better to keep this function around to be called from 
> debugger, like
> other verifiers do?

OK.

> >   struct cgraph_node *node;
> > 
> >   FOR_EACH_DEFINED_FUNCTION (node)
> > {
> >   struct ipa_node_params *info = IPA_NODE_REF (node);
> >   int i, count = ipa_get_param_count (info);
> > 
> >   for (i = 0; i < count; i++)
> > {
> >   struct ipcp_lattice *lat = ipa_get_lattice (info, i);
> > 
> >   if (!lat->bottom
> >   && !lat->contains_variable
> >   && lat->values_count == 0)
> > {
> >   if (dump_file)
> > {
> >   fprintf (dump_file, "\nIPA lattices after constant "
> >"propagation:\n");
> >   print_all_lattices (dump_file, true, false);
> > }
> > 
> >   gcc_unreachable ();
> > }
> > }
> > }
> > #endif
> > }
> > 
> > /* Propagate values through a pass-through jump function JFUNC associated 
> > with
> >edge CS, taking values from SRC_LAT and putting them into DEST_LAT.  
> > SRC_IDX
> >is the index of the source parameter.  */
> > 
> > static bool
> > propagate_vals_accross_pass_through (struct cgraph_edge *cs,
> >  struct ipa_jump_func *jfunc,
> >  struct ipcp_lattice *src_lat,
> >  struct ipcp_lattice *dest_lat,
> >  int src_idx)
> > {
> >   struct ipcp_value *src_val;
> >   bool ret = false;
> > 
> >   if (jfunc->value.pass_through.operation == NOP_EXPR)
> > for (src_val = src_lat->values; src_val; src_val = src_val->next)
> >   ret |= add_value_to_lattice (dest_lat, src_val->value, cs,
> >src_val, src_idx);
> >   else if (edge_within_scc (cs))
> > ret = set_lattice_contains_variable (dest_lat);
> 
> Hmm, is the reason for not using artimetic within SCC documented somewhere?
> It seems like code someone will eventually run into and remove without much 
> of consideration.

I added a comment.

> > 
> > /* Calculate devirtualization time bonus for NODE, assuming we know 
> > KNOWN_CSTS
> >and KNOWN_BINFOS.  */
> > 
> > static int
> > devirtualization_time_bonus (struct cgraph_node *node,
> >  VEC (tree, heap) *known_csts,
> >  VEC (tree, heap) *known_binfos)
> 
> Eventually it would be nice to make this common with inliner analysis.  We 
> want to 
> increase inlining limits here too

True.

> >   /* Only bare minimum benefit for clearly un-inlineable targets.  */
> >   res += 1;
> >   callee = cgraph_get_node (target);
> >   if (!callee)
> > continue;
> 
> Hmm, when you test it here, you might probably ask if callee is analyzed and 
> inlinable then ;)

Good point, did that.

> > 
> > /* Return true if cloning NODE is a good idea, given the estimated 
> > TIME_BENEFIT
> >and SIZE_COST and with the sum of frequencies of incoming edges to the
> >potential new clone in FREQUENCIES.  */
> > 
> > static bool
> > good_cloning_opportunity_p (struct cgraph_node *node, int time_benefit,
> > int freq_sum, gcov_type count_sum, int size_cost)
> > {
> >   if (time_benefit == 0
> >   || !flag_ipa_cp_clone
> >   || !optimize_function_for_speed_p (DECL_STRUCT_FUNCTION (node->decl)))
> > return false;
> 
> Still I think cloning oppurtunity is good if the saving on call size reduce 
> enough
> to pay back for function body duplication.
> It would probably make sense then to create simple wrapper functions instead 
> of
> duplicating the function body.

I was already doing that for considering IPA-CP opportunities for all
known contexts - and the effect is more profound still now when we
consider moving costs.

On the other side, when opportunities for function specialization are
concerned, size is always considered a cost and never a benefit when
doing the calculations.  This simplifies a the code a bit because the
size savings are on the side of the caller and so it depends on the
number of call sites that actually get redirected.  That is not known
when estimating size but only we did specialization decisions for all
the callers and so it would need to be an extra thing stored along
values.  I think that since such calls should be picked up by
inlining, it's not really worth the added data and complexity.  But I
could add it later, sure.

> 
> > 
> >   gcc_checking

Re: [PATCH] New IPA-CP with real function cloning

2011-07-14 Thread Martin Jambor
Hi,

I'll send a new version of IPA-CP incorporating most of the feedback
in a new thread but let me also comment on some of the points here:

On Fri, Jul 08, 2011 at 08:24:31PM +0200, Jan Hubicka wrote:
> > > > {
> > > >   /* Pointer to an array of structures describing individual formal
> > > >  parameters.  */
> > > >   struct ipcp_lattice *lattices;
> > > 
> > > Hmm, how we get here around the need to mark this GTY(). I.e are we sure 
> > > that all the known_vals
> > > must be referneced from elsewhere at ggc time?
> > 
> > (Scalar) constants that are results of arithmetic jump functions may
> > not be referenced from elsewhere, everything else is referenced from
> > the jump functions.  If it is a problem it is already present in the
> > current IPA-CP.  ipa_node_params and lattices are not GTYed there
> > either.
> 
> Hmm, I guess it is not really problem only because the lattices are used
> only in ipa-cp so the values do not really live across GGC call.

Well, technically they survive until after inlining (because of
indirect inlining which also derives information from the lattices
corresponding to node->inlined_to node.  Results of arithmetic
functions are not going to be accessed during inlining when compiling
any reasonable program but...

> > > > 
> > > > /* ipa_edge_args stores information related to a callsite and 
> > > > particularly its
> > > >arguments.  It can be accessed by the IPA_EDGE_REF macro.  */
> > > > typedef struct GTY(()) ipa_edge_args
> > > 
> > > probably edge_summary would be my preferred name.
> > 
> > Ugh, this is the current name, we may change it later.  In any event
> > the name should probably tell that the summary is about parameters.
> 
> Hmm, OK, it is not bad name after all.

:-)

> > > 
> > > >   || !inline_summary (node)->versionable
> > > >   || TYPE_ATTRIBUTES (TREE_TYPE (node->decl))
> > > 
> > > This is getting an issue for Fortran that attach the function arguments 
> > > quite commonly,
> > > perhaps we should start moving ahead on this and ruling out only the 
> > > arguments that
> > > gets can't be preserved.
> > 
> > Yes, for example in 433.milc basically all the functions are
> > considered not versionable because of this.  I also thought of not
> > removing parameters for such functions.
> > 
> > > Also you need to test attributes of DECL itself.
> > 
> > Really?  We are not doing it now, nether for IPA-CP nor for IPA-SRA
> > (which is good at triggering problems with these).
> 
> Hmm, all the ipa-inline code checks DECL_ATTRIBUTES only.  I believe the type
> attributes ends up being copied to decl attributes but not vice versa.
> I guess the testcase should be
> 
> __attribute__ ((really_bad_attribute))
> function (int param)
> {
>   use param
> }
> 
> and then let ipa-cp to invent param is a constant.

what would be such a "really_bad_attribute" ? 

> > 
> > > 
> > > I think this all should be part of can_change_signature_p code, since we 
> > > can version
> > > with all attributes I can thunk of when we don't change signature.
> > > 
> > > >   || cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
> > > > res =  false;
> > > >   else
> > > > /* Removing arguments doesn't work if the function takes varargs
> > > >or use __builtin_apply_args.
> > > >FIXME: handle this together with can_change_signature flag.  */
> > > > for (edge = node->callees; edge; edge = edge->next_callee)
> > > >   {
> > > > tree t = edge->callee->decl;
> > > > if (DECL_BUILT_IN_CLASS (t) == BUILT_IN_NORMAL
> > > > && (DECL_FUNCTION_CODE (t) == BUILT_IN_APPLY_ARGS
> > > > || DECL_FUNCTION_CODE (t) == BUILT_IN_VA_START))
> > > 
> > > can_change_sigunature will also handle apply_args.
> > > Add VA_START code there, too.  For the other use of this flag (in i386) 
> > > VA_START

The last one already is VA_START... or do you mean a different one?

> > > rules out the change, too, but so far we explicitely tested that in the 
> > > backend.
> > > 
> > > Iguess with these changes, inline_summary (node)->versionable should 
> > > coincide
> > > with IPA_NODE_REF (node)->node_versionable making this whole code 
> > > unnecesary
> > > (it was supposed to work this way, I am not sure why we now have to 
> > > versionable
> > > flags).
> > 
> > That would be nice.  I was wondering why the difference between the
> > two.  Again, I am yet to decide whether this should be done as a
> > followup or within this change.
> 
> OK.  

BTW, it will be a followup change.

> > > If you use for_node_thunk_and_aliases you can remove the recursion you do 
> > > above
> > > and it will work for aliases of thunk correctly, too ;)
> > 
> > But I wouldn't be able to check the edge for hotness.
> 
> BTW currently the edges from thunks miss any profile info.
> (i.e. it will be 0). I guess we ought to make ipa-profile pass to estimate 
> those
> (it is difficult ot measure count of an alias).
> 

I'm

Re: [PATCH] New IPA-CP with real function cloning

2011-07-10 Thread Jan Hubicka
> 
> /* If checking is enabled, verify that no lattice is in the TOP state, i.e. 
> not
>bottom, not containing a variable component and without any known value at
>the same time.  */
> 
> static void
> verify_propagated_values (void)
> {
> #ifdef ENABLE_CHECKING

Hmm, would not be better to keep this function around to be called from 
debugger, like
other verifiers do?
>   struct cgraph_node *node;
> 
>   FOR_EACH_DEFINED_FUNCTION (node)
> {
>   struct ipa_node_params *info = IPA_NODE_REF (node);
>   int i, count = ipa_get_param_count (info);
> 
>   for (i = 0; i < count; i++)
>   {
> struct ipcp_lattice *lat = ipa_get_lattice (info, i);
> 
> if (!lat->bottom
> && !lat->contains_variable
> && lat->values_count == 0)
>   {
> if (dump_file)
>   {
> fprintf (dump_file, "\nIPA lattices after constant "
>  "propagation:\n");
> print_all_lattices (dump_file, true, false);
>   }
> 
> gcc_unreachable ();
>   }
>   }
> }
> #endif
> }
> 
> /* Propagate values through a pass-through jump function JFUNC associated with
>edge CS, taking values from SRC_LAT and putting them into DEST_LAT.  
> SRC_IDX
>is the index of the source parameter.  */
> 
> static bool
> propagate_vals_accross_pass_through (struct cgraph_edge *cs,
>struct ipa_jump_func *jfunc,
>struct ipcp_lattice *src_lat,
>struct ipcp_lattice *dest_lat,
>int src_idx)
> {
>   struct ipcp_value *src_val;
>   bool ret = false;
> 
>   if (jfunc->value.pass_through.operation == NOP_EXPR)
> for (src_val = src_lat->values; src_val; src_val = src_val->next)
>   ret |= add_value_to_lattice (dest_lat, src_val->value, cs,
>  src_val, src_idx);
>   else if (edge_within_scc (cs))
> ret = set_lattice_contains_variable (dest_lat);

Hmm, is the reason for not using artimetic within SCC documented somewhere?
It seems like code someone will eventually run into and remove without much of 
consideration.
> 
> /* Calculate devirtualization time bonus for NODE, assuming we know KNOWN_CSTS
>and KNOWN_BINFOS.  */
> 
> static int
> devirtualization_time_bonus (struct cgraph_node *node,
>VEC (tree, heap) *known_csts,
>VEC (tree, heap) *known_binfos)

Eventually it would be nice to make this common with inliner analysis.  We want 
to 
increase inlining limits here too
>   /* Only bare minimum benefit for clearly un-inlineable targets.  */
>   res += 1;
>   callee = cgraph_get_node (target);
>   if (!callee)
>   continue;

Hmm, when you test it here, you might probably ask if callee is analyzed and 
inlinable then ;)
> 
> /* Return true if cloning NODE is a good idea, given the estimated 
> TIME_BENEFIT
>and SIZE_COST and with the sum of frequencies of incoming edges to the
>potential new clone in FREQUENCIES.  */
> 
> static bool
> good_cloning_opportunity_p (struct cgraph_node *node, int time_benefit,
>   int freq_sum, gcov_type count_sum, int size_cost)
> {
>   if (time_benefit == 0
>   || !flag_ipa_cp_clone
>   || !optimize_function_for_speed_p (DECL_STRUCT_FUNCTION (node->decl)))
> return false;

Still I think cloning oppurtunity is good if the saving on call size reduce 
enough
to pay back for function body duplication.
It would probably make sense then to create simple wrapper functions instead of
duplicating the function body.

> 
>   gcc_checking_assert (size_cost >= 0);
> 
>   /* FIXME:  These decisions need tuning.  */
>   if (max_count)
> {
>   int evaluation, factor = (count_sum * 1000) / max_count;
> 
>   evaluation = (time_benefit * factor) / size_cost;
> 
>   if (dump_file && (dump_flags & TDF_DETAILS))
>   fprintf (dump_file, " good_cloning_opportunity_p (time: %i, "
>"size: %i, count_sum: " HOST_WIDE_INT_PRINT_DEC
>") -> evaluation: %i, threshold: %i\n",
>time_benefit, size_cost, (HOST_WIDE_INT) count_sum,
>evaluation, 500);
> 
>   return evaluation > 500;

Hmm, the magic numbers looks a bit scary... putting size and time into fraction
causes problem that the units are not really related.

But we will see.  I guess we will need the 500 as --param value however.

We probably want ipa-profile to collect expected running time of the unit
and then we can do something like computing relative speedup of unit versus
relative growth of it that is sort of better defined  and closer to
temperature metrics. (but of course off reality)
> }
>   else
> {
>   int evaluation = (time_benefit * freq_sum) / size_cost;
> 
>   if (dump_file && (dump_flags & TDF_DETAILS))
>  

Re: [PATCH] New IPA-CP with real function cloning

2011-07-08 Thread Jan Hubicka
> > > /* Structure holding data required to describe a pass-through jump 
> > > function.  */
> > > 
> > > struct GTY(()) ipa_pass_through_data
> > > {
> > >   /* If an operation is to be performed on the original parameter, this 
> > > is the
> > >  second (constant) operand.  */
> > >   tree operand;
> > >   /* Number of the caller's formal parameter being passed.  */
> > >   int formal_id;
> > 
> > I probably should use this in ipa-inline-analsysi where I call it for some 
> > reason operand_num :)
> 
> So far I have resisted the urge to rename this but it pre-dates my
> involvement with gcc.  I'd like it to be called parm_index but since
> we might want to use it also for global variables and we might need
> something more complex for also handling parts of aggregates, I left
> it for later.

parm_index sounds to me good, too.  formal_id is the name used in paper
so it makes sense, but its meaning is unobvious.
We could rename it later, together with ipa-inline-analysis one
(I think consistency here is more important).

Yep, we will have to see how the jfuncs will look like once they
reffer to global vars and parts of agregates..
> 
> > 
> > > 
> > > struct ipcp_value;
> > 
> > I wonder if the jump functions used by several passes and ipcp
> > internal types both has o go into the same header?
> 
> Well, I originally wanted ipa-prop to provide services to the outside
> world like ipa-inline-analysis.c and to have ipa-cp self-contained.
> But I guess the data separation is more important.  So if we move
> ipa_cst_from_jfunc (and its 3 friends) to ipa-cp and move
> ipcp_lattice.decl and ipcp_lattice.used to a special structure, I
> might move ipcp_value_source, ipcp_value, and ipcp_lattice altogether
> to ipa-cp.

I also think that having ipa-prop as a generic module for propagating&
looking into args is the goal.
Separating datastructures form ipa-cp definitely makes this more obvious.

I've done that to the inliner, too. inline_summary is now all about the
function body size/time estimates and the inliner heuristics do have
their own datastructures in their own space.
> 
> At the moment I'm not sure whether I want to do this as a followup
> patch or incorporate it in the changes.  I think I'll start with the
> latter and revert to the former if it is too invasive to parts which
> have not been touched so far by the change.

Lets see, I am happy with the patch going in with current organization of
datastructures if we move into privatizing it later.
> > > {
> > >   /* Pointer to an array of structures describing individual formal
> > >  parameters.  */
> > >   struct ipcp_lattice *lattices;
> > 
> > Hmm, how we get here around the need to mark this GTY(). I.e are we sure 
> > that all the known_vals
> > must be referneced from elsewhere at ggc time?
> 
> (Scalar) constants that are results of arithmetic jump functions may
> not be referenced from elsewhere, everything else is referenced from
> the jump functions.  If it is a problem it is already present in the
> current IPA-CP.  ipa_node_params and lattices are not GTYed there
> either.

Hmm, I guess it is not really problem only because the lattices are used
only in ipa-cp so the values do not really live across GGC call.

Well, this will be solved by separating out the ipa-cp datastructures, so
it is not a problem.
> 
> > I would also slowly switch those things to VECtors..
> 
> Perhaps, but individual lattices are and always have been accessed
> through ipa_get_lattice which checks bounds and so there's no big
> reason to do that.

Yep, I added the bounds check when I was debugging.  Not big deal,
just we sort of do have agreement using our VECtor API where it fits..
> 
> > 
> > >   /* Only for versioned nodes this field would not be NULL,
> > >  it points to the node that IPA cp cloned from.  */
> > >   struct cgraph_node *ipcp_orig_node;
> > Why not use node->clone_of here?
> 
> That would not work if the node was a clone created by some other
> pass.  I need to differentiate between clones I create because they do
> not have lattices but do have the exact values for individual
> parameters in known_vals (which is NULL otherwise).  I should probably
> use a flag though, the code I ended up only checks it for NULL anyway.

Hmm, OK, either flag or keeping the pointer is fine.
> > > 
> > > /* ipa_edge_args stores information related to a callsite and 
> > > particularly its
> > >arguments.  It can be accessed by the IPA_EDGE_REF macro.  */
> > > typedef struct GTY(()) ipa_edge_args
> > 
> > probably edge_summary would be my preferred name.
> 
> Ugh, this is the current name, we may change it later.  In any event
> the name should probably tell that the summary is about parameters.

Hmm, OK, it is not bad name after all.
> 
> > 
> > > {
> > >   /* Next pointer in a linked list of clones of the same function.  */
> > >   struct cgraph_edge *next_edge_clone;
> > 
> > What this is needed for?
> 
> For get_info_about_necessary_edges and ga

Re: [PATCH] New IPA-CP with real function cloning

2011-07-08 Thread Martin Jambor
Hi,

On Thu, Jul 07, 2011 at 06:03:07PM +0200, Jan Hubicka wrote:
> Hi,
> patch is long, so let me review it in more passes.

Fair enough.

> > 
> > 
> > 2011-06-22  Martin Jambor  
> > 
> > * ipa-prop.h: Include alloc-pool.h.
> > (ipa_lattice_type): Removed.
> > (ipcp_value_source): New type.
> > (ipcp_value): Likewise.
> > (ipcp_values_pool): Declare.
> > (ipcp_sources_pool): Likewise.
> > (ipa_param_descriptor): Removed.
> > (ipcp_lattice): Removed fileds type and constant. Added fields decl,
> > values, values_count, contains_variable, bottom, used and virt_call.
> > (ipa_node_params): New fields lattices, known_vals,
> > clone_for_all_contexts and noe dead, removed fields params and
> > count_scale.
> > (ipa_get_param): Updated.
> > (ipa_param_cannot_devirtualize_p): Removed.
> > (ipa_param_types_vec_empty): Likewise.
> > (ipa_edge_args): New field next_edge_clone.
> > (ipa_func_list): Removed.
> > (ipa_init_func_list): Removed declaration.
> > (ipa_push_func_to_list_1): Likewise.
> > (ipa_pop_func_from_list): Likewise.
> > (ipa_push_func_to_list): Removed.
> > (ipa_lattice_from_jfunc): Remove declaration.
> > (ipa_get_jf_pass_through_result): Declare.
> > (ipa_get_jf_ancestor_result): Likewise.
> > (ipa_value_from_jfunc): Likewise.
> > (ipa_get_lattice): Update.
> > (ipa_lat_is_single_const): New function.
> > * ipa-prop.c (ipa_push_func_to_list_1): Removed.
> > (ipa_init_func_list): Likewise.
> > (ipa_pop_func_from_list): Likewise.
> > (ipa_get_param_decl_index): Fix coding style.
> > (ipa_populate_param_decls): Update to use new lattices.
> > (ipa_initialize_node_params): Likewise.
> > (visit_ref_for_mod_analysis): Likewise.
> > (ipa_analyze_params_uses): Likewise.
> > (ipa_free_node_params_substructures): Likewise.
> > (ipa_edge_duplication_hook): Add the new edge to the list of edge
> > clones.
> > (ipa_node_duplication_hook): Update to use new lattices.
> > (ipa_free_all_structures_after_ipa_cp): Free alloc pools.
> > (ipa_free_all_structures_after_iinln): Likewise.
> > (ipa_write_node_info): Update to use new lattices.
> > (ipa_read_node_info): Likewise.
> > (ipa_get_jf_pass_through_result): New function.
> > (ipa_get_jf_ancestor_result): Likewise.
> > (ipa_value_from_jfunc): Likewise.
> > (ipa_cst_from_jfunc): Reimplemented using ipa_value_from_jfunc.
> > * ipa-cp.c: Reimplemented.
> > * params.def (PARAM_DEVIRT_TYPE_LIST_SIZE): Removed.
> > (PARAM_IPA_CP_VALUE_LIST_SIZE): New parameter.
> > * Makefile.in (IPA_PROP_H): Added alloc-pool.h to dependencies.
> > 
> > * doc/invoke.texi (devirt-type-list-size): Removed description.
> > (ipa-cp-value-list-size): Added description.
> > 
> > * testsuite/gcc.dg/ipa/ipa-1.c: Updated testcase dump scan.
> > * testsuite/gcc.dg/ipa/ipa-2.c: Likewise.
> > * testsuite/gcc.dg/ipa/ipa-3.c: Likewise and made functions static.
> > * testsuite/gcc.dg/ipa/ipa-4.c: Updated testcase dump scan.
> > * testsuite/gcc.dg/ipa/ipa-5.c: Likewise.
> > * testsuite/gcc.dg/ipa/ipa-7.c: Xfail test.
> > * testsuite/gcc.dg/ipa/ipa-8.c: Updated testcase dump scan.
> > * testsuite/gcc.dg/ipa/ipacost-1.c: Likewise.
> > * testsuite/gcc.dg/ipa/ipacost-2.c: Likewise.
> > * testsuite/gcc.dg/ipa/ipcp-1.c: New test.
> > * testsuite/gcc.dg/ipa/ipcp-2.c: Likewise.
> > * testsuite/gcc.dg/tree-ssa/ipa-cp-1.c: Updated testcase.
> 
> > /* Interprocedural analyses.
> >Copyright (C) 2005, 2007, 2008, 2009, 2010
> 2011
> >Free Software Foundation, Inc.
> > 
> > 
> > /* The following definitions and interfaces are used by
> >interprocedural analyses or parameters.  */
> > 
> > /* ipa-prop.c stuff (ipa-cp, indirect inlining):  */
> 
> I was bit thinking about it and probably we could make ipa-prop
> and ipa-inline-analysis to be stand alone analysis passes, instead of
> something called either from inliner or ipa-cp analysis stage. But
> that could be done incrementally.

As I said in the first introductory mail, the summary generation part
is not really affected by this patch in any serious way.

> 
> > 
> > /* A jump function for a callsite represents the values passed as actual
> >arguments of the callsite. There are three main types of values :
> > 
> >Pass-through - the caller's formal parameter is passed as an actual
> >   argument, possibly one simple operation performed on it.
> >Constant - a constant (is_gimple_ip_invariant)is passed as an actual
> >   argument.
> >Unknown  - neither of the above.
> > 
> >IPA_JF_CONST_MEMBER_PTR stands for C++ member pointers, it is a special
> >constant in this regard.  Other constants are represented with 
> > IPA_JF_CONST.
> 
> While we are at docs, I would bit expand. It seems to me that for someone not 
> familiar
> with the

Re: [PATCH] New IPA-CP with real function cloning

2011-07-07 Thread Jan Hubicka
Hi,
patch is long, so let me review it in more passes.
> 
> 
> 2011-06-22  Martin Jambor  
> 
>   * ipa-prop.h: Include alloc-pool.h.
>   (ipa_lattice_type): Removed.
>   (ipcp_value_source): New type.
>   (ipcp_value): Likewise.
>   (ipcp_values_pool): Declare.
>   (ipcp_sources_pool): Likewise.
>   (ipa_param_descriptor): Removed.
>   (ipcp_lattice): Removed fileds type and constant. Added fields decl,
>   values, values_count, contains_variable, bottom, used and virt_call.
>   (ipa_node_params): New fields lattices, known_vals,
>   clone_for_all_contexts and noe dead, removed fields params and
>   count_scale.
>   (ipa_get_param): Updated.
>   (ipa_param_cannot_devirtualize_p): Removed.
>   (ipa_param_types_vec_empty): Likewise.
>   (ipa_edge_args): New field next_edge_clone.
>   (ipa_func_list): Removed.
>   (ipa_init_func_list): Removed declaration.
>   (ipa_push_func_to_list_1): Likewise.
>   (ipa_pop_func_from_list): Likewise.
>   (ipa_push_func_to_list): Removed.
>   (ipa_lattice_from_jfunc): Remove declaration.
>   (ipa_get_jf_pass_through_result): Declare.
>   (ipa_get_jf_ancestor_result): Likewise.
>   (ipa_value_from_jfunc): Likewise.
>   (ipa_get_lattice): Update.
>   (ipa_lat_is_single_const): New function.
>   * ipa-prop.c (ipa_push_func_to_list_1): Removed.
>   (ipa_init_func_list): Likewise.
>   (ipa_pop_func_from_list): Likewise.
>   (ipa_get_param_decl_index): Fix coding style.
>   (ipa_populate_param_decls): Update to use new lattices.
>   (ipa_initialize_node_params): Likewise.
>   (visit_ref_for_mod_analysis): Likewise.
>   (ipa_analyze_params_uses): Likewise.
>   (ipa_free_node_params_substructures): Likewise.
>   (ipa_edge_duplication_hook): Add the new edge to the list of edge
>   clones.
>   (ipa_node_duplication_hook): Update to use new lattices.
>   (ipa_free_all_structures_after_ipa_cp): Free alloc pools.
>   (ipa_free_all_structures_after_iinln): Likewise.
>   (ipa_write_node_info): Update to use new lattices.
>   (ipa_read_node_info): Likewise.
>   (ipa_get_jf_pass_through_result): New function.
>   (ipa_get_jf_ancestor_result): Likewise.
>   (ipa_value_from_jfunc): Likewise.
>   (ipa_cst_from_jfunc): Reimplemented using ipa_value_from_jfunc.
>   * ipa-cp.c: Reimplemented.
>   * params.def (PARAM_DEVIRT_TYPE_LIST_SIZE): Removed.
>   (PARAM_IPA_CP_VALUE_LIST_SIZE): New parameter.
>   * Makefile.in (IPA_PROP_H): Added alloc-pool.h to dependencies.
> 
>   * doc/invoke.texi (devirt-type-list-size): Removed description.
>   (ipa-cp-value-list-size): Added description.
> 
>   * testsuite/gcc.dg/ipa/ipa-1.c: Updated testcase dump scan.
>   * testsuite/gcc.dg/ipa/ipa-2.c: Likewise.
>   * testsuite/gcc.dg/ipa/ipa-3.c: Likewise and made functions static.
>   * testsuite/gcc.dg/ipa/ipa-4.c: Updated testcase dump scan.
>   * testsuite/gcc.dg/ipa/ipa-5.c: Likewise.
>   * testsuite/gcc.dg/ipa/ipa-7.c: Xfail test.
>   * testsuite/gcc.dg/ipa/ipa-8.c: Updated testcase dump scan.
>   * testsuite/gcc.dg/ipa/ipacost-1.c: Likewise.
>   * testsuite/gcc.dg/ipa/ipacost-2.c: Likewise.
>   * testsuite/gcc.dg/ipa/ipcp-1.c: New test.
>   * testsuite/gcc.dg/ipa/ipcp-2.c: Likewise.
>   * testsuite/gcc.dg/tree-ssa/ipa-cp-1.c: Updated testcase.

> /* Interprocedural analyses.
>Copyright (C) 2005, 2007, 2008, 2009, 2010
2011
>Free Software Foundation, Inc.
> 
> 
> /* The following definitions and interfaces are used by
>interprocedural analyses or parameters.  */
> 
> /* ipa-prop.c stuff (ipa-cp, indirect inlining):  */

I was bit thinking about it and probably we could make ipa-prop
and ipa-inline-analysis to be stand alone analysis passes, instead of
something called either from inliner or ipa-cp analysis stage. But
that could be done incrementally.

> 
> /* A jump function for a callsite represents the values passed as actual
>arguments of the callsite. There are three main types of values :
> 
>Pass-through - the caller's formal parameter is passed as an actual
>   argument, possibly one simple operation performed on it.
>Constant - a constant (is_gimple_ip_invariant)is passed as an actual
>   argument.
>Unknown  - neither of the above.
> 
>IPA_JF_CONST_MEMBER_PTR stands for C++ member pointers, it is a special
>constant in this regard.  Other constants are represented with 
> IPA_JF_CONST.

While we are at docs, I would bit expand. It seems to me that for someone not 
familiar
with the concept is not clear at all why member pointers are special.
(i.e. explain that they are non-gimple-regs etc.)

> 
>IPA_JF_ANCESTOR is a special pass-through jump function, which means that
>the result is an address of a part of the object pointed to by the formal
>parameter to wh