Re: [PATCH 1/2] ipa-cp: Better representation of aggregate values we clone for

2022-10-17 Thread Martin Jambor
Hi,

thanks for the review.

On Fri, Oct 14 2022, Jan Hubicka wrote:
>>

[...]

>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2022-08-15  Martin Jambor  
>> 
>>  * gcc.dg/ipa/ipcp-agg-11.c: Adjust dumps.
>>  * gcc.dg/ipa/ipcp-agg-8.c: Likewise.
>> ---
>>  gcc/ipa-cp.cc  | 1010 
>>  gcc/ipa-prop.cc|  254 +++---
>>  gcc/ipa-prop.h |  139 +++-
>>  gcc/testsuite/gcc.dg/ipa/ipcp-agg-11.c |4 +-
>>  gcc/testsuite/gcc.dg/ipa/ipcp-agg-8.c  |4 +-
>>  5 files changed, 736 insertions(+), 675 deletions(-)
>> 
>> diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
>> index 543a9334e2c..024f8c06b5d 100644
>> --- a/gcc/ipa-cp.cc
>> +++ b/gcc/ipa-cp.cc
>> @@ -127,6 +127,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "attribs.h"
>>  #include "dbgcnt.h"
>>  #include "symtab-clones.h"
>> +#include 
>>  
>>  template  class ipcp_value;
>>  
>> @@ -455,6 +456,26 @@ ipcp_lattice::is_single_const ()
>>  return true;
>>  }
>>  
>> +/* Return true iff X and Y should be considered equal values by IPA-CP.  */
>> +
>> +static bool
>> +values_equal_for_ipcp_p (tree x, tree y)
>> +{
>> +  gcc_checking_assert (x != NULL_TREE && y != NULL_TREE);
>> +
>> +  if (x == y)
>> +return true;
>> +
>> +  if (TREE_CODE (x) == ADDR_EXPR
>> +  && TREE_CODE (y) == ADDR_EXPR
>> +  && TREE_CODE (TREE_OPERAND (x, 0)) == CONST_DECL
>> +  && TREE_CODE (TREE_OPERAND (y, 0)) == CONST_DECL)
>> +return operand_equal_p (DECL_INITIAL (TREE_OPERAND (x, 0)),
>> +DECL_INITIAL (TREE_OPERAND (y, 0)), 0);
>
> I wonder if we want to handle MEM_REFs here too? They get quite common
> in IPA mode and I think we miss the fixup removing them here.

This patch just moves the function up without modifying it, and I'd like
to do any changes separately, unless they are required for this patch.

And just to be sure, you mean it should cover also the MEM_REF case as
in is_gimple_invariant_address, right?

>> +  else
>> +return operand_equal_p (x, y, 0);
>> +/* Return the item describing a constant stored for INDEX at UNIT_OFFSET or
>> +   NULL if there is no such constant.  */
>> +
>> +const ipa_argagg_value *
>> +ipa_argagg_value_list::get_elt (int index, unsigned unit_offset) const
>> +{
>> +  ipa_argagg_value key;
>> +  key.index = index;
>> +  key.unit_offset = unit_offset;
>> +  const ipa_argagg_value *res
>> += std::lower_bound (m_elts.begin (), m_elts.end (), key,
>> +[] (const ipa_argagg_value &elt,
>> +const ipa_argagg_value &val)
>> +{
>> +  if (elt.index < val.index)
>> +return true;
>> +  if (elt.index > val.index)
>> +return false;
>> +  if (elt.unit_offset < val.unit_offset)
>> +return true;
>> +  return false;
>> +});
>> +
>> +  if (res == m_elts.end ()
>> +  || res->index != index
>> +  || res->unit_offset != unit_offset)
>> +res = nullptr;
>> +
>> +  /* TODO: perhaps remove after some extensive testing? */
>> +  if (!flag_checking)
>> +return res;
>> +
>> +  const ipa_argagg_value *slow_res = NULL;
>> +  int prev_index = -1;
>> +  unsigned prev_unit_offset = 0;
>> +  for (const ipa_argagg_value &av : m_elts)
>> +{
>> +  gcc_assert (prev_index < 0
>> +  || prev_index < av.index
>> +  || prev_unit_offset < av.unit_offset);
>> +  prev_index = av.index;
>> +  prev_unit_offset = av.unit_offset;
>> +  if (av.index == index
>> +  && av.unit_offset == unit_offset)
>> +slow_res = &av;
>> +}
>> +  gcc_assert (res == slow_res);

> So this is just checking that the std::lower_bound works as expected?
> I am just curious if you expect it to break?

It rather checks that the underlying array on which it operates really
is sorted :-)

When I was writing this code I had not carefully checked all the places
where we construct them.  Now I am quite confident they indeed are
always sorted but still thought this would be a useful check against
future errors.  We can remove the test at any time if it ever becomes
too slow.

>> +/* Turn all values that are not present in OTHER into NULL_TREEs.  Return 
>> the
>> +   number of remaining valid entries.  */
>> +
>> +bool
>> +ipa_argagg_value_list::superset_of_p (const ipa_argagg_value_list &other) 
>> const

> It returns bool, so not number of entries.

Umm, that comment was from an entirely different function, fixed.

I also changed the names of local variables this_index and this_offset
to other_index and other_offset because that is what they really are.

>> +/* Push into RES aggregate all stored aggregate values relating to parameter
>> +   with SRC_INDEX as those relating to of DST_INDEX while subtracting
>> +   UNIT_DELTA from the individual unit offsets.  */

Re: [PATCH 1/2] ipa-cp: Better representation of aggregate values we clone for

2022-10-14 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog:
> 
> 2022-08-26  Martin Jambor  
> 
>   * ipa-prop.h (IPA_PROP_ARG_INDEX_LIMIT_BITS): New.
>   (ipcp_transformation): Added forward declaration.
>   (ipa_argagg_value): New type.
>   (ipa_argagg_value_list): New type.
>   (ipa_agg_replacement_value): Removed type.
>   (ipcp_transformation): Switch from using ipa_agg_replacement_value
>   to ipa_argagg_value_list.
>   (ipa_get_agg_replacements_for_node): Removed.
>   (ipa_dump_agg_replacement_values): Removed declaration.
>   * ipa-cp.cc: Include .
>   (values_equal_for_ipcp_p): Moved up in the file.
>   (ipa_argagg_value_list::dump): Likewise.
>   (ipa_argagg_value_list::debug): Likewise.
>   (ipa_argagg_value_list::get_elt): Likewise.
>   (ipa_argagg_value_list::get_elt_for_index): Likewise.
>   (ipa_argagg_value_list::get_value): New overloaded functions.
>   (ipa_argagg_value_list::superset_of_p): New function.
>   (new ipa_argagg_value_list::push_adjusted_values): Likewise.
>   (push_agg_values_from_plats): Likewise.
>   (intersect_argaggs_with): Likewise.
>   (get_clone_agg_value): Removed.
>   (ipa_agg_value_from_node): Make last parameter const, use
>   ipa_argagg_value_list to search values coming from clones.
>   (ipa_get_indirect_edge_target_1): Use ipa_argagg_value_list to search
>   values coming from clones.
>   (ipcp_discover_new_direct_edges): Pass around a vector of
>   ipa_argagg_values rather than a link list of replacement values.
>   (cgraph_edge_brings_value_p): Use ipa_argagg_value_list to search
>   values coming from clones.
>   (create_specialized_node): Work with a vector of ipa_argagg_values
>   rather than a link list of replacement values.
>   (self_recursive_agg_pass_through_p): Make the pointer parameters
>   const.
>   (copy_plats_to_inter): Removed.
>   (intersect_with_plats): Likewise.
>   (agg_replacements_to_vector): Likewise.
>   (intersect_with_agg_replacements): Likewise.
>   (intersect_aggregates_with_edge): Likewise.
>   (push_agg_values_for_index_from_edge): Likewise.
>   (push_agg_values_from_edge): Likewise.
>   (find_aggregate_values_for_callers_subset): Rewrite.
>   (cgraph_edge_brings_all_agg_vals_for_node): Likewise.
>   (ipcp_val_agg_replacement_ok_p): Use ipa_argagg_value_list to search
>   aggregate values.
>   (decide_about_value): Work with a vector of ipa_argagg_values rather
>   than a link list of replacement values.
>   (decide_whether_version_node): Likewise.
>   (ipa_analyze_node): Check number of parameters, assert that there
>   are no descriptors when bailing out.
>   * ipa-prop.cc (ipa_set_node_agg_value_chain): Switch to a vector of
>   ipa_argagg_value.
>   (ipa_node_params_t::duplicate): Removed superfluous handling of
>   ipa_agg_replacement_values.  Name of src parameter removed because
>   it is no longer used.
>   (ipcp_transformation_t::duplicate): Replaced duplication of
>   ipa_agg_replacement_values with copying vector m_agg_values.
>   (ipa_dump_agg_replacement_values): Removed.
>   (write_ipcp_transformation_info): Stream the new data-structure
>   instead of the old.
>   (read_ipcp_transformation_info): Likewise.
>   (adjust_agg_replacement_values): Work with ipa_argagg_values instead
>   of linked lists of ipa_agg_replacement_values, copy the items and
>   truncate the vector as necessary to keep it sorted instead of marking
>   items as invalid.  Return one bool if CFG should be updated.
>   (ipcp_modif_dom_walker): Store ipcp_transformation instead of
>   linked list of ipa_agg_replacement_values.
>   (ipcp_modif_dom_walker::before_dom_children): Use
>   ipa_argagg_value_list instead of walking a list of
>   ipa_agg_replacement_values.
>   (ipcp_transform_function): Switch to the new data structure, adjust
>   dumping.
> 
> gcc/testsuite/ChangeLog:
> 
> 2022-08-15  Martin Jambor  
> 
>   * gcc.dg/ipa/ipcp-agg-11.c: Adjust dumps.
>   * gcc.dg/ipa/ipcp-agg-8.c: Likewise.
> ---
>  gcc/ipa-cp.cc  | 1010 
>  gcc/ipa-prop.cc|  254 +++---
>  gcc/ipa-prop.h |  139 +++-
>  gcc/testsuite/gcc.dg/ipa/ipcp-agg-11.c |4 +-
>  gcc/testsuite/gcc.dg/ipa/ipcp-agg-8.c  |4 +-
>  5 files changed, 736 insertions(+), 675 deletions(-)
> 
> diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
> index 543a9334e2c..024f8c06b5d 100644
> --- a/gcc/ipa-cp.cc
> +++ b/gcc/ipa-cp.cc
> @@ -127,6 +127,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "attribs.h"
>  #include "dbgcnt.h"
>  #include "symtab-clones.h"
> +#include 
>  
>  template  class ipcp_value;
>  
> @@ -455,6 +456,26 @@ ipcp_lattice::is_single_const ()
>  return true;
>  }
>  
> +/* Return true iff X and Y should be considered e

[PATCH 1/2] ipa-cp: Better representation of aggregate values we clone for

2022-08-30 Thread Martin Jambor
Hi,

this patch replaces linked lists of ipa_agg_replacement_value with
vectors of similar structures called ipa_argagg_value and simplifies
how we compute them in the first place.  Having a vector should also
result in less overhead when allocating and because we keep it sorted,
it leads to logarithmic searches.

The slightly obnoxious "argagg" bit in the name can be changed into
"agg" after the next patch removes our current ipa_agg_value type.

The patch also introduces type ipa_argagg_value_list which serves as a
common view into a vector of ipa_argagg_value structures regardless
whether they are stored in GC memory (required for IPA-CP
transformation summary because we store trees) or in an auto_vec which
is hopefully usually only allocated on stack.

The calculation of aggreagete costant values for a given subsert of
callers is then rewritten to compute known constants for each
edge (some pruning to skip obviously not needed is still employed and
should not be really worse than what I am replacing) and these vectors
are there intersected, which can be done linearly since they are
sorted.  The patch also removes a lot of heap allocations of small
lists of aggregate values and replaces them with stack based
auto_vecs.

As Richard Sandiford suggested, I use std::lower_bound from
 rather than re-implementing bsearch for array_slice.  The
patch depends on the patch which adds the ability to construct
array_slices from gc-allocated vectors.

Bootstrapped, LTO-bootstrapped and tested on x86_64-linux (and a
slightly older version also on aarch64-linux).  LTO-profiledbootstrap is
currently underway.  Given the size of the patch I assume there will be
concerns/questions but I'm looking for an approval to commit a version
of this.

Thanks,

Martin



gcc/ChangeLog:

2022-08-26  Martin Jambor  

* ipa-prop.h (IPA_PROP_ARG_INDEX_LIMIT_BITS): New.
(ipcp_transformation): Added forward declaration.
(ipa_argagg_value): New type.
(ipa_argagg_value_list): New type.
(ipa_agg_replacement_value): Removed type.
(ipcp_transformation): Switch from using ipa_agg_replacement_value
to ipa_argagg_value_list.
(ipa_get_agg_replacements_for_node): Removed.
(ipa_dump_agg_replacement_values): Removed declaration.
* ipa-cp.cc: Include .
(values_equal_for_ipcp_p): Moved up in the file.
(ipa_argagg_value_list::dump): Likewise.
(ipa_argagg_value_list::debug): Likewise.
(ipa_argagg_value_list::get_elt): Likewise.
(ipa_argagg_value_list::get_elt_for_index): Likewise.
(ipa_argagg_value_list::get_value): New overloaded functions.
(ipa_argagg_value_list::superset_of_p): New function.
(new ipa_argagg_value_list::push_adjusted_values): Likewise.
(push_agg_values_from_plats): Likewise.
(intersect_argaggs_with): Likewise.
(get_clone_agg_value): Removed.
(ipa_agg_value_from_node): Make last parameter const, use
ipa_argagg_value_list to search values coming from clones.
(ipa_get_indirect_edge_target_1): Use ipa_argagg_value_list to search
values coming from clones.
(ipcp_discover_new_direct_edges): Pass around a vector of
ipa_argagg_values rather than a link list of replacement values.
(cgraph_edge_brings_value_p): Use ipa_argagg_value_list to search
values coming from clones.
(create_specialized_node): Work with a vector of ipa_argagg_values
rather than a link list of replacement values.
(self_recursive_agg_pass_through_p): Make the pointer parameters
const.
(copy_plats_to_inter): Removed.
(intersect_with_plats): Likewise.
(agg_replacements_to_vector): Likewise.
(intersect_with_agg_replacements): Likewise.
(intersect_aggregates_with_edge): Likewise.
(push_agg_values_for_index_from_edge): Likewise.
(push_agg_values_from_edge): Likewise.
(find_aggregate_values_for_callers_subset): Rewrite.
(cgraph_edge_brings_all_agg_vals_for_node): Likewise.
(ipcp_val_agg_replacement_ok_p): Use ipa_argagg_value_list to search
aggregate values.
(decide_about_value): Work with a vector of ipa_argagg_values rather
than a link list of replacement values.
(decide_whether_version_node): Likewise.
(ipa_analyze_node): Check number of parameters, assert that there
are no descriptors when bailing out.
* ipa-prop.cc (ipa_set_node_agg_value_chain): Switch to a vector of
ipa_argagg_value.
(ipa_node_params_t::duplicate): Removed superfluous handling of
ipa_agg_replacement_values.  Name of src parameter removed because
it is no longer used.
(ipcp_transformation_t::duplicate): Replaced duplication of
ipa_agg_replacement_values with copying vector m_agg_values.
(ipa_dump_agg_replacement_values): Removed.
(write_ipcp_transforma