On Sun, Nov 14, 2021 at 10:53 AM Jan Hubicka via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> > >
> > > I think you want get_addr_base_and_unit_offset here.  All
> > > variable indexed addresses are in separate stmts.  That also means
> > > you can eventually work with just byte sizes/offsets?
> >
> > Will do.  The access range in the modref summary is bit based (since we want
> > to disambiguate bitfields like we do in the rest of the alias oracle), but
> > indeed this part can be in bytes.
>
> Actually after the unification I can just use get_ao_ref, which calls
> ao_ref_init_from_ptr_and_range; that has all the logic I need.
> I also noticed that I had duplicated the code matching bases
> and ranges, which already appears twice in the function: once for store
> targets and once for MEMSET and friends.  The latter copy lacked overflow
> checks, so I took the first copy and moved it into a helper function.  This
> makes the gimple part of the patch really straightforward: just build an
> ao_ref if possible and then pass it to this function.
>
> I also added statistics.
>
> I have bootstrapped/regtested the updated patch on x86_64-linux and
> committed it so I can break out the patches that depend on it.
> I have a patch improving the kill tracking on the modref side, and the
> kill oracle itself can also use fnspec so it does not need to special-case
> the mem* functions.
>
> For the cc1plus LTO link I now get:
>
> Alias oracle query stats:
>   refs_may_alias_p: 76106130 disambiguations, 100928932 queries
>   on_includes: 12539931 disambiguations, 39864841 queries
>   ref_maybe_used_by_call_p: 625857 disambiguations, 77138089 queries
>   call_may_clobber_ref_p: 366420 disambiguations, 369293 queries
>   stmt_kills_ref_p: 107503 kills, 5699589 queries
>   nonoverlapping_component_refs_p: 0 disambiguations, 26176 queries
>   nonoverlapping_refs_since_match_p: 30339 disambiguations, 65400 must 
> overlaps, 96698 queries
>   aliasing_component_refs_p: 57500 disambiguations, 15464678 queries
>   TBAA oracle: 28248334 disambiguations 104710521 queries
>                15220245 are in alias set 0
>                8905994 queries asked about the same object
>                98 queries asked about the same alias set
>                0 access volatile
>                50371110 are dependent in the DAG
>                1964740 are aritificially in conflict with void *
>
> Modref stats:
>   modref kill: 52 kills, 6655 queries
>   modref use: 25204 disambiguations, 692151 queries
>   modref clobber: 2309709 disambiguations, 21877806 queries
>   5320532 tbaa queries (0.243193 per modref query)
>   761785 base compares (0.034820 per modref query)
>
> PTA query stats:
>   pt_solution_includes: 12539931 disambiguations, 39864841 queries
>   pt_solutions_intersect: 1713075 disambiguations, 14023484 queries
>
> Newly we also get statistics for the kill oracle itself:
>   stmt_kills_ref_p: 107503 kills, 5699589 queries
> and for the modref part:
>   modref kill: 52 kills, 6655 queries
> So that is an improvement over the 1 kill using modref I had before.  Still not
> really great.
>
> Honza
>
> gcc/ChangeLog:
>
>             * ipa-modref-tree.c (modref_access_node::update_for_kills): New
>             member function.
>             (modref_access_node::merge_for_kills): Likewise.
>             (modref_access_node::insert_kill): Likewise.
>             * ipa-modref-tree.h (modref_access_node::update_for_kills,
>             modref_access_node::merge_for_kills, 
> modref_access_node::insert_kill):
>             Declare.
>             (modref_access_node::useful_for_kill): New member function.
>             * ipa-modref.c (modref_summary::useful_p): Release useless kills.
>             (lto_modref_summary): Add kills.
>             (modref_summary::dump): Dump kills.
>             (record_access): Add modref_access_node parameter.
>             (record_access_lto): Likewise.
>             (merge_call_side_effects): Merge kills.
>             (analyze_call): Add ALWAYS_EXECUTED param and pass it around.
>             (struct summary_ptrs): Add always_executed field.
>             (analyze_load): Update.
>             (analyze_store): Update; record kills.
>             (analyze_stmt): Add always_executed; record kills in clobbers.
>             (analyze_function): Track always_executed.
>             (modref_summaries::duplicate): Duplicate kills.
>             (update_signature): Release kills.
>             * ipa-modref.h (struct modref_summary): Add kills.
>             * tree-ssa-alias.c (alias_stats): Add kill stats.
>             (dump_alias_stats): Dump kill stats.
>             (store_kills_ref_p): Break out from ...
>             (stmt_kills_ref_p): Use it; handle modref info based kills.
>
>     gcc/testsuite/ChangeLog:
>
>     2021-11-14  Jan Hubicka  <hubi...@ucw.cz>
>
>             * gcc.dg/tree-ssa/modref-dse-3.c: New test.
>
>
> diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
> index 6fc2b7298f4..bbe23a5a211 100644
> --- a/gcc/ipa-modref-tree.c
> +++ b/gcc/ipa-modref-tree.c
> @@ -638,6 +638,185 @@ modref_access_node::get_ao_ref (const gcall *stmt, 
> ao_ref *ref) const
>    return true;
>  }
>
> +/* Return true if A is a subkill.  */
> +bool
> +modref_access_node::contains_for_kills (const modref_access_node &a) const
> +{
> +  poly_int64 aoffset_adj = 0;
> +
> +  gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
> +                      && a.parm_index != MODREF_UNKNOWN_PARM);
> +  if (parm_index != a.parm_index)
> +    return false;
> +  gcc_checking_assert (parm_offset_known && a.parm_offset_known);
> +  aoffset_adj = (a.parm_offset - parm_offset)
> +               * BITS_PER_UNIT;
> +  gcc_checking_assert (range_info_useful_p () && a.range_info_useful_p ());
> +  return known_subrange_p (a.offset + aoffset_adj,
> +                          a.max_size, offset, max_size);
> +}
> +
> +/* Merge two ranges both starting at parm_offset1 and update THIS
> +   with result.  */
> +bool
> +modref_access_node::update_for_kills (poly_int64 parm_offset1,
> +                                     poly_int64 offset1,
> +                                     poly_int64 max_size1,
> +                                     poly_int64 offset2,
> +                                     poly_int64 max_size2,
> +                                     bool record_adjustments)
> +{
> +  if (known_le (offset1, offset2))
> +    ;
> +  else if (known_le (offset2, offset1))
> +    {
> +      std::swap (offset1, offset2);
> +      std::swap (max_size1, max_size2);
> +    }
> +  else
> +    gcc_unreachable ();
> +
> +  poly_int64 new_max_size = max_size2 + offset2 - offset1;
> +  if (known_le (new_max_size, max_size1))
> +    new_max_size = max_size1;
> +  if (known_eq (parm_offset, parm_offset1)
> +      && known_eq (offset, offset1)
> +      && known_eq (size, new_max_size)
> +      && known_eq (max_size, new_max_size))
> +    return false;
> +
> +  if (!record_adjustments
> +      || (++adjustments) < param_modref_max_adjustments)
> +    {
> +      parm_offset = parm_offset1;
> +      offset = offset1;
> +      max_size = new_max_size;
> +      size = new_max_size;
> +      gcc_checking_assert (useful_for_kill_p ());
> +      return true;
> +    }
> +  return false;
> +}
> +
> +/* Merge in access A if it is possible to do without losing
> +   precision.  Return true if successful.
> +   Unlike merge, assume that both accesses are always executed
> +   and merge size the same way as max_size.  */
> +bool
> +modref_access_node::merge_for_kills (const modref_access_node &a,
> +                                    bool record_adjustments)
> +{
> +  poly_int64 offset1 = 0;
> +  poly_int64 aoffset1 = 0;
> +  poly_int64 new_parm_offset = 0;
> +
> +  /* We assume that containment was tested earlier.  */
> +  gcc_checking_assert (!contains_for_kills (a) && !a.contains_for_kills 
> (*this)
> +                      && useful_for_kill_p () && a.useful_for_kill_p ());
> +
> +  if (parm_index != a.parm_index
> +      || !combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
> +    return false;
> +
> +  if (known_le (offset1, aoffset1))
> +   {
> +     if (!known_size_p (max_size)
> +        || known_ge (offset1 + max_size, aoffset1))
> +       return update_for_kills (new_parm_offset, offset1, max_size,
> +                               aoffset1, a.max_size, record_adjustments);
> +   }
> +  else if (known_le (aoffset1, offset1))
> +   {
> +     if (!known_size_p (a.max_size)
> +        || known_ge (aoffset1 + a.max_size, offset1))
> +       return update_for_kills (new_parm_offset, offset1, max_size,
> +                               aoffset1, a.max_size, record_adjustments);
> +   }
> +  return false;
> +}
> +
> +/* Insert new kill A into KILLS.  If RECORD_ADJUSTMENTS is true limit number
> +   of changes to each entry.  Return true if something changed.  */
> +
> +bool
> +modref_access_node::insert_kill (vec<modref_access_node> &kills,
> +                                modref_access_node &a, bool 
> record_adjustments)
> +{
> +  size_t index;
> +  modref_access_node *a2;
> +  bool merge = false;
> +
> +  gcc_checking_assert (a.useful_for_kill_p ());
> +
> +  /* See if we have a corresponding entry already or we can merge with
> +     a neighbouring entry.  */
> +  FOR_EACH_VEC_ELT (kills, index, a2)
> +    {
> +      if (a2->contains_for_kills (a))
> +       return false;
> +      if (a.contains_for_kills (*a2))
> +       {
> +         a.adjustments = 0;
> +         *a2 = a;
> +         merge = true;
> +         break;
> +       }
> +      if (a2->merge_for_kills (a, record_adjustments))
> +       {
> +         merge = true;
> +         break;
> +       }
> +    }
> +  /* If entry was not found, insert it.  */
> +  if (!merge)
> +    {
> +      if ((int)kills.length () >= param_modref_max_accesses)
> +       {
> +         if (dump_file)
> +           fprintf (dump_file,
> +                    "--param param=modref-max-accesses limit reached:");
> +         return false;
> +       }
> +      a.adjustments = 0;
> +      kills.safe_push (a);
> +      return true;
> +    }
> +  /* Extending range in an entry may make it possible to merge it with
> +     other entries.  */
> +  size_t i;
> +
> +  for (i = 0; i < kills.length ();)
> +    if (i != index)
> +      {
> +       bool found = false, restart = false;
> +       modref_access_node *a = &kills[i];
> +       modref_access_node *n = &kills[index];
> +
> +       if (n->contains_for_kills (*a))
> +         found = true;
> +       if (!found && n->merge_for_kills (*a, false))
> +         found = restart = true;
> +       gcc_checking_assert (found || !a->merge_for_kills (*n, false));
> +       if (found)
> +         {
> +           kills.unordered_remove (i);
> +           if (index == kills.length ())
> +             {
> +               index = i;
> +               i++;
> +             }
> +           if (restart)
> +             i = 0;
> +         }
> +       else
> +         i++;
> +      }
> +    else
> +      i++;
> +  return true;
> +}
> +
> +
>  #if CHECKING_P
>
>  namespace selftest {
> diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
> index 2fcabe480bd..1bf2aa8460e 100644
> --- a/gcc/ipa-modref-tree.h
> +++ b/gcc/ipa-modref-tree.h
> @@ -82,6 +82,13 @@ struct GTY(()) modref_access_node
>      {
>        return parm_index != MODREF_UNKNOWN_PARM;
>      }
> +  /* Return true if access can be used to determine a kill.  */
> +  bool useful_for_kill_p () const
> +    {
> +      return parm_offset_known && parm_index != MODREF_UNKNOWN_PARM
> +            && parm_index != MODREF_RETSLOT_PARM && known_size_p (size)
> +            && known_eq (max_size, size);
> +    }
>    /* Dump range to debug OUT.  */
>    void dump (FILE *out);
>    /* Return true if both accesses are the same.  */
> @@ -98,10 +105,18 @@ struct GTY(()) modref_access_node
>    static int insert (vec <modref_access_node, va_gc> *&accesses,
>                      modref_access_node a, size_t max_accesses,
>                      bool record_adjustments);
> +  /* Same as insert but for kills where we are conservative the other way
> +     around: if information is lost, the kill is lost.  */
> +  static bool insert_kill (vec<modref_access_node> &kills,
> +                          modref_access_node &a, bool record_adjustments);
>  private:
>    bool contains (const modref_access_node &) const;
> +  bool contains_for_kills (const modref_access_node &) const;
>    void update (poly_int64, poly_int64, poly_int64, poly_int64, bool);
> +  bool update_for_kills (poly_int64, poly_int64, poly_int64,
> +                        poly_int64, poly_int64, bool);
>    bool merge (const modref_access_node &, bool);
> +  bool merge_for_kills (const modref_access_node &, bool);
>    static bool closer_pair_p (const modref_access_node &,
>                              const modref_access_node &,
>                              const modref_access_node &,
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index b75ed84135b..df4612bbff9 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -335,6 +335,8 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
>      return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE));
>    if (loads && !loads->every_base)
>      return true;
> +  else
> +    kills.release ();
>    if (ecf_flags & ECF_PURE)
>      return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE));
>    return stores && !stores->every_base;
> @@ -351,6 +353,7 @@ struct GTY(()) modref_summary_lto
>       more verbose and thus more likely to hit the limits.  */
>    modref_records_lto *loads;
>    modref_records_lto *stores;
> +  auto_vec<modref_access_node> GTY((skip)) kills;
>    auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>    eaf_flags_t retslot_flags;
>    eaf_flags_t static_chain_flags;
> @@ -570,6 +573,15 @@ modref_summary::dump (FILE *out)
>        fprintf (out, "  stores:\n");
>        dump_records (stores, out);
>      }
> +  if (kills.length ())
> +    {
> +      fprintf (out, "  kills:\n");
> +      for (auto kill : kills)
> +       {
> +         fprintf (out, "    ");
> +         kill.dump (out);
> +       }
> +    }
>    if (writes_errno)
>      fprintf (out, "  Writes errno\n");
>    if (side_effects)
> @@ -762,13 +774,12 @@ get_access (ao_ref *ref)
>  /* Record access into the modref_records data structure.  */
>
>  static void
> -record_access (modref_records *tt, ao_ref *ref)
> +record_access (modref_records *tt, ao_ref *ref, modref_access_node &a)
>  {
>    alias_set_type base_set = !flag_strict_aliasing ? 0
>                             : ao_ref_base_alias_set (ref);
>    alias_set_type ref_set = !flag_strict_aliasing ? 0
>                             : (ao_ref_alias_set (ref));
> -  modref_access_node a = get_access (ref);
>    if (dump_file)
>      {
>         fprintf (dump_file, "   - Recording base_set=%i ref_set=%i ",
> @@ -781,7 +792,7 @@ record_access (modref_records *tt, ao_ref *ref)
>  /* IPA version of record_access_tree.  */
>
>  static void
> -record_access_lto (modref_records_lto *tt, ao_ref *ref)
> +record_access_lto (modref_records_lto *tt, ao_ref *ref, modref_access_node 
> &a)
>  {
>    /* get_alias_set sometimes use different type to compute the alias set
>       than TREE_TYPE (base).  Do same adjustments.  */
> @@ -828,7 +839,6 @@ record_access_lto (modref_records_lto *tt, ao_ref *ref)
>                        || variably_modified_type_p (ref_type, NULL_TREE)))
>         ref_type = NULL_TREE;
>      }
> -  modref_access_node a = get_access (ref);
>    if (dump_file)
>      {
>        fprintf (dump_file, "   - Recording base type:");
> @@ -932,13 +942,17 @@ bool
>  merge_call_side_effects (modref_summary *cur_summary,
>                          gimple *stmt, modref_summary *callee_summary,
>                          bool ignore_stores, cgraph_node *callee_node,
> -                        bool record_adjustments)
> +                        bool record_adjustments, bool always_executed)
>  {
>    auto_vec <modref_parm_map, 32> parm_map;
>    modref_parm_map chain_map;
>    bool changed = false;
>    int flags = gimple_call_flags (stmt);
>
> +  if ((flags & (ECF_CONST | ECF_NOVOPS))
> +      && !(flags & ECF_LOOPING_CONST_OR_PURE))
> +    return changed;
> +
>    if (!cur_summary->side_effects && callee_summary->side_effects)
>      {
>        if (dump_file)
> @@ -950,6 +964,38 @@ merge_call_side_effects (modref_summary *cur_summary,
>    if (flags & (ECF_CONST | ECF_NOVOPS))
>      return changed;
>
> +  if (always_executed
> +      && callee_summary->kills.length ()
> +      && (!cfun->can_throw_non_call_exceptions
> +         || !stmt_could_throw_p (cfun, stmt)))
> +    {
> +      /* Watch for self recursive updates.  */
> +      auto_vec<modref_access_node, 32> saved_kills;
> +
> +      saved_kills.reserve_exact (callee_summary->kills.length ());
> +      saved_kills.splice (callee_summary->kills);
> +      for (auto kill : saved_kills)
> +       {
> +         if (kill.parm_index >= (int)parm_map.length ())
> +           continue;
> +         modref_parm_map &m
> +                 = kill.parm_index == MODREF_STATIC_CHAIN_PARM
> +                   ? chain_map
                     ^^^^^^^^^^^^^^^^  chain_map  isn't initialized.

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262

> +                   : parm_map[kill.parm_index];
> +         if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
> +             || m.parm_index == MODREF_UNKNOWN_PARM
> +             || m.parm_index == MODREF_RETSLOT_PARM
> +             || !m.parm_offset_known)
> +           continue;
> +         modref_access_node n = kill;
> +         n.parm_index = m.parm_index;
> +         n.parm_offset += m.parm_offset;
> +         if (modref_access_node::insert_kill (cur_summary->kills, n,
> +                                              record_adjustments))
> +           changed = true;
> +       }
> +    }
> +

-- 
H.J.
