Hi Martin,
This patch seems to have broken one of the  AutoFDO applications we have. I 
will see if I can create a standalone test.  But, ….

> On 8 Jan 2026, at 3:50 am, Martin Jambor <[email protected]> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> Hello,
> 
> On Mon, Jan 05 2026, Martin Jambor wrote:
>> Hi,
>> 
>> On Mon, Jan 05 2026, Jan Hubicka wrote:
> [...]
>>> 
>>> Assume we have A->B->C where A sets value ranges, B can be called
>>> externally and C is local and would benefit from known range.
>>> If we clone B then we will end up
>>> 
>>> A->B'->C
>>>   B ->C
>>> 
>>> and C will not see the value range passed from A?
>>> However if we decide to clone C for some other reason we will have
>>> 
>>> A->B'->C'
>>>   B ->C
>>> and here C' will see the value range and C will not?
>>> I do not see how you can distinguish these situation as
>>> ipcp_store_vr_results time?
>> 
>> yes, I thought I had told you in person, the patch is wrong.  I have
>> found out just a few days after posting it already when testing the
>> patch-set further.  We'd need a bit in the lattice itself to distinguish
>> this and the effort should probably be spent making value-range lattices
>> for cloning.
>> 
> 
> OK, I'd like to take this back, the patch is actually necessary for a
> slight improvement in 525,x264_r which I was aiming for with my
> patch-set.
> 
> Given that we can now construct value ranges from loads from static
> variables, this patch makes the value to be propagated to a clone of
> mc_chroma which then results in better vectorization strategy, as
> -fdump-opt-info tells us:
> 
> -x264_src/common/mc.c:282:27: optimized: loop vectorized using 64 byte 
> vectors and unroll factor 64
> +x264_src/common/mc.c:282:27: optimized: loop vectorized using 8 byte vectors 
> and unroll factor 8
> 
> Which then means that the number of profile samples collected in that
> cloned function drops from 30811 to 15700 on a Zen4 machine (though the
> overall speedup is not big, only about 1.5%, it is consistent).
> 
> So given the above, and that the original patch was (in good faith)
> posted before stage1 ended, I would actually like to push the fixed
> version which is below.  I have already thought of propagating value
> ranges like we do constants and agree it is something that we should do,
> but only in stage 1 of course.
> 
> Thanks and sorry for the confusion,
> 
> Martin
> 
> 
> 
> Since the IPA-CP lattices for value ranges cannot hold more values and
> don't have any "variable" flag, we initialize them to bottom for
> non-local nodes.  However, that means we don't make use of known
> information gathered in jump functions when the corresponding node is
> cloned for some other reason.  This patch allows collection of the
> information and only does not use them for the original non-local
> nodes, while making sure that we do not propagate information through
> such-non local nodes as there may be unknown calls.
> 
> gcc/ChangeLog:
> 
> 2026-01-06  Martin Jambor  <[email protected]>
> 
>        * ipa-cp.h (class ipcp_bits_lattice): New members set_recipient_only,
>        recipient_only_p and m_recipient_only.
>        (class ipcp_vr_lattice): Likewise.
>        (ipcp_vr_lattice::init): Initialize also m_recipient_only.
>        * ipa-cp.cc (ipcp_bits_lattice::print): Adjust printting to also
>        print the new flag.
>        (ipcp_vr_lattice::print): Likewise.
>        (ipcp_vr_lattice::set_recipient_only): New function.
>        (ipcp_bits_lattice::set_recipient_only): Likewise.
>        (set_all_contains_variable): New parameter MAKE_SIMPLE_RECIPIENTS, set
>        bits and vr lattices to recibient only insted to bottom when it is
>        true.
>        (initialize_node_lattices): Pass true to the second parameter of
>        set_all_contains_variable.
>        (propagate_bits_across_jump_function): Treat recipient_only source
>        lattices like bottom.
>        (propagate_vr_across_jump_function): Likewise.
>        (ipcp_store_vr_results): Skip non-local nodes.
> ---
> gcc/ipa-cp.cc | 73 +++++++++++++++++++++++++++++++++++++++++++--------
> gcc/ipa-cp.h  | 14 ++++++++++
> 2 files changed, 76 insertions(+), 11 deletions(-)
> 
> diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
> index f74c80fdf3b..8c261c85c12 100644
> --- a/gcc/ipa-cp.cc
> +++ b/gcc/ipa-cp.cc
> @@ -339,18 +339,25 @@ ipcp_print_widest_int (FILE *f, const widest_int &value)
> void
> ipcp_bits_lattice::print (FILE *f)
> {
> +  if (bottom_p ())
> +    {
> +      fprintf (f, "         Bits unusable (BOTTOM)\n");
> +      return;
> +    }
> +
>   if (top_p ())
> -    fprintf (f, "         Bits unknown (TOP)\n");
> -  else if (bottom_p ())
> -    fprintf (f, "         Bits unusable (BOTTOM)\n");
> +    fprintf (f, "         Bits unknown (TOP)");
>   else
>     {
>       fprintf (f, "         Bits: value = ");
>       ipcp_print_widest_int (f, get_value ());
>       fprintf (f, ", mask = ");
>       ipcp_print_widest_int (f, get_mask ());
> -      fprintf (f, "\n");
>     }
> +
> +  if (m_recipient_only)
> +    fprintf (f, " (recipient only)");
> +  fprintf (f, "\n");
> }
> 
> /* Print value range lattice to F.  */
> @@ -358,6 +365,8 @@ ipcp_bits_lattice::print (FILE *f)
> void
> ipcp_vr_lattice::print (FILE * f)
> {
> +  if (m_recipient_only)
> +    fprintf (f, "(recipient only) ");
>   m_vr.dump (f);
> }
> 
> @@ -888,6 +897,18 @@ ipcp_vr_lattice::set_to_bottom ()
>   return true;
> }
> 
> +/* Set the flag that this lattice is a recipient only, return true if it was
> +   not set before.  */
> +
> +bool
> +ipcp_vr_lattice::set_recipient_only ()
> +{
> +  if (m_recipient_only)
> +    return false;
> +  m_recipient_only = true;
> +  return true;
> +}
> +
> /* Set lattice value to bottom, if it already isn't the case.  */
> 
> bool
> @@ -924,6 +945,18 @@ ipcp_bits_lattice::known_nonzero_p () const
>   return wi::ne_p (wi::bit_and (wi::bit_not (m_mask), m_value), 0);
> }
> 
> +/* Set the flag that this lattice is a recipient only, return true if it was 
> not
> +   set before.  */
> +
> +bool
> +ipcp_bits_lattice::set_recipient_only ()
> +{
> +  if (m_recipient_only)
> +    return false;
> +  m_recipient_only = true;
> +  return true;
> +}
> +
> /* Convert operand to value, mask form.  */
> 
> void
> @@ -1326,17 +1359,28 @@ intersect_argaggs_with (vec<ipa_argagg_value> &elts,
> }
> 
> /* Mark bot aggregate and scalar lattices as containing an unknown variable,
> -   return true is any of them has not been marked as such so far.  */
> +   return true is any of them has not been marked as such so far.  If if
> +   MAKE_SIMPLE_RECIPIENTS is true, set the lattices that can only hold one
> +   value to being recipients only, otherwise also set them to bottom.  */
> 
> static inline bool
> -set_all_contains_variable (class ipcp_param_lattices *plats)
> +set_all_contains_variable (class ipcp_param_lattices *plats,
> +                          bool make_simple_recipients = false)
> {
>   bool ret;
>   ret = plats->itself.set_contains_variable ();
>   ret |= plats->ctxlat.set_contains_variable ();
>   ret |= set_agg_lats_contain_variable (plats);
> -  ret |= plats->bits_lattice.set_to_bottom ();
> -  ret |= plats->m_value_range.set_to_bottom ();
> +  if (make_simple_recipients)
> +    {
> +      ret |= plats->bits_lattice.set_recipient_only ();
> +      ret |= plats->m_value_range.set_recipient_only ();
> +    }
> +  else
> +    {
> +      ret |= plats->bits_lattice.set_to_bottom ();
> +      ret |= plats->m_value_range.set_to_bottom ();
> +    }
>   return ret;
> }


Here, we seem to be treating  m_recipient_only and bits_lattice separately.  In 
this case,  shouldn’t we set  bits_lattice and m_value_range too for the case 
make_simple_recipients is true?

Thanks,
Kugan


> 
> @@ -1481,7 +1525,7 @@ initialize_node_lattices (struct cgraph_node *node)
>        {
>          plats->m_value_range.init (type);
>          if (variable)
> -           set_all_contains_variable (plats);
> +           set_all_contains_variable (plats, true);
>        }
>     }
> 
> @@ -2573,7 +2617,8 @@ propagate_bits_across_jump_function (cgraph_edge *cs, 
> int idx,
>         result of x & 0xff == 0xff, which gets computed during ccp1 pass
>         and we store it in jump function during analysis stage.  */
> 
> -      if (!src_lats->bits_lattice.bottom_p ())
> +      if (!src_lats->bits_lattice.bottom_p ()
> +         || src_lats->bits_lattice.recipient_only_p ())
>        {
>          if (!op_type)
>            op_type = ipa_get_type (caller_info, src_idx);
> @@ -2639,7 +2684,8 @@ propagate_vr_across_jump_function (cgraph_edge *cs, 
> ipa_jump_func *jfunc,
>        = ipa_get_parm_lattices (caller_info, src_idx);
>       tree operand_type = ipa_get_type (caller_info, src_idx);
> 
> -      if (src_lats->m_value_range.bottom_p ())
> +      if (src_lats->m_value_range.bottom_p ()
> +         || src_lats->m_value_range.recipient_only_p ())
>        return dest_lat->set_to_bottom ();
> 
>       if (ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR
> @@ -6530,6 +6576,11 @@ ipcp_store_vr_results (void)
>       bool do_vr = true;
>       bool do_bits = true;
> 
> +      /* If the function is not local, the gathered information is only 
> useful
> +        for clones.  */
> +      if (!node->local)
> +       continue;
> +
>       if (!info || !opt_for_fn (node->decl, flag_ipa_vrp))
>        {
>          if (dump_file)
> diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h
> index 521983d9ad9..45da483e9ab 100644
> --- a/gcc/ipa-cp.h
> +++ b/gcc/ipa-cp.h
> @@ -201,6 +201,8 @@ public:
>   bool set_to_bottom ();
>   bool set_to_constant (widest_int, widest_int);
>   bool known_nonzero_p () const;
> +  bool set_recipient_only ();
> +  bool recipient_only_p () const {return m_recipient_only; }
> 
>   widest_int get_value () const { return m_value; }
>   widest_int get_mask () const { return m_mask; }
> @@ -216,6 +218,11 @@ private:
>   enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING }
>     m_lattice_val = IPA_BITS_UNDEFINED;
> 
> +  /* Set to true if the lattice is valid only as a recipient of propagatad
> +     values but cannot be used as source of propagation because there may be
> +     unknown callers.  */
> +  bool m_recipient_only;
> +
>   /* Similar to ccp_lattice_t, mask represents which bits of value are 
> constant.
>      If a bit in mask is set to 0, then the corresponding bit in
>      value is known to be constant.  */
> @@ -231,10 +238,16 @@ class ipcp_vr_lattice
> {
> public:
>   value_range m_vr;
> +  /* Set to true if the lattice is valid only as a recipient of propagatad
> +     values but cannot be used as source of propagation because there may be
> +     unknown callers.  */
> +  bool m_recipient_only;
> 
>   inline bool bottom_p () const;
>   inline bool top_p () const;
>   inline bool set_to_bottom ();
> +  bool set_recipient_only ();
> +  bool recipient_only_p () const {return m_recipient_only; }
>   bool meet_with (const vrange &p_vr);
>   bool meet_with (const ipcp_vr_lattice &other);
>   void init (tree type);
> @@ -251,6 +264,7 @@ ipcp_vr_lattice::init (tree type)
>     m_vr.set_type (type);
> 
>   // Otherwise m_vr will default to unsupported_range.
> +  m_recipient_only = false;
> }
> 
> /* Structure containing lattices for a parameter itself and for pieces of
> --
> 2.52.0


Reply via email to