Hi Martin, This patch seems to have broken one of the AutoFDO applications we have. I will see if I can create a standalone test. But, ….
> On 8 Jan 2026, at 3:50 am, Martin Jambor <[email protected]> wrote: > > External email: Use caution opening links or attachments > > > Hello, > > On Mon, Jan 05 2026, Martin Jambor wrote: >> Hi, >> >> On Mon, Jan 05 2026, Jan Hubicka wrote: > [...] >>> >>> Assume we have A->B->C where A sets value ranges, B can be called >>> externally and C is local and would benefit from known range. >>> If we clone B then we will end up >>> >>> A->B'->C >>> B ->C >>> >>> and C will not see the value range passed from A? >>> However if we decide to clone C for some other reason we will have >>> >>> A->B'->C' >>> B ->C >>> and here C' will see the value range and C will not? >>> I do not see how you can distinguish these situation as >>> ipcp_store_vr_results time? >> >> yes, I thought I had told you in person, the patch is wrong. I have >> found out just a few days after posting it already when testing the >> patch-set further. We'd need a bit in the lattice itself to distinguish >> this and the effort should probably be spent making value-range lattices >> for cloning. >> > > OK, I'd like to take this back, the patch is actually necessary for a > slight improvement in 525,x264_r which I was aiming for with my > patch-set. > > Given that we can now construct value ranges from loads from static > variables, this patch makes the value to be propagated to a clone of > mc_chroma which then results in better vectorization strategy, as > -fdump-opt-info tells us: > > -x264_src/common/mc.c:282:27: optimized: loop vectorized using 64 byte > vectors and unroll factor 64 > +x264_src/common/mc.c:282:27: optimized: loop vectorized using 8 byte vectors > and unroll factor 8 > > Which then means that the number of profile samples collected in that > cloned function drops from 30811 to 15700 on a Zen4 machine (though the > overall speedup is not big, only about 1.5%, it is consistent). > > So given the above, and that the original patch was (in good faith) > posted before stage1 ended, I would actually like to push the fixed > version which is below. I have already thought of propagating value > ranges like we do constants and agree it is something that we should do, > but only in stage 1 of course. > > Thanks and sorry for the confusion, > > Martin > > > > Since the IPA-CP lattices for value ranges cannot hold more values and > don't have any "variable" flag, we initialize them to bottom for > non-local nodes. However, that means we don't make use of known > information gathered in jump functions when the corresponding node is > cloned for some other reason. This patch allows collection of the > information and only does not use them for the original non-local > nodes, while making sure that we do not propagate information through > such-non local nodes as there may be unknown calls. > > gcc/ChangeLog: > > 2026-01-06 Martin Jambor <[email protected]> > > * ipa-cp.h (class ipcp_bits_lattice): New members set_recipient_only, > recipient_only_p and m_recipient_only. > (class ipcp_vr_lattice): Likewise. > (ipcp_vr_lattice::init): Initialize also m_recipient_only. > * ipa-cp.cc (ipcp_bits_lattice::print): Adjust printting to also > print the new flag. > (ipcp_vr_lattice::print): Likewise. > (ipcp_vr_lattice::set_recipient_only): New function. > (ipcp_bits_lattice::set_recipient_only): Likewise. > (set_all_contains_variable): New parameter MAKE_SIMPLE_RECIPIENTS, set > bits and vr lattices to recibient only insted to bottom when it is > true. > (initialize_node_lattices): Pass true to the second parameter of > set_all_contains_variable. > (propagate_bits_across_jump_function): Treat recipient_only source > lattices like bottom. > (propagate_vr_across_jump_function): Likewise. > (ipcp_store_vr_results): Skip non-local nodes. > --- > gcc/ipa-cp.cc | 73 +++++++++++++++++++++++++++++++++++++++++++-------- > gcc/ipa-cp.h | 14 ++++++++++ > 2 files changed, 76 insertions(+), 11 deletions(-) > > diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc > index f74c80fdf3b..8c261c85c12 100644 > --- a/gcc/ipa-cp.cc > +++ b/gcc/ipa-cp.cc > @@ -339,18 +339,25 @@ ipcp_print_widest_int (FILE *f, const widest_int &value) > void > ipcp_bits_lattice::print (FILE *f) > { > + if (bottom_p ()) > + { > + fprintf (f, " Bits unusable (BOTTOM)\n"); > + return; > + } > + > if (top_p ()) > - fprintf (f, " Bits unknown (TOP)\n"); > - else if (bottom_p ()) > - fprintf (f, " Bits unusable (BOTTOM)\n"); > + fprintf (f, " Bits unknown (TOP)"); > else > { > fprintf (f, " Bits: value = "); > ipcp_print_widest_int (f, get_value ()); > fprintf (f, ", mask = "); > ipcp_print_widest_int (f, get_mask ()); > - fprintf (f, "\n"); > } > + > + if (m_recipient_only) > + fprintf (f, " (recipient only)"); > + fprintf (f, "\n"); > } > > /* Print value range lattice to F. */ > @@ -358,6 +365,8 @@ ipcp_bits_lattice::print (FILE *f) > void > ipcp_vr_lattice::print (FILE * f) > { > + if (m_recipient_only) > + fprintf (f, "(recipient only) "); > m_vr.dump (f); > } > > @@ -888,6 +897,18 @@ ipcp_vr_lattice::set_to_bottom () > return true; > } > > +/* Set the flag that this lattice is a recipient only, return true if it was > + not set before. */ > + > +bool > +ipcp_vr_lattice::set_recipient_only () > +{ > + if (m_recipient_only) > + return false; > + m_recipient_only = true; > + return true; > +} > + > /* Set lattice value to bottom, if it already isn't the case. */ > > bool > @@ -924,6 +945,18 @@ ipcp_bits_lattice::known_nonzero_p () const > return wi::ne_p (wi::bit_and (wi::bit_not (m_mask), m_value), 0); > } > > +/* Set the flag that this lattice is a recipient only, return true if it was > not > + set before. */ > + > +bool > +ipcp_bits_lattice::set_recipient_only () > +{ > + if (m_recipient_only) > + return false; > + m_recipient_only = true; > + return true; > +} > + > /* Convert operand to value, mask form. */ > > void > @@ -1326,17 +1359,28 @@ intersect_argaggs_with (vec<ipa_argagg_value> &elts, > } > > /* Mark bot aggregate and scalar lattices as containing an unknown variable, > - return true is any of them has not been marked as such so far. */ > + return true is any of them has not been marked as such so far. If if > + MAKE_SIMPLE_RECIPIENTS is true, set the lattices that can only hold one > + value to being recipients only, otherwise also set them to bottom. */ > > static inline bool > -set_all_contains_variable (class ipcp_param_lattices *plats) > +set_all_contains_variable (class ipcp_param_lattices *plats, > + bool make_simple_recipients = false) > { > bool ret; > ret = plats->itself.set_contains_variable (); > ret |= plats->ctxlat.set_contains_variable (); > ret |= set_agg_lats_contain_variable (plats); > - ret |= plats->bits_lattice.set_to_bottom (); > - ret |= plats->m_value_range.set_to_bottom (); > + if (make_simple_recipients) > + { > + ret |= plats->bits_lattice.set_recipient_only (); > + ret |= plats->m_value_range.set_recipient_only (); > + } > + else > + { > + ret |= plats->bits_lattice.set_to_bottom (); > + ret |= plats->m_value_range.set_to_bottom (); > + } > return ret; > } Here, we seem to be treating m_recipient_only and bits_lattice separately. In this case, shouldn’t we set bits_lattice and m_value_range too for the case make_simple_recipients is true? Thanks, Kugan > > @@ -1481,7 +1525,7 @@ initialize_node_lattices (struct cgraph_node *node) > { > plats->m_value_range.init (type); > if (variable) > - set_all_contains_variable (plats); > + set_all_contains_variable (plats, true); > } > } > > @@ -2573,7 +2617,8 @@ propagate_bits_across_jump_function (cgraph_edge *cs, > int idx, > result of x & 0xff == 0xff, which gets computed during ccp1 pass > and we store it in jump function during analysis stage. */ > > - if (!src_lats->bits_lattice.bottom_p ()) > + if (!src_lats->bits_lattice.bottom_p () > + || src_lats->bits_lattice.recipient_only_p ()) > { > if (!op_type) > op_type = ipa_get_type (caller_info, src_idx); > @@ -2639,7 +2684,8 @@ propagate_vr_across_jump_function (cgraph_edge *cs, > ipa_jump_func *jfunc, > = ipa_get_parm_lattices (caller_info, src_idx); > tree operand_type = ipa_get_type (caller_info, src_idx); > > - if (src_lats->m_value_range.bottom_p ()) > + if (src_lats->m_value_range.bottom_p () > + || src_lats->m_value_range.recipient_only_p ()) > return dest_lat->set_to_bottom (); > > if (ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR > @@ -6530,6 +6576,11 @@ ipcp_store_vr_results (void) > bool do_vr = true; > bool do_bits = true; > > + /* If the function is not local, the gathered information is only > useful > + for clones. */ > + if (!node->local) > + continue; > + > if (!info || !opt_for_fn (node->decl, flag_ipa_vrp)) > { > if (dump_file) > diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h > index 521983d9ad9..45da483e9ab 100644 > --- a/gcc/ipa-cp.h > +++ b/gcc/ipa-cp.h > @@ -201,6 +201,8 @@ public: > bool set_to_bottom (); > bool set_to_constant (widest_int, widest_int); > bool known_nonzero_p () const; > + bool set_recipient_only (); > + bool recipient_only_p () const {return m_recipient_only; } > > widest_int get_value () const { return m_value; } > widest_int get_mask () const { return m_mask; } > @@ -216,6 +218,11 @@ private: > enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } > m_lattice_val = IPA_BITS_UNDEFINED; > > + /* Set to true if the lattice is valid only as a recipient of propagatad > + values but cannot be used as source of propagation because there may be > + unknown callers. */ > + bool m_recipient_only; > + > /* Similar to ccp_lattice_t, mask represents which bits of value are > constant. > If a bit in mask is set to 0, then the corresponding bit in > value is known to be constant. */ > @@ -231,10 +238,16 @@ class ipcp_vr_lattice > { > public: > value_range m_vr; > + /* Set to true if the lattice is valid only as a recipient of propagatad > + values but cannot be used as source of propagation because there may be > + unknown callers. */ > + bool m_recipient_only; > > inline bool bottom_p () const; > inline bool top_p () const; > inline bool set_to_bottom (); > + bool set_recipient_only (); > + bool recipient_only_p () const {return m_recipient_only; } > bool meet_with (const vrange &p_vr); > bool meet_with (const ipcp_vr_lattice &other); > void init (tree type); > @@ -251,6 +264,7 @@ ipcp_vr_lattice::init (tree type) > m_vr.set_type (type); > > // Otherwise m_vr will default to unsupported_range. > + m_recipient_only = false; > } > > /* Structure containing lattices for a parameter itself and for pieces of > -- > 2.52.0
