Hello, On Mon, Jan 05 2026, Martin Jambor wrote: > Hi, > > On Mon, Jan 05 2026, Jan Hubicka wrote: [...] >> >> Assume we have A->B->C where A sets value ranges, B can be called >> externally and C is local and would benefit from known range. >> If we clone B then we will end up >> >> A->B'->C >> B ->C >> >> and C will not see the value range passed from A? >> However if we decide to clone C for some other reason we will have >> >> A->B'->C' >> B ->C >> and here C' will see the value range and C will not? >> I do not see how you can distinguish these situation as >> ipcp_store_vr_results time? > > yes, I thought I had told you in person, the patch is wrong. I have > found out just a few days after posting it already when testing the > patch-set further. We'd need a bit in the lattice itself to distinguish > this and the effort should probably be spent making value-range lattices > for cloning. >
OK, I'd like to take this back, the patch is actually necessary for a slight improvement in 525,x264_r which I was aiming for with my patch-set. Given that we can now construct value ranges from loads from static variables, this patch makes the value to be propagated to a clone of mc_chroma which then results in better vectorization strategy, as -fdump-opt-info tells us: -x264_src/common/mc.c:282:27: optimized: loop vectorized using 64 byte vectors and unroll factor 64 +x264_src/common/mc.c:282:27: optimized: loop vectorized using 8 byte vectors and unroll factor 8 Which then means that the number of profile samples collected in that cloned function drops from 30811 to 15700 on a Zen4 machine (though the overall speedup is not big, only about 1.5%, it is consistent). So given the above, and that the original patch was (in good faith) posted before stage1 ended, I would actually like to push the fixed version which is below. I have already thought of propagating value ranges like we do constants and agree it is something that we should do, but only in stage 1 of course. Thanks and sorry for the confusion, Martin Since the IPA-CP lattices for value ranges cannot hold more values and don't have any "variable" flag, we initialize them to bottom for non-local nodes. However, that means we don't make use of known information gathered in jump functions when the corresponding node is cloned for some other reason. This patch allows collection of the information and only does not use them for the original non-local nodes, while making sure that we do not propagate information through such-non local nodes as there may be unknown calls. gcc/ChangeLog: 2026-01-06 Martin Jambor <[email protected]> * ipa-cp.h (class ipcp_bits_lattice): New members set_recipient_only, recipient_only_p and m_recipient_only. (class ipcp_vr_lattice): Likewise. (ipcp_vr_lattice::init): Initialize also m_recipient_only. * ipa-cp.cc (ipcp_bits_lattice::print): Adjust printting to also print the new flag. (ipcp_vr_lattice::print): Likewise. (ipcp_vr_lattice::set_recipient_only): New function. (ipcp_bits_lattice::set_recipient_only): Likewise. (set_all_contains_variable): New parameter MAKE_SIMPLE_RECIPIENTS, set bits and vr lattices to recibient only insted to bottom when it is true. (initialize_node_lattices): Pass true to the second parameter of set_all_contains_variable. (propagate_bits_across_jump_function): Treat recipient_only source lattices like bottom. (propagate_vr_across_jump_function): Likewise. (ipcp_store_vr_results): Skip non-local nodes. --- gcc/ipa-cp.cc | 73 +++++++++++++++++++++++++++++++++++++++++++-------- gcc/ipa-cp.h | 14 ++++++++++ 2 files changed, 76 insertions(+), 11 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index f74c80fdf3b..8c261c85c12 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -339,18 +339,25 @@ ipcp_print_widest_int (FILE *f, const widest_int &value) void ipcp_bits_lattice::print (FILE *f) { + if (bottom_p ()) + { + fprintf (f, " Bits unusable (BOTTOM)\n"); + return; + } + if (top_p ()) - fprintf (f, " Bits unknown (TOP)\n"); - else if (bottom_p ()) - fprintf (f, " Bits unusable (BOTTOM)\n"); + fprintf (f, " Bits unknown (TOP)"); else { fprintf (f, " Bits: value = "); ipcp_print_widest_int (f, get_value ()); fprintf (f, ", mask = "); ipcp_print_widest_int (f, get_mask ()); - fprintf (f, "\n"); } + + if (m_recipient_only) + fprintf (f, " (recipient only)"); + fprintf (f, "\n"); } /* Print value range lattice to F. */ @@ -358,6 +365,8 @@ ipcp_bits_lattice::print (FILE *f) void ipcp_vr_lattice::print (FILE * f) { + if (m_recipient_only) + fprintf (f, "(recipient only) "); m_vr.dump (f); } @@ -888,6 +897,18 @@ ipcp_vr_lattice::set_to_bottom () return true; } +/* Set the flag that this lattice is a recipient only, return true if it was + not set before. */ + +bool +ipcp_vr_lattice::set_recipient_only () +{ + if (m_recipient_only) + return false; + m_recipient_only = true; + return true; +} + /* Set lattice value to bottom, if it already isn't the case. */ bool @@ -924,6 +945,18 @@ ipcp_bits_lattice::known_nonzero_p () const return wi::ne_p (wi::bit_and (wi::bit_not (m_mask), m_value), 0); } +/* Set the flag that this lattice is a recipient only, return true if it was not + set before. */ + +bool +ipcp_bits_lattice::set_recipient_only () +{ + if (m_recipient_only) + return false; + m_recipient_only = true; + return true; +} + /* Convert operand to value, mask form. */ void @@ -1326,17 +1359,28 @@ intersect_argaggs_with (vec<ipa_argagg_value> &elts, } /* Mark bot aggregate and scalar lattices as containing an unknown variable, - return true is any of them has not been marked as such so far. */ + return true is any of them has not been marked as such so far. If if + MAKE_SIMPLE_RECIPIENTS is true, set the lattices that can only hold one + value to being recipients only, otherwise also set them to bottom. */ static inline bool -set_all_contains_variable (class ipcp_param_lattices *plats) +set_all_contains_variable (class ipcp_param_lattices *plats, + bool make_simple_recipients = false) { bool ret; ret = plats->itself.set_contains_variable (); ret |= plats->ctxlat.set_contains_variable (); ret |= set_agg_lats_contain_variable (plats); - ret |= plats->bits_lattice.set_to_bottom (); - ret |= plats->m_value_range.set_to_bottom (); + if (make_simple_recipients) + { + ret |= plats->bits_lattice.set_recipient_only (); + ret |= plats->m_value_range.set_recipient_only (); + } + else + { + ret |= plats->bits_lattice.set_to_bottom (); + ret |= plats->m_value_range.set_to_bottom (); + } return ret; } @@ -1481,7 +1525,7 @@ initialize_node_lattices (struct cgraph_node *node) { plats->m_value_range.init (type); if (variable) - set_all_contains_variable (plats); + set_all_contains_variable (plats, true); } } @@ -2573,7 +2617,8 @@ propagate_bits_across_jump_function (cgraph_edge *cs, int idx, result of x & 0xff == 0xff, which gets computed during ccp1 pass and we store it in jump function during analysis stage. */ - if (!src_lats->bits_lattice.bottom_p ()) + if (!src_lats->bits_lattice.bottom_p () + || src_lats->bits_lattice.recipient_only_p ()) { if (!op_type) op_type = ipa_get_type (caller_info, src_idx); @@ -2639,7 +2684,8 @@ propagate_vr_across_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc, = ipa_get_parm_lattices (caller_info, src_idx); tree operand_type = ipa_get_type (caller_info, src_idx); - if (src_lats->m_value_range.bottom_p ()) + if (src_lats->m_value_range.bottom_p () + || src_lats->m_value_range.recipient_only_p ()) return dest_lat->set_to_bottom (); if (ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR @@ -6530,6 +6576,11 @@ ipcp_store_vr_results (void) bool do_vr = true; bool do_bits = true; + /* If the function is not local, the gathered information is only useful + for clones. */ + if (!node->local) + continue; + if (!info || !opt_for_fn (node->decl, flag_ipa_vrp)) { if (dump_file) diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h index 521983d9ad9..45da483e9ab 100644 --- a/gcc/ipa-cp.h +++ b/gcc/ipa-cp.h @@ -201,6 +201,8 @@ public: bool set_to_bottom (); bool set_to_constant (widest_int, widest_int); bool known_nonzero_p () const; + bool set_recipient_only (); + bool recipient_only_p () const {return m_recipient_only; } widest_int get_value () const { return m_value; } widest_int get_mask () const { return m_mask; } @@ -216,6 +218,11 @@ private: enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val = IPA_BITS_UNDEFINED; + /* Set to true if the lattice is valid only as a recipient of propagatad + values but cannot be used as source of propagation because there may be + unknown callers. */ + bool m_recipient_only; + /* Similar to ccp_lattice_t, mask represents which bits of value are constant. If a bit in mask is set to 0, then the corresponding bit in value is known to be constant. */ @@ -231,10 +238,16 @@ class ipcp_vr_lattice { public: value_range m_vr; + /* Set to true if the lattice is valid only as a recipient of propagatad + values but cannot be used as source of propagation because there may be + unknown callers. */ + bool m_recipient_only; inline bool bottom_p () const; inline bool top_p () const; inline bool set_to_bottom (); + bool set_recipient_only (); + bool recipient_only_p () const {return m_recipient_only; } bool meet_with (const vrange &p_vr); bool meet_with (const ipcp_vr_lattice &other); void init (tree type); @@ -251,6 +264,7 @@ ipcp_vr_lattice::init (tree type) m_vr.set_type (type); // Otherwise m_vr will default to unsupported_range. + m_recipient_only = false; } /* Structure containing lattices for a parameter itself and for pieces of -- 2.52.0
