On Thu, Feb 16, 2023 at 3:34 PM Andrew MacLeod <amacl...@redhat.com> wrote: > > > On 2/16/23 02:55, Richard Biener wrote: > > On Wed, Feb 15, 2023 at 6:07 PM Andrew MacLeod via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> This patch implements the suggestion that we have an alternative > >> ssa-cache which does not zero memory, and instead uses a bitmap to track > >> whether a value is currently set or not. It roughly mimics what > >> path_range_query was doing internally. > >> > >> For sparsely used cases, expecially in large programs, this is more > >> efficient. I changed path_range_query to use this, and removed it old > >> bitmap (and a hack or two around PHI calculations), and also utilized > >> this is the assume_query class. > >> > >> Performance wise, the patch doesn't affect VRP (since that still uses > >> the original version). Switching to the lazy version caused a slowdown > >> of 2.5% across VRP. > >> > >> There was a noticeable improvement elsewhere., across 230 GCC source > >> files, threading ran over 12% faster!. Overall compilation improved by > >> 0.3% Not sure it makes much difference in compiler.i, but it shouldn't > >> hurt. > >> > >> bootstraps on x86_64-pc-linux-gnu with no regressions. OK for trunk? > >> or do you want to wait for the next release... > > I see > > > > @@ -365,16 +335,8 @@ path_range_query::compute_ranges_in_phis (basic_block > > bb) > > > > Value_Range r (TREE_TYPE (name)); > > if (range_defined_in_block (r, name, bb)) > > - { > > - unsigned v = SSA_NAME_VERSION (name); > > - set_cache (r, name); > > - bitmap_set_bit (phi_set, v); > > - // Pretend we don't have a cache entry for this name until > > - // we're done with all PHIs. > > - bitmap_clear_bit (m_has_cache_entry, v); > > - } > > + m_cache.set_global_range (name, r); > > } > > - bitmap_ior_into (m_has_cache_entry, phi_set); > > } > > > > // Return TRUE if relations may be invalidated after crossing edge E. > > > > which I think is not correct - if we have > > > > # _1 = PHI <..., _2> > > # _2 = PHI <..., _1> > > > > then their effects are supposed to be executed in parallel, that is, > > both PHI argument _2 and _1 are supposed to see the "old" version. > > The previous code tried to make sure the range of the new _1 doesn't > > get seen when processing the argument _1 in the definition of _2. > > > > The new version drops this, possibly resulting in wrong-code. > > This is dropped because it is actually handled properly in > range_defined_in_block now. (which I think Aldy was describing). > > It didnt make sense to me why it was handled here like this, so I traced > through the call chain to find out if it was still actually needed and > discussed it with Aldy. I think it was mostly a leftover wart.
Ah, thanks for checking. > > > > While I think it's appropriate to sort out compile-time issues like this > > during stage4 at least the above makes me think it should be defered > > to next stage1. > > I am happy to defer it since its a marginal increase anyway. Sure - thus OK for stage1. Thanks, Richard. > > Andrew > >