On Fri, Apr 5, 2024 at 11:29 PM Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
>
> Hi!
>
> On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote:
> > The following avoids re-walking and re-combining the instructions
> > between i2 and i3 when the pattern of i2 doesn't change.
> >
> > Bootstrap and regtest running ontop of a reversal of
> > r14-9692-g839bc42772ba7a.
>
> Please include that in the patch (or series, preferably).

I'd like reversal to be considered independently of this patch which is why
I didn't include the reversal.  Of course without reversal this patch doesn't
make sense.

> > It brings down memory use frmo 9GB to 400MB and compile-time from
> > 80s to 3.5s.  r14-9692-g839bc42772ba7a does better in both metrics
> > but has shown code generation regressions across acrchitectures.
> >
> > OK to revert r14-9692-g839bc42772ba7a?
>
> No.
>
> The patch solved a very real problem.  How does your replacement handle
> that?  You don't say.  It looks like it only battles symptoms a bit,
> instead :-(

My patch isn't a replacement for your solution.  Reversal is to address
the P1 regressions caused by the change.  My change offers to address
some of the symptoms shown with the testcase without disabling the
offending 2->2 combinations.

> We had this before: 3->2 combinations that leave an instruction
> identical to what was there before.  This was just a combination with
> context as well.  The only reason this wasn't a huge problem then
> already was because this is a 3->2 combination, even if it really is a
> 2->1 one it still is beneficial in all the same cases.  But in the new
> case it can iterate indefinitely -- well not quite, but some polynomial
> number of times, for a polynomial at least of degree three, possibly
> more :-(
>
> With this patch you need to show combine still is linear.  I don't think
> it is, but some deeper analysis might show it still is.

We have come to different conclusions as to what the issue is that the
testcase exposes in combine.  While you spotted a transform that
combine shouldn't have done (and I think I agree on that!) I identified
the fact that while the number of successful combines is linear in the
number of log-links (and thus linear in the size of a basic-block), the
number of combination _attempts_ appears to be O(log-links) times
O(successful combinations).  The testcase hits hard on this because of
those 2->2 combinations done but this problem persists with all N->2
combinations.  This "quadraticness" (I know you don't like to call it that
way) is because for each successful N->2 combination of insns
{I2, .. <n intervening insns> .., I1} we try combining into all insns
between (and including) I2 and I1.  combine wants to retry combining
into I2 rightfully so but there's no good reason to retry _all_ of the
n intervening insns.  Yes, we want to retry all insns that have their
log-links point to I2 (and possibly more for second-level references,
dependent on how combine exactly identifies I2, I3 and I4).  Re-trying
all of them obviously works, but there's your quadraticness.

My patch makes this problem a little bit (in general) less often hit
since it avoids retrying iff I2 did not change.  I _think_ in that case
the log-links/notes shouldn't have changed either.  In any case I'm
disabling less of combine than what your patch did.

So again - is it OK to revert your patch?  Or do you expect you can
address the code generation regressions within the next week?  Since
the original issue is quite old postponing your solution to the next stage1
should be acceptable.  Currently those regressions will block the release
of GCC 14.

Do you agree that the patch I propose, while it doesn't solve any actual
issue (it doesn't fix the walking scheme nor does it avoid the combinations
combine shouldn't do), it helps in some cases and shouldn't cause code
generation regressions that your patch wouldn't have caused as well?
So is that change OK until we get your real solution implemented in a way
that doesn't cause regressions?

Thanks,
Richard.

Reply via email to