Hi!

On Wed, Jan 24, 2018 at 12:27:55AM -0500, Michael Meissner wrote:
> As Segher and I were discussing over private IRC, the root cause of this bug
> is the compiler no longer generates the BDNZ instruction for a count down
> loop, instead it decrements the index in a GPR and does a branch/comparison
> on it.

Yes, ivopts makes a bad decision (it uses stride 8 for all IVs, it should
keep one with stride -1 for the loop counter, for optimal code; it also
does three separate increments for the three memory accesses, which is
a bit excessive here).

> In doing so, it now unrolls the loop twice, and the resulting loop is too
> big for the target hook TARGET_ASM_LOOP_ALIGN_MAX_SKIP.  This means the loop
> isn't aligned to a 32 byte boundary.

It's not really unrolling, it is bb-reorder copying an RTL block.  However,
even if you disable it you still get 9 insns on some configurations, so
your patch does not hide the problem :-(

Although, hrm, in your patch you also change "int i" to "long i"; that
alone seems to be enough to fix everything?  Could you check that please?


Segher
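
For anyone wanting to try the "int i" versus "long i" question in isolation,
here is a minimal sketch.  It is NOT the testcase from the bug report: the
function names and the loop body are hypothetical, chosen only to match the
shape described above (8-byte elements, three memory accesses per iteration).
Whether the two variants actually produce different code is exactly what
needs checking, so no result is claimed here.

```c
#include <assert.h>

/* Hypothetical kernel, not the testcase from the bug report:
   two loads and one store of 8-byte elements per iteration.  */

/* Loop index is a 32-bit signed int, as with "int i".  */
void
add_int_index (double *a, const double *b, const double *c, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    a[i] = b[i] + c[i];
}

/* Same loop with a long index (64 bits on LP64 targets), as with "long i".  */
void
add_long_index (double *a, const double *b, const double *c, long n)
{
  assert (n >= 0);
  for (long i = 0; i < n; i++)
    a[i] = b[i] + c[i];
}
```

Compiling this with something like "gcc -O2 -S" and comparing the two loops
(is there a bdnz, how many insns, is the loop aligned) is one way to see
whether the index type alone changes what ivopts does.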