https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67732
--- Comment #2 from Kazumoto Kojima <kkojima at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #1)
> There are some improvements with LRA, but the regression in teem-1.6.0-src
>
> total: 1105728 -> 1122288        +16560 / +1.497656 %
>
> outweighs all of the improvements.

teem shows problems with -mlra.  I'm not sure whether all of them are pure
RA issues or not, though.  The typical inflation with LRA in teem comes from
operations on float matrices.  An example

void
foo (float mat[])
{
  float *p = &mat[16];
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
}

shows a ~20% code size regression with -O2.  At .ira, it looks like:

  r165 := 0.0
  ...
  r167 := r162 + 56
  mem[r167] := r165
  r170 := r162 + 52
  mem[r170] := r165
  r173 := r162 + 48
  mem[r173] := r165
  ...

and the old reload generates:

  fr1 := 0.0
  ...
  r1 := r4
  r1 := r1 + 56
  mem[r1] := fr1
  r1 := r4
  r1 := r1 + 52
  mem[r1] := fr1
  r1 := r4
  r1 := r1 + 48
  mem[r1] := fr1
  ...

Then postreload fixes it up to

  fr1 := 0.0
  ...
  r1 := r1 + 56
  mem[r1] := fr1
  r1 := r1 - 4
  mem[r1] := fr1
  r1 := r1 - 4
  mem[r1] := fr1
  ...

OTOH, LRA generates:

  fr1 := 0.0
  ...
  r1 := r4 + 56
  mem[r1] := fr1
  r1 := r4 + 52
  mem[r1] := fr1
  r1 := r4 + 48
  mem[r1] := fr1
  ...

Unfortunately postreload doesn't fix this up, and split2 then turns it into
code similar to what the old reload generates.  I've run postreload_cse
again just after pass_split_after_reload and got the same text size as with
the old reload for the above example at -mlra.

The CSiBE total sizes are:

  3303471  -mno-lra  without 2nd postreload_cse
  3315689  -mlra     without 2nd postreload_cse
  3293271  -mno-lra  with 2nd postreload_cse
  3299813  -mlra     with 2nd postreload_cse

and the teem-only comparison is:

  1079932  -mno-lra  without 2nd postreload_cse
  1094468  -mlra     without 2nd postreload_cse
  1073104  -mno-lra  with 2nd postreload_cse
  1082764  -mlra     with 2nd postreload_cse
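
For anyone who wants to reproduce this, here is a minimal self-check of what
the example stores.  It is only a sketch and assumes a 4-byte float, as on
SH.  The helper names store_offset and check_foo are made up for
illustration and are not part of teem or GCC.  The k-th store (counting from
0) lands at byte offset (15 - k) * sizeof (float) from mat, which is where
the constants 56, 52, 48 in the dumps come from; the first store at offset
60 presumably falls within the elided part of the dumps.

```c
#include <assert.h>
#include <stddef.h>

/* The function from the example above, with the same eight stores.  */
void
foo (float mat[])
{
  float *p = &mat[16];
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
  *--p = 0.0;
}

/* Hypothetical helper: byte offset from mat of the k-th store in foo,
   counting from k = 0.  The pre-decrement from &mat[16] means the first
   store hits mat[15].  */
size_t
store_offset (int k)
{
  return (size_t) (15 - k) * sizeof (float);
}

/* Hypothetical helper: run foo on a 16-element array filled with 1.0f,
   assert that only the upper eight elements were zeroed, and return the
   number of zeroed elements.  */
int
check_foo (void)
{
  float mat[16];
  int zeroed = 0;

  for (int i = 0; i < 16; i++)
    mat[i] = 1.0f;

  foo (mat);

  for (int i = 0; i < 8; i++)
    assert (mat[i] == 1.0f);    /* lower half untouched */
  for (int i = 8; i < 16; i++)
    {
      assert (mat[i] == 0.0f);  /* upper half zeroed */
      zeroed++;
    }
  return zeroed;
}
```

Calling check_foo () from a test driver should return 8.  Compiling only foo
with -O2 -S, once with -mlra and once with -mno-lra, is one way to compare
the address computations shown in the dumps.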