https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90595
Bug ID:           90595
Summary:          LRA liveness analysis is slow
Product:          gcc
Version:          10.0
Status:           UNCONFIRMED
Severity:         normal
Priority:         P3
Component:        rtl-optimization
Assignee:         unassigned at gcc dot gnu.org
Reporter:         rguenth at gcc dot gnu.org
Target Milestone: ---

Split out from PR88440.

As of today you'll see a big compile-time increase for compiling
module_configure.fppized.f90 of 521.wrf_r at -O2.

Time profile before/after:

╔══════════════════════════╤════════╤════════╤═════════╗
║ PASS                     │ Before │ After  │ Change  ║
╠══════════════════════════╪════════╪════════╪═════════╣
║ backwards jump threading │   6.29 │   6.16 │  97.93% ║
║ integrated RA            │   6.76 │   6.41 │  94.82% ║
║ tree SSA incremental     │   9.01 │  11.16 │ 123.86% ║
║ LRA create live ranges   │  15.68 │  40.02 │ 255.23% ║
║ PRE                      │  23.24 │  32.32 │ 139.07% ║
║ alias stmt walking       │  27.69 │  28.75 │ 103.83% ║
║ phase opt and generate   │ 124.13 │ 163.95 │ 132.08% ║
║ TOTAL                    │ 125.39 │ 165.17 │ 131.73% ║
╚══════════════════════════╧════════╧════════╧═════════╝

so LRA live ranges is already slow before.

perf profiling after the change shows

Samples: 579  of event 'cycles:ppp', Event count (approx.): 257134187434191
Overhead  Command  Shared Object  Symbol
  22.26%  f951     f951           [.] process_bb_lives
  15.06%  f951     f951           [.] ix86_hard_regno_call_part_clobbered
   8.55%  f951     f951           [.] concat
   6.88%  f951     f951           [.] find_base_term
   3.60%  f951     f951           [.] get_ref_base_and_extent
   3.27%  f951     f951           [.] find_base_term
   2.95%  f951     f951           [.] make_hard_regno_dead

which IMHO points at

static inline void
check_pseudos_live_through_calls (int regno,
                                  HARD_REG_SET last_call_used_reg_set,
                                  rtx_insn *call_insn)
{
...
  for (hr = 0; HARD_REGISTER_NUM_P (hr); hr++)
    if (targetm.hard_regno_call_part_clobbered (call_insn, hr,
                                                PSEUDO_REGNO_MODE (regno)))
      add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
                           PSEUDO_REGNO_MODE (regno), hr);

where we do a lot of redundant work because we call this function in a
loop possibly very many times with the same arguments besides regno
(but with same PSEUDO_REGNO_MODE).
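
One possible way to avoid the redundancy described above, sketched as standalone C rather than as a patch: for a given call insn, compute the set of partly clobbered hard registers once per mode and OR the cached set into each pseudo's conflicts. Everything in this sketch is a hypothetical stand-in and not GCC code: hard_reg_set is a plain 64-bit mask, part_clobbered_p plays the role of targetm.hard_regno_call_part_clobbered with a made-up rule, and modes are small integers. It also sets one bit per register, whereas add_to_hard_reg_set adds every register the mode occupies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not GCC's real types or sizes.  */
enum { NUM_HARD_REGS = 64, NUM_MODES = 8 };
typedef uint64_t hard_reg_set;	/* one bit per hard register */

static unsigned long hook_calls;	/* how often the "target hook" ran */

/* Stand-in for targetm.hard_regno_call_part_clobbered.  The rule is made
   up: odd-numbered registers are partly clobbered in modes >= 4.  */
static bool
part_clobbered_p (int hr, int mode)
{
  hook_calls++;
  return mode >= 4 && (hr & 1);
}

/* Cache for a single call insn: one lazily filled register set per mode.  */
struct call_clobber_cache
{
  bool valid[NUM_MODES];
  hard_reg_set set[NUM_MODES];
};

/* Return the partly clobbered registers for MODE, walking all hard
   registers only the first time MODE is seen for this call.  */
static hard_reg_set
part_clobbered_set (struct call_clobber_cache *cache, int mode)
{
  assert (mode >= 0 && mode < NUM_MODES);
  if (!cache->valid[mode])
    {
      hard_reg_set s = 0;
      for (int hr = 0; hr < NUM_HARD_REGS; hr++)
	if (part_clobbered_p (hr, mode))
	  s |= (hard_reg_set) 1 << hr;
      cache->set[mode] = s;
      cache->valid[mode] = true;
    }
  return cache->set[mode];
}

/* Add conflicts for N_PSEUDOS pseudos that live through the same call.
   PSEUDO_MODE[i] is the mode of pseudo i, CONFLICTS[i] its conflict set.  */
static void
add_call_conflicts (const int *pseudo_mode, hard_reg_set *conflicts,
		    int n_pseudos)
{
  struct call_clobber_cache cache = { 0 };
  for (int i = 0; i < n_pseudos; i++)
    conflicts[i] |= part_clobbered_set (&cache, pseudo_mode[i]);
}
```

With this shape the hook runs NUM_HARD_REGS times per distinct mode per call instead of NUM_HARD_REGS times per pseudo, so 1000 pseudos in two modes cost 128 hook calls rather than 64000. Whether a per-call cache like this fits into process_bb_lives is not something the sketch settles.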