https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832
--- Comment #16 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kewen Lin <[email protected]>:

https://gcc.gnu.org/g:c776dcd5f868a16661b86842916493b531988d1e

commit r17-258-gc776dcd5f868a16661b86842916493b531988d1e
Author: Kewen Lin <[email protected]>
Date:   Fri May 1 13:50:57 2026 +0000

    i386: Adjust some c86-4g*.md modeling to reduce build time

    Commit r17-203 caused a significant increase in GCC build time on
    several environments as folks reported, mainly due to excessively
    long execution time of genautomata.  As Alexander pointed out, the
    current division modeling in c86-4g*.md can cause a combinatorial
    explosion in the automaton, that further leads to significant
    build time increase.

    Following Alexander's suggestion, this patch introduces the
    dedicated automatons and cpu_units for idiv and fdiv, uses them to
    update the integer, floating point division and square root
    modeling for now.

    Some evaluated statistics are listed below.

    With r17-202:

    *Tested stage-1 i686 build -j 32: 255 seconds*

    $ nm -CS -t d --defined-only gcc/insn-automata.o \
        | sed 's/^[0-9]* 0*//' \
        | sort -n | tail -20
    13896 r slm_transitions
    15360 r znver4_fp_store_transitions
    16760 r znver4_ieu_transitions
    17776 r bdver1_ieu_transitions
    20068 r bdver1_fp_check
    20068 r bdver1_fp_transitions
    20983 t internal_state_transition(int, DFA_chip*)
    22270 t internal_min_issue_delay(int, DFA_chip*)
    26208 r slm_min_issue_delay
    27244 r bdver1_fp_min_issue_delay
    28518 r glm_check
    28518 r glm_transitions
    33690 r geode_min_issue_delay
    45436 r znver4_fpu_min_issue_delay
    46980 r bdver3_fp_min_issue_delay
    49428 r glm_min_issue_delay
    53730 r btver2_fp_min_issue_delay
    53760 r znver1_fp_transitions
    93960 r bdver3_fp_transitions
    181744 r znver4_fpu_transitions

    With culprit commit r17-203:

    *Tested stage-1 i686 build -j 32: 949 seconds*

    $ nm -CS -t d --defined-only gcc/insn-automata.o \
        | sed 's/^[0-9]* 0*//' \
        | sort -n | tail -20
    28518 r glm_check
    28518 r glm_transitions
    33690 r geode_min_issue_delay
    45436 r znver4_fpu_min_issue_delay
    46980 r bdver3_fp_min_issue_delay
    49428 r glm_min_issue_delay
    53730 r btver2_fp_min_issue_delay
    53760 r znver1_fp_transitions
    68160 r c86_4g_ieu_min_issue_delay
    93960 r bdver3_fp_transitions
    110080 r c86_4g_fp_min_issue_delay
    136320 r c86_4g_ieu_transitions
    181744 r znver4_fpu_transitions
    220160 r c86_4g_fp_transitions
    262988 r c86_4g_m7_fpu_base
    475225 r c86_4g_m7_ieu_min_issue_delay
    950450 r c86_4g_m7_ieu_transitions
    4010567 r c86_4g_m7_fpu_min_issue_delay
    5496908 r c86_4g_m7_fpu_check
    5496908 r c86_4g_m7_fpu_transitions

    With this patch:

    *Tested stage-1 i686 build -j 32: 257 seconds*

    $ nm -CS -t d --defined-only gcc/insn-automata.o \
        | sed 's/^[0-9]* 0*//' \
        | sort -n | tail -20
    20068 r bdver1_fp_transitions
    22354 r c86_4g_m7_ieu_min_issue_delay
    25705 t internal_state_transition(int, DFA_chip*)
    26208 r slm_min_issue_delay
    27164 t internal_min_issue_delay(int, DFA_chip*)
    27244 r bdver1_fp_min_issue_delay
    28518 r glm_check
    28518 r glm_transitions
    33690 r geode_min_issue_delay
    33728 r c86_4g_fp_transitions
    45436 r znver4_fpu_min_issue_delay
    46980 r bdver3_fp_min_issue_delay
    49428 r glm_min_issue_delay
    53730 r btver2_fp_min_issue_delay
    53760 r znver1_fp_transitions
    89414 r c86_4g_m7_ieu_transitions
    93960 r bdver3_fp_transitions
    181744 r znver4_fpu_transitions
    326322 r c86_4g_m7_fpu_min_issue_delay
    1305288 r c86_4g_m7_fpu_transitions

    I noticed the number of c86_4g_m7_fpu_transitions is still large,
    but this patch can address the build time issue.  To avoid
    impacting folks' daily builds and regular testing, I'd like to
    land this patch first if possible.  We can then further refine the
    c86-4g modeling and investigate large transition count as part of
    the follow-up work, even potentially part of PR 87832.

    gcc/ChangeLog:

            * config/i386/c86-4g-m7.md (c86_4g_m7_idiv): New automaton.
            (c86_4g_m7_fdiv): Ditto.
            (c86-4g-m7-idiv): New unit.
            (c86-4g-m7-fdiv): Ditto.
            (c86_4g_m7_idiv_DI): Adjust unit in the reservation.
            (c86_4g_m7_idiv_SI): Ditto.
            (c86_4g_m7_idiv_HI): Ditto.
            (c86_4g_m7_idiv_QI): Ditto.
            (c86_4g_m7_idiv_DI_load): Ditto.
            (c86_4g_m7_idiv_SI_load): Ditto.
            (c86_4g_m7_idiv_HI_load): Ditto.
            (c86_4g_m7_idiv_QI_load): Ditto.
            (c86_4g_m7_fp_div): Ditto.
            (c86_4g_m7_fp_div_load): Ditto.
            (c86_4g_m7_fp_idiv_load): Ditto.
            (c86_4g_m7_avx512_ssediv): Ditto.
            (c86_4g_m7_avx512_ssediv_mem): Ditto.
            (c86_4g_m7_avx512_ssediv_z): Ditto.
            (c86_4g_m7_avx512_ssediv_zmem): Ditto.
            (c86_4g_m7_avx512_sse_sqrt): Ditto.
            (c86_4g_m7_avx512_sse_sqrt_load): Ditto.
            (c86_4g_m7_fp_sqrt): Ditto.  Rename from ...
            (c86_4g_m7fp_sqrt): ... here.
            * config/i386/c86-4g.md (c86_4g_idiv): New automaton.
            (c86_4g_fdiv): Ditto.
            (c86-4g-idiv): New unit.
            (c86-4g-fdiv): Ditto.
            (c86_4g_idiv_DI): Ditto.
            (c86_4g_idiv_SI): Ditto.
            (c86_4g_idiv_HI): Ditto.
            (c86_4g_idiv_QI): Ditto.
            (c86_4g_idiv_mem_DI): Ditto.
            (c86_4g_idiv_mem_SI): Ditto.
            (c86_4g_idiv_mem_HI): Ditto.
            (c86_4g_idiv_mem_QI): Ditto.
            (c86_4g_fp_sqrt): Ditto.
            (c86_4g_sse_sqrt_sf): Ditto.
            (c86_4g_sse_sqrt_sf_mem): Ditto.
            (c86_4g_sse_sqrt_df): Ditto.
            (c86_4g_sse_sqrt_df_mem): Ditto.
            (c86_4g_fp_op_div): Ditto.
            (c86_4g_fp_op_div_load): Ditto.
            (c86_4g_fp_op_idiv_load): Ditto.
            (c86_4g_ssediv_ss_ps): Ditto.
            (c86_4g_ssediv_ss_ps_load): Ditto.
            (c86_4g_ssediv_ss_pd): Ditto.
            (c86_4g_ssediv_ss_pd_load): Ditto.
            (c86_4g_ssediv_avx256_ps): Ditto.
            (c86_4g_ssediv_avx256_ps_load): Ditto.
            (c86_4g_ssediv_avx256_pd): Ditto.
            (c86_4g_ssediv_avx256_pd_load): Ditto.

    Signed-off-by: Kewen Lin <[email protected]>
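
The commit message attributes the build-time regression to a combinatorial explosion when long-running division reservations share an automaton with other units. The following is only a toy sketch of that effect, not genautomata's actual algorithm: each unit is modeled as a simple busy counter, the function name reachable_states is hypothetical, and the latencies (1, 1, 3 cycles for the ordinary units, 20 cycles for the divider) are made-up values rather than figures from c86-4g*.md. Under those assumptions, a shared automaton has roughly the product of the per-unit state counts, while a split one has their sum.

```python
def reachable_states(latencies):
    """Count reachable states of a toy automaton.

    Each unit is a counter holding the cycles it stays busy.  A state
    is the tuple of all counters.  This is an illustrative model only,
    not how genautomata represents reservations.
    """
    start = tuple(0 for _ in latencies)
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        # Advance one cycle: every busy unit moves one cycle closer to free.
        successors = [tuple(max(c - 1, 0) for c in state)]
        # Issue on any free unit, reserving it for its full latency.
        for i, latency in enumerate(latencies):
            if state[i] == 0:
                successors.append(state[:i] + (latency,) + state[i + 1:])
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


# Hypothetical latencies, chosen only for illustration.
ports = (1, 1, 3)
divider = (20,)

shared = reachable_states(ports + divider)                   # one automaton
split = reachable_states(ports) + reachable_states(divider)  # two automata
print(shared, split)  # prints "336 37"
```

In this model the shared automaton needs 2 * 2 * 4 * 21 = 336 states, whereas moving the divider into its own automaton needs 16 + 21 = 37. That is the same direction of change as the drop in c86_4g_m7_*_transitions sizes reported above, although the real tables depend on details this sketch ignores.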
