On Mon, Feb 25, 2019 at 2:53 PM Jakub Jelinek <ja...@redhat.com> wrote: > > Hi! > > The following patch fixes two issues in the new rpad pass. > One is that the insertion at the start of a basic block didn't work properly > if the basic block didn't contain any non-NOTE/non-DEBUG_INSN instructions. > next_nonnote_nondebug_insn hapilly turns through into another basic block > and the insertion can insert an instruction in between basic blocks or into > a different basic block from where we wanted to emit it. > I believe we want to emit it after CODE_LABEL, after NOTE_INSN_BASIC_BLOCK > and if possible, after debug insns in there, the patch emits it before the > first normal insn in the bb if any, or after the BB_END (thus extending > BB_END). > > Another issue is that it is quite weird/dangerous to add the v4sf_const0 > pseudo uses in lots of places in the IL, register those changes with df, > then do df_analyze with different flags and finally emit the setter. > I understand the goal was not to do df_analyze etc. in the usual case where > there are no instructions that need this treatment. This patch does the > df_analyze at the spot we find the first insn, but before we actually change > that instruction, so the changes are after the df_analyze. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2019-02-25 Jakub Jelinek <ja...@redhat.com> > > PR target/89474 > * config/i386/i386.c (remove_partial_avx_dependency): Call > df_analyze etc. before creation of the v4sf_const0 pseudo, rather than > after changing possibly many instructions to use that pseudo. Fix up > insertion of v4sf_const0 setter at the start of bb. > > * gcc.target/i386/pr89474.c: New test. > > --- gcc/config/i386/i386.c.jj 2019-02-22 23:02:47.805117610 +0100 > +++ gcc/config/i386/i386.c 2019-02-25 14:20:05.793608879 +0100 > @@ -2835,7 +2835,14 @@ remove_partial_avx_dependency (void) > continue; > > if (!v4sf_const0) > - v4sf_const0 = gen_reg_rtx (V4SFmode); > + { > + calculate_dominance_info (CDI_DOMINATORS); > + df_set_flags (DF_DEFER_INSN_RESCAN); > + df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN); > + df_md_add_problem (); > + df_analyze (); > + v4sf_const0 = gen_reg_rtx (V4SFmode); > + } > > /* Convert PARTIAL_XMM_UPDATE_TRUE insns, DF -> SF, SF -> DF, > SI -> SF, SI -> DF, DI -> SF, DI -> DF, to vec_dup and > @@ -2883,12 +2890,6 @@ remove_partial_avx_dependency (void) > > if (v4sf_const0) > { > - calculate_dominance_info (CDI_DOMINATORS); > - df_set_flags (DF_DEFER_INSN_RESCAN); > - df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN); > - df_md_add_problem (); > - df_analyze (); > - > /* (Re-)discover loops so that bb->loop_father can be used in the > analysis below. */ > loop_optimizer_init (AVOID_CFG_MODIFICATIONS); > @@ -2904,11 +2905,23 @@ remove_partial_avx_dependency (void) > bb = get_immediate_dominator (CDI_DOMINATORS, > bb->loop_father->header); > > - insn = BB_HEAD (bb); > - if (!NONDEBUG_INSN_P (insn)) > - insn = next_nonnote_nondebug_insn (insn); > set = gen_rtx_SET (v4sf_const0, CONST0_RTX (V4SFmode)); > - set_insn = emit_insn_before (set, insn); > + > + insn = BB_HEAD (bb); > + while (insn && !NONDEBUG_INSN_P (insn)) > + { > + if (insn == BB_END (bb)) > + { > + insn = NULL; > + break; > + } > + insn = NEXT_INSN (insn); > + } > + if (insn == BB_HEAD (bb)) > + set_insn = emit_insn_before (set, insn); > + else > + set_insn = emit_insn_after (set, > + insn ? PREV_INSN (insn) : BB_END (bb)); > df_insn_rescan (set_insn); > df_process_deferred_rescans (); > loop_optimizer_finalize (); > --- gcc/testsuite/gcc.target/i386/pr89474.c.jj 2019-02-25 14:21:51.651867104 > +0100 > +++ gcc/testsuite/gcc.target/i386/pr89474.c 2019-02-25 14:21:34.373151405 > +0100 > @@ -0,0 +1,14 @@ > +/* PR target/89474 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mavx" } */ > + > +int a; > +void foo (double); > +int baz (void); > + > +void > +bar (void) > +{ > + while (baz ()) > + foo (a); > +} > > Jakub
Thanks. -- H.J.