Sorry for the late reply. I just got back from a week's vacation.
I was planning to finish this patch after the vacation, but it seems that you
have almost finished it.
That's great! Thank you so much.


juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2022-10-07 20:24
To: juzhe.zhong
CC: gcc-patches; richard.sandiford
Subject: Re: [PATCH] Add first-order recurrence autovectorization
On Thu, Oct 6, 2022 at 3:07 PM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Thu, Oct 6, 2022 at 2:13 PM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Fri, Sep 30, 2022 at 10:00 AM <juzhe.zh...@rivai.ai> wrote:
> > >
> > > From: Ju-Zhe Zhong <juzhe.zh...@rivai.ai>
> > >
> > > Hi, after fixing the previous ICE, I have added the full implementation
> > > (inserting a permutation to get the correct result).
> > >
> > > The gimple IR is correct now I think:
> > >   # t_21 = PHI <_4(6), t_12(9)>
> > >   # i_22 = PHI <i_17(6), 0(9)>
> > >   # vectp_a.6_26 = PHI <vectp_a.6_25(6), a_14(D)(9)>
> > >   # vect_vec_recur_.9_9 = PHI <vect__4.8_19(6), vect_cst__18(9)>
> > >   # vectp_b.11_7 = PHI <vectp_b.11_30(6), b_15(D)(9)>
> > >   # curr_cnt_36 = PHI <next_cnt_35(6), _32(9)>
> > >   # loop_len_20 = PHI <next_len_34(6), _32(9)>
> > >   _38 = .WHILE_LEN (loop_len_20, 32, POLY_INT_CST [4, 4]);
> > >   while_len_37 = _38;
> > >   _1 = (long unsigned int) i_22;
> > >   _2 = _1 * 4;
> > >   _3 = a_14(D) + _2;
> > >   vect__4.8_19 = .LEN_LOAD (vectp_a.6_26, 32B, loop_len_20, 0);
> > >   _4 = *_3;
> > >   _5 = b_15(D) + _2;
> > >   vect_vec_recur_.9_9 = VEC_PERM_EXPR <vect_vec_recur_.9_9, vect__4.8_19, 
> > > { POLY_INT_CST [3, 4], POLY_INT_CST [4, 4], POLY_INT_CST [5, 4], ... }>;
> > >
> > > But I encounter another ICE:
> > > 0x169e0e7 process_bb
> > >         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:7498
> > > 0x16a09af do_rpo_vn(function*, edge_def*, bitmap_head*, bool, bool, vn_lookup_kind)
> > >         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:8109
> > > 0x16a0fe7 do_rpo_vn(function*, edge_def*, bitmap_head*)
> > >         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:8205
> > > 0x179b7db execute
> > >         ../../../riscv-gcc/gcc/tree-vectorizer.cc:1365
> > >
> > > Could you help me with this? After fixing this ICE, I think the loop
> > > vectorizer can run correctly. Maybe you can test it on x86 or ARM after
> > > fixing this ICE.
> >
> > Sorry for the late reply, the issue is that we have
> >
> > vect_vec_recur_.7_7 = VEC_PERM_EXPR <vect_vec_recur_.7_7, vect__4.6_9,
> > { 7, 8, 9, 10, 11, 12, 13, 14 }>;
> >
> > thus
> >
> > +      for (unsigned i = 0; i < ncopies; ++i)
> > +       {
> > +         gphi *phi = as_a<gphi *> (STMT_VINFO_VEC_STMTS (def_stmt_info)[i]);
> > +         tree latch = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop));
> > +         tree recur = gimple_phi_result (phi);
> > +         gassign *assign
> > +           = gimple_build_assign (recur, VEC_PERM_EXPR, recur, latch, perm);
> > +         gimple_assign_set_lhs (assign, recur);
> >
> > needs to create a new SSA name for each LHS.  You shouldn't create code in
> > vect_get_vec_defs_for_operand either.
> >
> > Let me mangle the patch a bit.
> >
> > The attached is what I came up with; the permutes need to be generated when
> > the backedge PHI values are filled in.  Still missing is ncopies > 1 handling
> > (we'd need to think about how the initial value and the permutes would work
> > there), SLP support and, more importantly, handling in the epilogue (so on
> > x86 this requires a constant loop bound).
> > I've added a testcase that triggers on x86_64.
>
> Actually I broke it, the following is more correct.
 
So let me finish the patch.  I have everything besides the epilogue
handling done; I'll get to that sometime next week.
 
Richard.
 
> Richard.
>
> > Richard.
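
For readers following the thread, the effect of the permute with selector
{ 7, 8, ..., 14 } on 8-lane vectors can be modelled in plain C++. This is
only an illustrative sketch, not GCC code: the scalar loop
(b[i] = a[i] - t; t = a[i]), the fixed lane count VF, and the helper names
vec_perm, scalar_recurrence and vector_recurrence are all assumptions made
for the example. It also assumes the trip count is a multiple of VF, so no
epilogue is needed. Note that the permute result is stored in a fresh
variable (prev) instead of overwriting the recurrence value, which mirrors
the point above about creating a new SSA name for the LHS.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed vectorization factor (8 lanes, as in the { 7, ..., 14 } selector).
constexpr std::size_t VF = 8;
using Vec = std::array<int, VF>;

// Models VEC_PERM_EXPR <v0, v1, sel>: a selector value below VF picks a lane
// of v0, otherwise it picks lane (value - VF) of v1.
Vec vec_perm(const Vec &v0, const Vec &v1, const std::array<std::size_t, VF> &sel)
{
  Vec r{};
  for (std::size_t i = 0; i < VF; ++i)
    {
      assert(sel[i] < 2 * VF);
      r[i] = sel[i] < VF ? v0[sel[i]] : v1[sel[i] - VF];
    }
  return r;
}

// Scalar reference: an assumed first-order recurrence, where each iteration
// uses the value loaded in the previous iteration.
std::vector<int> scalar_recurrence(const std::vector<int> &a, int t0)
{
  std::vector<int> b(a.size());
  int t = t0;
  for (std::size_t i = 0; i < a.size(); ++i)
    {
      b[i] = a[i] - t;
      t = a[i];
    }
  return b;
}

// Vector model: the recurrence vector carries the previous iteration's load;
// the permute shifts in its last lane ahead of the first VF - 1 lanes of the
// current load.
std::vector<int> vector_recurrence(const std::vector<int> &a, int t0)
{
  assert(a.size() % VF == 0);  // no epilogue handling in this sketch

  std::array<std::size_t, VF> sel{};
  for (std::size_t i = 0; i < VF; ++i)
    sel[i] = VF - 1 + i;       // { 7, 8, ..., 14 } for VF == 8

  Vec recur;
  recur.fill(t0);              // preheader value; only the last lane is used

  std::vector<int> b(a.size());
  for (std::size_t base = 0; base < a.size(); base += VF)
    {
      Vec cur{};
      for (std::size_t lane = 0; lane < VF; ++lane)
        cur[lane] = a[base + lane];

      // New value for the permute result; recur itself is left untouched here.
      const Vec prev = vec_perm(recur, cur, sel);

      for (std::size_t lane = 0; lane < VF; ++lane)
        b[base + lane] = cur[lane] - prev[lane];

      recur = cur;             // backedge value for the next vector iteration
    }
  return b;
}
```

Comparing vector_recurrence against scalar_recurrence on the same input is
a quick way to check that the selector and the initial value are consistent.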
 
