On Fri, Oct 26, 2018 at 9:19 AM H.J. Lu <[email protected]> wrote:
>
> On 10/25/18, Uros Bizjak <[email protected]> wrote:
> > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu <[email protected]> wrote:
> >>
> >> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
> >> a wrong way. For example:
> >>
> >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> >> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >> (any_extend:V8HI
> >> (vec_select:V8QI
> >> (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
> >> (parallel [(const_int 0) (const_int 1)
> >> (const_int 2) (const_int 3)
> >> (const_int 4) (const_int 5)
> >> (const_int 6) (const_int 7)]))))]
> >>
> >> should be defind for memory operands as:
> >>
> >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> >> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >> (any_extend:V8HI
> >> (match_operand:V8QI "memory_operand" "m,m,m")))]
> >>
> >> This set of patches updates them to
> >>
> >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> >> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >> (any_extend:V8HI
> >> (vec_select:V8QI
> >> (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
> >> (parallel [(const_int 0) (const_int 1)
> >> (const_int 2) (const_int 3)
> >> (const_int 4) (const_int 5)
> >> (const_int 6) (const_int 7)]))))]
> >>
> >> (define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
> >> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >> (any_extend:V8HI
> >> (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
> >>
> >> with a splitter:
> >>
> >> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
> >> [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >
> > No constraints needed for pre-reload splitter.
> >
> >> (any_extend:V8HI
> >> (vec_select:V8QI
> >> (subreg:V16QI
> >> (vec_concat:V2DI
> >> (match_operand:DI 1 "memory_operand" "m,*m,m")
> >> (const_int 0)) 0)
> >> (parallel [(const_int 0) (const_int 1)
> >> (const_int 2) (const_int 3)
> >> (const_int 4) (const_int 5)
> >> (const_int 6) (const_int 7)]))))]
> >> "TARGET_SSE4_1 && <mask_avx512bw_condition> &&
> >> <mask_avx512vl_condition>"
> >> "#"
> >> "&& can_create_pseudo_p ()"
> >> [(set (match_dup 0) (match_dup 1))]
> >
> > [(set (match_dup 0)
> > (any_extend:V8HI (match_dup 1)))]
> >
> >> {
> >> operands[1] = gen_rtx_<CODE> (V8HImode,
> >> gen_rtx_SUBREG (V8QImode,
> >> operands[1], 0));
> >> })
> >
> > Don't create subregs of memory. Use adjust_address_nv.
>
> Here is the updated patch.
> with a splitter:
>
> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
> [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
> (vec_select:V8QI
> (subreg:V16QI
> (vec_concat:V2DI
> (match_operand:DI 1 "memory_operand")
> (const_int 0)) 0)
> (parallel [(const_int 0) (const_int 1)
> (const_int 2) (const_int 3)
> (const_int 4) (const_int 5)
> (const_int 6) (const_int 7)]))))]
> "TARGET_SSE4_1 && <mask_avx512bw_condition> && <mask_avx512vl_condition>"
> "#"
> "&& can_create_pseudo_p ()"
"can_create_pseudo_p ()" should go to the insn constraint and "&& 1"
should be used for split constraint. Both, insn and splitter are valid
only before reload.
> [(set (match_dup 0)
> (any_extend:V8HI (match_dup 1)))]
> {
> operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
> })
Please use double quotes for one-line preparation statement.
> (any_extend:V4SI
> (match_operand:V4HI 1 "memory_operand" "m,*m,m")))]
Please remove star in front of memory constraint.
OK with the above changes.
Thanks,
Uros.