On Fri, Jan 27, 2017 at 12:11:05PM -0600, Aaron Sawdey wrote:
> The updated memcmp-1 testcase passes on ppc64le (p8/p9), ppc64 (p8/p9),
>  ppc32 (p8), and x86_64. Bootstrap was successful on ppc64/ppc64le.
> Assuming regtest on ppc64/ppc64le passes, ok for trunk?

> +          ldbrx 10,6,9
> +          ldbrx 9,7,9
> +          subf. 9,9,10
> +          bne 0,.L8

This should read "subfc. 9,9,10", matching what the new pattern emits.

> +          addi 9,4,7
> +          lwbrx 10,0,9
> +          addi 9,5,7
> +          lwbrx 9,0,9

It would be nice if this were

        li 9,7
        lwbrx 10,9,4
        lwbrx 9,9,5

but that is a generic problem I bet.

> +          subfc 9,9,10
> +          b .L9
>       .L8: # convert_label
> -             cntlzd 9,9
> -             addi 9,9,-1
> -             xori 9,9,0x3f
> +          subfe 10,10,10
> +          popcntd 9,9
> +          rldimi 9,10,6,0
>       .L9: # final_label

The code no longer generates rldimi; it always emits a plain "or".

> +      while maintaining <0 / ==0 / >0 properties. This sequence works:
> +      subfc L,A,B
> +      subfe H,H,H
> +      popcntd L,L
> +      rldimi L,H,6,0

"or" here, as well.
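
To sanity-check the <0 / ==0 / >0 claim, here is a small C model of the
sequence with "or" in place of rldimi.  It is only a sketch: cmp_model and
popcount64 are names made up for illustration and are not part of the patch.
It assumes that subfc L,A,B computes B - A and sets CA when there is no
borrow, and that subfe H,H,H therefore yields CA - 1.

```c
#include <stdint.h>

/* Portable population count; stands in for popcntd (hypothetical helper). */
static int popcount64(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical model of:
 *   subfc   L,A,B
 *   subfe   H,H,H
 *   popcntd L,L
 *   or      L,L,H
 * A and B are treated as unsigned 64-bit values.
 */
int64_t cmp_model(uint64_t a, uint64_t b)
{
    uint64_t l  = b - a;        /* subfc: L = B - A (mod 2^64) */
    uint64_t ca = (b >= a);     /* CA = 1 when the subtract does not borrow */
    uint64_t h  = ca - 1;       /* subfe H,H,H: ~H + H + CA = CA - 1,
                                   so 0 when B >= A and all ones when B < A */
    l = (uint64_t) popcount64(l);   /* popcntd: 0 only when A == B */
    /* or: all ones (-1) when B < A, otherwise the popcount (0..64).
       The cast relies on the usual two's-complement conversion. */
    return (int64_t) (l | h);
}
```

For example, cmp_model (5, 5) is 0, cmp_model (3, 7) is positive, and
cmp_model (7, 3) is -1.  With A = 0 and B all ones the popcount is 64, which
sets bit 6 of L, and the "or" passes that through unchanged.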

> --- gcc/config/rs6000/rs6000.md       (revision 244952)
> +++ gcc/config/rs6000/rs6000.md       (working copy)
> @@ -2068,6 +2068,35 @@
>    "subfic %0,%1,%2"
>    [(set_attr "type" "add")])
>  
> +(define_insn_and_split "subf<mode>3_carry_dot2"
> +  [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
> +     (compare:CC (minus:P (match_operand:P 2 "gpc_reg_operand" "r,r")
> +                            (match_operand:P 1 "gpc_reg_operand" "r,r"))
> +                 (const_int 0)))
> +   (set (match_operand:P 0 "gpc_reg_operand" "=r,r")
> +     (minus:P (match_dup 2)
> +                (match_dup 1)))
> +   (set (reg:P CA_REGNO)
> +     (leu:P (match_dup 1)
> +            (match_dup 2)))]
> +  "<MODE>mode == Pmode"
> +  "@
> +   subfc. %0,%1,%2
> +   #"
> +  "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)"

So far so good...

> +  [(set (reg:P CA_REGNO)
> +     (leu:P (match_dup 1)
> +            (match_dup 2)))
> +   (set (match_dup 0)
> +     (minus:P (match_dup 2)
> +                (match_dup 1)))
> +   (set (match_dup 3)
> +     (compare:CC (match_dup 0)
> +                 (const_int 0)))]

This needs a "parallel" around the two pieces that together are the subfc
instruction you split to.  They also need to be in the correct order.  So:

  [(parallel [(set (match_dup 0)
                   (minus:P (match_dup 2)
                            (match_dup 1)))
              (set (reg:P CA_REGNO)
                   (leu:P (match_dup 1)
                          (match_dup 2)))])
   (set (match_dup 3)
        (compare:CC (match_dup 0)
                    (const_int 0)))]

The rest looks good.  With those fixes the patch is approved for trunk
(if all testing works out ;-) )

Thanks,


Segher
