On Tue, Jan 8, 2019 at 12:43 PM Jakub Jelinek <ja...@redhat.com> wrote:
>
> On Tue, Jan 08, 2019 at 11:49:10AM +0100, Uros Bizjak wrote:
> > FLD from memory in SF and DFmode is considered a conversion, and
> > converts sNaN to QNaN (and raises the #IA exception). But sNaN handling is
> > already busted in the compiler as RA is free to spill the register in
> > non-XFmode. IMO, the peephole2 pattern is no worse than the current
> > situation.
>
> Ok.
>
> > At least for x86, there are no SUBREGs after reload, otherwise other
> > parts of the compiler would break.
>
> The new patch would really handle even a SUBREG there...
>
> > > > I don't see how, that would mean I'd have to write two peephole2s
> > > > instead of one.  It tries to deal with two different cases, one is
> > > > where the temporary reg is dead, in that case we can optimize away
> > > > both the load and store, the second case is where the temporary reg
> > > > isn't dead, in that case we can optimize away the store, but not the
> > > > load.  With the optimizing away of both load and store I was just
> > > > trying to do a cheap DCE there.
> >
> > I didn't realize this is an optimization, a comment would be welcome here.
>
> Ugh, except that it doesn't work.  peep2_reg_dead_p (1, operands[0])
> is not what I meant, that is always false, as the register must be live in
> between the first and second instruction.  I meant
> peep2_reg_dead_p (2, operands[0]), the register dead at the end of the
> second instruction, except we don't really support
> define_split/define_peephole2 splitting into zero instructions, DONE; in
> that case returns NULL like FAIL; does.  So, let's just wait for DCE to
> finish it up.
>
> Here is what I'll bootstrap/regtest then.  Added also
> reg_overlap_mentioned_p, in case there is e.g.
>   movl (%eax,%edx), %eax
>   movl %eax, (%eax,%edx)

I doubt this would *ever* happen, but ... OK.

> or similar and as I said earlier, explicit match_operand so that I can
> check MEM_VOLATILE_P on both MEMs.
>
> 2019-01-08  Jakub Jelinek  <ja...@redhat.com>
>
>         PR rtl-optimization/79593
>         * config/i386/i386.md (reg = mem; mem = reg): New define_peephole2.

OK for mainline.

Thanks,
Uros.

> --- gcc/config/i386/i386.md.jj  2019-01-07 23:54:54.494800693 +0100
> +++ gcc/config/i386/i386.md     2019-01-08 12:34:18.916832780 +0100
> @@ -18740,6 +18740,18 @@ (define_peephole2
>                        const0_rtx);
>  })
>
> +;; Attempt to optimize away memory stores of values the memory already
> +;; has.  See PR79593.
> +(define_peephole2
> +  [(set (match_operand 0 "register_operand")
> +        (match_operand 1 "memory_operand"))
> +   (set (match_operand 2 "memory_operand") (match_dup 0))]
> +  "!MEM_VOLATILE_P (operands[1])
> +   && !MEM_VOLATILE_P (operands[2])
> +   && rtx_equal_p (operands[1], operands[2])
> +   && !reg_overlap_mentioned_p (operands[0], operands[2])"
> +  [(set (match_dup 0) (match_dup 1))])
> +
>  ;; Attempt to always use XOR for zeroing registers (including FP modes).
>  (define_peephole2
>    [(set (match_operand 0 "general_reg_operand")
>
>
>         Jakub
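
For readers following the reg_overlap_mentioned_p point above, here is a minimal sketch of why the overlap check matters. It is a toy model, not GCC code: the machine struct, the helper names (ea, load, store, store_is_redundant, sample_state) and the 16-word memory are all assumptions made up for illustration. It compares executing "load; store" against executing the load alone, and reports whether memory ends up identical.

```c
#include <stdint.h>
#include <string.h>

/* Toy machine (hypothetical, for illustration only): three 32-bit
   registers and a small word-addressed memory. */
enum { EAX, EDX, ECX, NREGS };
enum { MEM_WORDS = 16 };

struct machine {
  uint32_t reg[NREGS];
  uint32_t mem[MEM_WORDS];
};

/* Effective address of (%base,%index), wrapped to the toy memory size. */
static uint32_t ea(const struct machine *m, int base, int index)
{
  return (m->reg[base] + m->reg[index]) % MEM_WORDS;
}

/* Models: movl (%base,%index), %dst */
static void load(struct machine *m, int dst, int base, int index)
{
  m->reg[dst] = m->mem[ea(m, base, index)];
}

/* Models: movl %src, (%base,%index) */
static void store(struct machine *m, int src, int base, int index)
{
  m->mem[ea(m, base, index)] = m->reg[src];
}

/* Returns 1 if dropping the store after the load leaves memory exactly
   as it would be after executing both instructions, 0 otherwise. */
int store_is_redundant(struct machine start, int dst, int base, int index)
{
  struct machine both = start, load_only = start;

  load(&both, dst, base, index);
  store(&both, dst, base, index);   /* address is recomputed here */
  load(&load_only, dst, base, index);

  return memcmp(both.mem, load_only.mem, sizeof both.mem) == 0;
}

/* Start state: mem[i] = 10 + i, %eax = 1, %edx = 2, %ecx = 0. */
struct machine sample_state(void)
{
  struct machine m = {{0}, {0}};

  for (int i = 0; i < MEM_WORDS; i++)
    m.mem[i] = 10u + (uint32_t) i;
  m.reg[EAX] = 1;
  m.reg[EDX] = 2;
  return m;
}
```

With %ecx as the destination, the store writes the loaded value back to the same address, so store_is_redundant (sample_state (), ECX, EAX, EDX) reports the store as removable. With %eax as the destination, the load clobbers a register used in the address, the store lands at a different location, and the same call with EAX as the destination reports that it cannot be removed, even though the two MEM expressions look textually equal.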
