Hi,

On Mon, Feb 25, 2013 at 4:08 PM, Steven Bosscher <stevenb....@gmail.com> wrote:
> On Tue, Feb 26, 2013 at 12:32 AM, Wei Mi wrote:
>> We also take insn splitting and peephole into consideration,
>> .i.e, the cost of the change is the cost after insn splitting and
>> peephole which may be applied to the insn changed. This is useful for
>> the motivational case,
>
> It also goes against everything ever done before in RTL land.
>
> If you need early splitting like this, then these insns should
> probably be splitted earlier. And trying to apply peephole2's at this
> stage is, ehm, very creative... I'm surprised it works at all,
> peephole2 is supposed to work on strict RTL (i.e. post-reload, all
> constraints matched, hard regiters only, etc.). NB, you can't use
> peephole2 unconditionally, not all targets have them. See
> HAVE_peephole2.
>
> Can you explain step-by-step what is going on, that you need this?
>

I am not trying to actually do the transformation of split and
peephole. Just want to know how does the insn look like after the
split and peephole, then we can decide whether to do fwprop based on
more precise cost caculation.

For the motivational case
IR before fwprop:
(insn 18 17 19 2 (parallel [
            (set (reg:SI 75 [ D.2322 ])
                (and:SI (reg:SI 88 [ D.2325 ])
                    (const_int 63 [0x3f])))
            (clobber (reg:CC 17 flags))
        ]) 1.c:25 402 {*andsi_1}
     (expr_list:REG_DEAD (reg:SI 88 [ D.2325 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
(insn 22 21 23 2 (parallel [
            (set (reg:DI 91 [ D.2324 ])
                (ashift:DI (reg:DI 71 [ D.2324 ])
                    (subreg:QI (reg:SI 75 [ D.2322 ]) 0)))
            (clobber (reg:CC 17 flags))
        ]) 1.c:25 522 {*ashldi3_1}
     (expr_list:REG_DEAD (reg:DI 71 [ D.2324 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
(insn 23 22 24 2 (parallel [
            (set (reg:DI 92 [ D.2324 ])
                (lshiftrt:DI (reg:DI 91 [ D.2324 ])
                    (subreg:QI (reg:SI 75 [ D.2322 ]) 0)))
            (clobber (reg:CC 17 flags))
        ]) 1.c:25 556 {*lshrdi3_1}
     (expr_list:REG_DEAD (reg:DI 91 [ D.2324 ])
        (expr_list:REG_DEAD (reg:SI 75 [ D.2322 ])
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))

After propagation, we get
(insn 22 21 23 2 (parallel [
                (set (reg:DI 91 [ D.2324 ])
                    (ashift:DI (reg:DI 71 [ D.2324 ])
                        (subreg:QI (and:SI (reg:SI 88 [ D.2325 ])
                                (const_int 63 [0x3f])) 0)))
                (clobber (reg:CC 17 flags))
            ]) 1.c:25 518 {*ashldi3_mask}
         (expr_list:REG_DEAD (reg:DI 71 [ D.2324 ])
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil))))
(insn 23 22 24 2 (parallel [
                (set (reg:DI 92 [ D.2324 ])
                    (lshiftrt:DI (reg:DI 91 [ D.2324 ])
                        (subreg:QI (and:SI (reg:SI 88 [ D.2325 ])
                                (const_int 63 [0x3f])) 0)))
                (clobber (reg:CC 17 flags))
            ]) 1.c:25 539 {*lshrdi3_mask}
         (expr_list:REG_DEAD (reg:DI 91 [ D.2324 ])
            (expr_list:REG_DEAD (reg:SI 75 [ D.2322 ])
                (expr_list:REG_UNUSED (reg:CC 17 flags)
                    (nil)))))

But it is not a good transformation unless we know insn split will
change a << (b & 63) to a << b; Here we want to see what the rtl looks
like after insn splitting in fwprop cost estimation (We call
split_insns in estimate_split_and_peephole(), but not to do insn
splitting actually in this phase). After checking the result of
split_insns, we decide it is beneficial to do the propagation here
because insn splitting will optimize the fwprop results.

After insn splitting.
(insn 39 21 40 2 (parallel [
            (set (reg:DI 92 [ D.2324 ])
                (ashift:DI (reg:DI 71 [ D.2324 ])
                    (subreg:QI (reg:SI 88 [ D.2325 ]) 0)))
            (clobber (reg:CC 17 flags))
        ]) 1.c:25 -1
     (nil))
(insn 40 39 24 2 (parallel [
            (set (reg:DI 92 [ D.2324 ])
                (lshiftrt:DI (reg:DI 92 [ D.2324 ])
                    (subreg:QI (reg:SI 88 [ D.2325 ]) 0)))
            (clobber (reg:CC 17 flags))
        ]) 1.c:25 -1
     (nil))

Similarly I think if we can know what the insn looks like after
peephole, it will be better for fwprop. But I don't have a testcase to
prove it is better to consider peephole. Maybe I should do some
testing how much benefit we will get if consider peephole impact in
fwprop.

>> -      break;
>> +      /* Compare registers by number.  */
>> +    case REG:
>> +      return REG_P (reg) && REGNO (in) == REGNO (reg);
>
> This will not work for hard registers.
>
> FWIW, en passant you've made fwprop quadratic in the number of insns
> in a basic block, in initialize_before_estimate_peephole potentially
> calling simulate_backwards_to_point repeatedly on every insn in a
> basic block.

Yes. I may need to evaluate the benefit of considering peephole impact
in fwprop and its compilation time cost.

>
> Also: no ChangeLog, not following code style conventions, no comments,
> entire blocks of recently added code disappearing (e.g. the "Do not
> replace an existing REG_EQUAL note" stuff)...
>
> Don't mean to be too harsh, but in this form I'm not going to look at it.

Sorry, I sent it out for an early discussion, so I neglect ChangeLog,
code style...  I will fix them and add more comments. About the
REG_EQUAL and forward_propagate_asm deleted, my code hasn't dealed
with them in a good way, so I just deleted them in the patch for easy
discussion. But they will be added back soon. I was looking into the
tricky bug about REG_EQUAL you fixed not a long ago these two days to
understand how to deal with REG_EQUAL.

>
> Ciao!
> Steven

Thanks,
Wei.

Reply via email to