http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47764

--- Comment #3 from Carrot <carrot at google dot com> 2011-02-21 03:15:45 UTC ---
> Any ideas of how this improvement could be implemented, Carrot?

The root cause of this problem is that ARM/Thumb store instructions can't
directly store an immediate value to memory, but gcc doesn't realize this early
enough. For most of the RTL phase, the store is kept in the following form.

  (insn 41 38 42 3 (set (mem:HI (plus:SI (reg/f:SI 169)
                  (const_int 60 [0x3c])) [2 MEM[(struct deflate_state *)D.2085_3 + 60B]+0 S2 A16])
          (const_int 0 [0])) src/trees.c:45 696 {*thumb2_movhi_insn}
       (expr_list:REG_DEAD (reg/f:SI 169)
          (nil)))

Only at register allocation does gcc find the restriction on the store
instruction and split the insn into two instructions: load 0 into a register,
then store the register to memory. By then it's too late for loop optimization.

One possible method is to split this insn before loop optimization (maybe
directly in the expand pass), and let the loop and cse optimizations do the
rest. This may increase register pressure in parts of the program, so we should
rematerialize the constant in such cases.
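
To make the idea concrete, here is a rough source-level picture. It is only a
sketch and not code from the bug report: the struct and both function names are
hypothetical, and a C compiler is free to treat the two functions identically.
The first function stores the constant directly, like the (const_int 0) store
in the RTL above. The second writes out by hand what the early split would
expose: the constant sits in its own variable (standing in for a pseudo
register), so it is visibly loop-invariant.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the structure in the dump; only the 16-bit
   field being cleared matters here. */
struct ct_data {
    uint16_t freq;
    uint16_t len;
};

/* Form 1: the constant appears directly in the store, as in
   (set (mem:HI ...) (const_int 0)). */
void clear_freq_immediate(struct ct_data *tree, size_t n)
{
    for (size_t i = 0; i < n; i++)
        tree[i].freq = 0;
}

/* Form 2: hand-written picture of the split form. The constant is set up
   once before the loop, and the loop body only stores that variable. */
void clear_freq_hoisted(struct ct_data *tree, size_t n)
{
    const uint16_t zero = 0;      /* "load 0 into register", done once */
    for (size_t i = 0; i < n; i++)
        tree[i].freq = zero;      /* "store register to memory" */
}
```

Both functions leave memory in the same state. The difference is only where
the constant is set up, which is the part that needs a register on ARM/Thumb.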
