> Jan Hubicka wrote:
>
> > I thought the comment was more referring to the fact that we will
> > happily generate
> >   movl $0x0, place1
> >   movl $0x0, place2
> >   ...
> >   movl $0x0, placeMillion
> >
> > rather than the shorter
> >   xor %eax, %eax
> >   movl %eax, ...
>
> Yes, that would be an improvement, but, as you say, at some point we
> want to call memset.
>
> > With the repeated mov issue unfortunately I don't know what would be the
> > best place: we obviously don't want to constrain register allocation too
> > much and after regalloc I guess only machine dependent pass
>
> I would hope that we could notice this much earlier than that. Wouldn't
> this be evident even at the tree level or at least after
> stack-allocation in the RTL layer? I wouldn't expect the zeroing to be
> coming from machine-dependent code.
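
To put rough numbers on the two sequences quoted above, here is a
back-of-the-envelope sketch in C. It assumes 32-bit code and stores
addressed with an 8-bit displacement off %ebp (e.g. -8(%ebp)); other
addressing modes change the constants but not the shape of the comparison.
The function names are made up for illustration and are not anything in GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encoded instruction sizes in bytes. Assumption: 32-bit code and a
   disp8(%ebp) memory operand for every store. */
enum {
    STORE_IMM32_BYTES = 7, /* movl $0x0, disp8(%ebp): opcode, ModRM, disp8, imm32 */
    STORE_REG_BYTES   = 3, /* movl %eax, disp8(%ebp): opcode, ModRM, disp8 */
    XOR_REG_BYTES     = 2  /* xorl %eax, %eax: opcode, ModRM */
};

/* Code size of n stores that each carry their own 32-bit zero immediate. */
size_t zero_stores_imm_bytes(size_t n)
{
    assert(n <= SIZE_MAX / STORE_IMM32_BYTES); /* guard against overflow */
    return n * STORE_IMM32_BYTES;
}

/* Code size of one xor that zeroes %eax, followed by n stores from %eax. */
size_t zero_stores_reg_bytes(size_t n)
{
    assert(n <= (SIZE_MAX - XOR_REG_BYTES) / STORE_REG_BYTES);
    return XOR_REG_BYTES + n * STORE_REG_BYTES;
}
```

Under these assumptions the register form is already smaller for a single
store (5 bytes against 7), and the gap grows by 4 bytes with every
additional store. This only counts code size; it says nothing about
register pressure, which is the reason for not wanting to decide this
before register allocation.
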

What I meant is the generic problem that constants on i386 (especially in
moves) enlarge the instruction encoding. So when multiple copies of the same
constant appear in the instruction stream and a register is available, one
can add an extra move that loads the constant once and use that register
instead.

Of course, we can also have a pass that detects large sets of unwound mov
instructions and packs them into a memset. We can do it either at the early
RTL level or with some lowering of initializers at the tree level (I guess
many of those sequences actually come from expanding initializers, which are
sort of black boxes for most tree optimizers).

A somewhat similar transformation is done by Tomas, who uses the vectorizer
infrastructure to detect loops doing memset/memcpy. Those are pretty common,
especially for floats/doubles, and after unrolling they also lead to such
sequences. I hope he will polish and send the patch soonish.

Honza

> One possibility is that we're doing something dumb with arrays. Another
> possibility is that we're SRA-ing a lot of small structures, which add
> up to a ton of stack space.
>
> I realize that we need a full bug report to be sure, though.
>
> --
> Mark Mitchell
> CodeSourcery
> [EMAIL PROTECTED]
> (650) 331-3385 x713
