On Thu, Sep 27, 2012 at 04:28:47PM -0700, Richard Henderson wrote:
> On 09/27/2012 04:20 PM, Aurelien Jarno wrote:
> > I understand that we can't easily insert an instruction, so the
> > limitation comes from here, but is it really something happening often?
>
> It will certainly appear sometimes.  E.g. s390x has an add immediate
> instruction that does exactly: r1 += imm16 << 32.
>
> Or did you mean specifically the full constant being folded?  That
> would happen quite a bit more often.  That you can see with most any
> 64-bit RISC guest when they attempt to generate a constant from
> addition primitives instead of logical primitives.
>
> For a 32-bit host, we've already decomposed logical primitives to 32-bit
> operations.  And we can constant-fold through all of those.  But when
> addition comes into play, we can't constant-fold through add2.
>
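
For reference, here is a minimal sketch in C of what folding a fully constant add2 amounts to on a 32-bit host. It is not the code from the patch: `fold_add2_i32`, `struct folded_pair` and `fold_add2_selfcheck` are hypothetical names, and I am assuming the usual add2 semantics of a 64-bit addition whose operands and result are each split into a low and a high 32-bit half.

```c
#include <assert.h>
#include <stdint.h>

/* The two 32-bit constants that would replace a fully constant add2. */
struct folded_pair {
    uint32_t lo;
    uint32_t hi;
};

/*
 * Hypothetical helper (not QEMU's actual optimizer code): fold
 * (rl, rh) = (al, ah) + (bl, bh) when all four inputs are known constants.
 */
static struct folded_pair fold_add2_i32(uint32_t al, uint32_t ah,
                                        uint32_t bl, uint32_t bh)
{
    uint64_t a = ((uint64_t)ah << 32) | al;
    uint64_t b = ((uint64_t)bh << 32) | bl;
    uint64_t r = a + b;  /* unsigned arithmetic wraps modulo 2^64 */
    struct folded_pair out = { (uint32_t)r, (uint32_t)(r >> 32) };
    return out;
}

/* The carry out of the low half must propagate into the high half. */
static void fold_add2_selfcheck(void)
{
    struct folded_pair p = fold_add2_i32(0xffffffffu, 0x1u, 0x1u, 0x0u);
    assert(p.lo == 0x0u && p.hi == 0x2u);
}
```

The point of the sketch is that one add2 folds into two independent 32-bit constants, so replacing it needs two ops where there was only one, which is where the "can't easily insert an instruction" limitation shows up.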
I tried this patch on an i386 host running an x86_64 target, but it
fails even to start SeaBIOS, so there is probably a logic error
somewhere in the patch. Among the first add2 ops, which seemed to have
been handled correctly, this patch optimized 0.2% of them. I am not
sure it is worth it as is.

I think optimizing add2, and in general all *2 ops, is a good idea, but
we should be able to do more aggressive optimization. Maybe, a bit like
Blue was suggesting, add2 should always be followed by a nop, so we can
do more optimizations?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net