Hi,all
I recently digged some source code of Jitrino.OPT and checked its output
with some simple programs. I find there are many ldvar/stvar instructions
in the dumped HIR of SSA form. They are essentially copy instructions and
cannot be eliminated due to the restriction of current SSA destruction
algorithm, which requires that live ranges of SSA variables shall not
overlap. Negatives of some of them can be eliminated by some phases (copy
propagation on LIR, coalescing of RA etc.), but others, like the following
example will remain in the final code. For complex real programs, this
problem may be more serious (resulting in more unnecessary MOV instructions
and higher register pressure). The following example may help to clarify
this problem.
Test.java:
public class Test
{
public static void main (String args[])
{
int i ,sum = 0, n = Integer.parseInt (args[0]);
int a[] = new_int (n);
for (i = 0; i < n; i++)
sum = sum + a[i];
System.out.println ("sum=" + sum);
}
public static int[] new_int (int n)
{
int a[] = new int[n];
for (int i = 0; i < n; i++)
a[i] = i;
return a;
}
}
Compiled with OpenJDK-1.6.0
Harmony revision: 761593
Command line for Harmony:
java -Xem:server_static -XX:jit.p.filter=.main -XX:jit.p.arg.log=irdump Test
100
In ct.log, the final generated code contains the following fragment:
BB_17
PersistentId = 46
ExecCnt = 8100
Loop: Depth=1, !hdr, hdr=BB_16
Predcessors: BB_16
Successors: BB_52 [Prob=1e-07](Br=I125) BB_16 [Prob=1](backedge)(Br=I125)
Layout Succ: BB_52 Block code address: 0xe9a197
* 0xe9a197 I299: MOV s123(EBP):I_32,v2(EBX):I_32
0xe9a199 I85: (ID:s12(EFLGS):U_32) =ADD
s123(EBP):I_32,t121[v25(EAX)+v0(EDI)*t119(4)+t42(12)]:I_32
* 0xe9a19d I298: MOV v2(EBX):I_32,s123(EBP):I_32
0xdeadbeef I169: (AD:s125(EDI):I_32) =CopyPseudoInst/MOV
(AU:v0(EDI):I_32)
0xe9a19f I87: (ID:s12(EFLGS):U_32) =ADD s125(EDI):I_32,t124(1):I_32
0xdeadbeef I88: (AD:v0(EDI):I_32) =CopyPseudoInst/MOV
(AU:s125(EDI):I_32)
0xe9a1a5 I124: (ID:s12(EFLGS):U_32) =CMP
t206[t203(ESI)+t212(4)]:I_32,t207(0):I_32
0xe9a1ac I125: JZ BB_16 t279(-27):I_8 (IU:s12(EFLGS):U_32)
The two MOV instructions marked with '*' are not necessary. They also
increase register pressure by one (EBP). Before hir2lir, the corresponding
HIR is
Block L49:
Predecessors: L50 L74
Successors: L3 L50
I249:L49: bcmap:unknown
* I250:ldvar v5 -) t144:I_32
I251:if cge.i4 t144, g16 goto L3
GOTO L50
Block L50:
Predecessors: L49
Successors: L49
I252:L50: bcmap:22
I253:ldvar v1 -) t145:I_32
I254:taupoint() -) t146:tau
I256:addindex g100, t144 -) t148:ref:I_32
I257:ldind.unc:i4 [t148] ((g84,t146)) -) t149:I_32
I258:add t145, t149 -) t150:I_32
I259:stvar t150 -) v1:I_32
I261:add t144, g76 -) t152:I_32
* I265:stvar t152 -) v5:I_32
GOTO L49
The two lines marked with '*' are the source of the unnecessary MOV
instructions. If t152, t144 and v5 are coalesced to one variable, the
back-end will avoid the problem immediately. Continue to trace the HIR
backward, this code is resulted by the restriction of SSA destruction
algorithm (no live range overlaps). We can see many ldvar/stvar in all
phases working on SSA form.
I think we should improve the SSA destruction algorithm according to [1] to
remove this restriction. This will enable a more complete copy propagation
and many other aggressive optimizations. We can also treat all SSA
variables equally (able to directly access) rather than treat those derived
from original variables specially (must access through ldvar/stvar).
Reference:
[1] Preston Briggs, Keith D. Cooper, Timothy J. Harvey, L. Taylor
Simpson. Practical Improvements to the Construction and Destruction of
Static Single Assignment Form. SOFTWARE|PRACTICE AND EXPERIENCE, VOL.
28(8), 1{28 (July 1998)
Best Regards,
Jiutao