Martok wrote:
> a:= CurrentHash[0]; b:= CurrentHash[1]; c:= CurrentHash[2]; d:= CurrentHash[3];
> 0000000100074943 488b8424a0020000 mov 0x2a0(%rsp),%rax
> 000000010007494B 4c8b5038         mov 0x38(%rax),%r10
> 000000010007494F 488b8424a0020000 mov 0x2a0(%rsp),%rax
> 0000000100074957 4c8b5840         mov 0x40(%rax),%r11
> 000000010007495B 488b9424a0020000 mov 0x2a0(%rsp),%rdx
> 0000000100074963 488b4248         mov 0x48(%rdx),%rax
> 0000000100074967 488b9424a0020000 mov 0x2a0(%rsp),%rdx
> 000000010007496F 488b6a50         mov 0x50(%rdx),%rbp
>
> Every single one of the "mov 0x2a0(%rsp), %rxx" instructions except the first
> is redundant and causes another memory round-trip. At the same time, more
> registers are used, which probably makes other optimizations more difficult,
> especially when something similar happens on i386.
>
> Now, the fun part: I haven't been able to build a simple test that causes the
> same issue (the self-pointer already is in %rcx and not fetched from the stack
> each time), so I have a feeling this may be a side effect of some other part
> of the code.

It's called register spilling: once there are no registers left to hold values, the compiler has to pick registers whose values will be kept in memory instead. Register allocation is an NP-complete problem, so the result will never be 100% optimal (at least if you don't want to wait forever while the compiler checks all possible assignments).

One possible heuristic, which is the one used by FPC's register allocator, is to spill the register that conflicts with the largest number of other registers (to minimise the number of registers spilled to memory). There are techniques for spilling more effectively (e.g. live range splitting), and there are also other kinds of optimisations that could be run after register allocation to improve the code. CSE at the assembler level could be used in this case.

That's a very complex undertaking for relatively little gain though. E.g. those memory loads are probably optimised by the processor itself (not necessarily even coming from the L1 cache, but possibly from the write-back buffer).

Jonas
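
To make the spilling heuristic above concrete, here is a minimal sketch of a graph-colouring allocator that spills the node with the most conflicts when it gets stuck. It is written in Python for brevity and is not FPC's actual code: the function name `allocate`, the toy interference graph, and the register count of 3 are all made up for illustration. The graph is only loosely modelled on the quoted snippet (a `self` pointer that is live across the loads of `a` to `d`).

```python
from typing import Dict, List, Set, Tuple


def allocate(interference: Dict[str, Set[str]],
             num_regs: int) -> Tuple[Dict[str, int], List[str]]:
    """Colour an interference graph with num_regs colours, spilling when stuck.

    Returns (assignment, spilled): a register number for each coloured
    node, and the list of nodes that were sent to memory instead.
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {v: set(n) for v, n in interference.items()}
    stack: List[str] = []
    spilled: List[str] = []

    while graph:
        # Simplify: a node with fewer than num_regs remaining neighbours
        # can always be coloured later, so set it aside.
        easy = [v for v in sorted(graph) if len(graph[v]) < num_regs]
        if easy:
            victim = easy[0]
            stack.append(victim)
        else:
            # Stuck: spill the node that conflicts with the most others
            # (ties are broken alphabetically to keep the run deterministic).
            victim = max(sorted(graph), key=lambda v: len(graph[v]))
            spilled.append(victim)
        for neighbour in graph.pop(victim):
            graph[neighbour].discard(victim)

    # Select: colour nodes in reverse removal order, using the lowest
    # register not already taken by a coloured neighbour.
    assignment: Dict[str, int] = {}
    while stack:
        v = stack.pop()
        used = {assignment[n] for n in interference[v] if n in assignment}
        assignment[v] = min(r for r in range(num_regs) if r not in used)
    return assignment, spilled


# Hypothetical graph: `self` conflicts with all four values, which in turn
# conflict with their neighbours in a ring.
toy_graph = {
    "self": {"a", "b", "c", "d"},
    "a": {"self", "b", "d"},
    "b": {"self", "a", "c"},
    "c": {"self", "b", "d"},
    "d": {"self", "a", "c"},
}

assignment, spilled = allocate(toy_graph, num_regs=3)
print("spilled:", spilled)
print("registers:", assignment)
```

With three registers no node in this toy graph can be coloured trivially, so the node with the most conflicts (`self`) is the one sent to memory, and every later use has to reload it. A real allocator then inserts the reloads and re-runs allocation, which this sketch leaves out.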