> From: Aurelien Jarno [mailto:aurel...@aurel32.net] > On 2015-06-18 10:12, Pavel Dovgaluk wrote: > > > From: Aurelien Jarno [mailto:aurel...@aurel32.net] > > > On 2015-06-17 15:41, Pavel Dovgalyuk wrote: > > > > In icount mode every translation block looks as follows: > > > > > > > > if icount < n then exit > > > > icount -= n > > > > instr1 > > > > instr2 > > > > ... > > > > instrn > > > > exit > > > > > > > > When one of these instructions initiates an exception, icount should be > > > > restored and adjusted number of instructions should be subtracted from > > > > icount > > > > instead of initial n. > > > > > > > > tlb_fill function passes retaddr to raise_exception, which allows > > > > restoring > > > > current instructions in TB and correct icount calculation. > > > > > > > > When exception triggered with other function (e.g. by embedding call to > > > > exception raising helper into TB), then PC is not passed as retaddr and > > > > correct icount is not recovered. In such cases icount will be decreased > > > > by the value equal to the size of TB. > > > > > > Looking at how icount work, I see it's basically a variable in the CPU > > > state (icount_decr.u16.low), which is already accessed from the TB. > > > Couldn't we adjust it using additional code before generating an > > > exception, when in icount mode. > > > > > > For example for MIPS, we can add some code before generate_exception > > > which use the value from s->gen_opc_icount[j] to adjust > > > the variable icount_decr.u16.low. > > > > It is possible, but it will incur additional overhead, because we will > > have to update icount every time the exception might be generated. > > We'll have to update icount value before and after every helper call, > > that can cause an exception: > > > > icount -= n > > ... > > instr_k > > icount += n - k > > helper > > icount -= n - k > > ... > > > > And this overhead will slowdown the code even if no exception occur. > > That's where I might disagree. Retranslation seems a very good idea on > the paper, but in practice it doesn't seems to always bring the > performance improvement it should. In addition it seems to be highly > dependent on the target. Just to give some numbers, on MIPS (as your > patch originally concerns this architecture), 40% of code generation is > actually due to retranslation. The problem is that over the time we have > improved a lot the code generation (liveness analysis, better register > allocation, constant propagation, ...) and thus we have increased the > code generation time. While it clearly has some benefits when this code > is actually executed, it's not the case when the code is simply > retranslated. In short we spend more time to find the CPU state > corresponding to an exception than before. > ... > > All of that to say that I am worried for the performances to see more > paths through the retranslation code, especially on MIPS as it seems to > be costly. That said I haven't really look in details at other targets, > nor hosts.
I fixed syscalls, exceptions that occur without any conditions, and removed redundant calls to save_cpu_state. Then I measured the performance without enabling icount. And Linux boots even faster than with original version. I'll submit this version for review soon. > Now to come back about your patches, we might want to simply fix icount > first, even if it has some performance impact, and deal with the > retranslation issue separately, as it concerns more than just icount. Pavel Dovgalyuk