Chaos Shu writes:

> Hi
>
> I'm running SPEC CPU2006 on three kinds of situation, native aarch64 binary
> and emulator x86_64 system running SPEC CPU2006 and linux user mode level
> running x86_64 SPEC CPU2006 binary.
>
> To find where the performance lose, translator ? or execution of instruction
> after TCG? Or something else
>
> I guess most of time, up to 90% should be spent on exec the
> instruction of TCG, does that mean the quality of translating lead to
> the performance lost directly ?

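One way to frame the question is to split profile samples into three coarse
buckets: time spent translating, time spent in TCG generated code, and time
spent in helper routines. The following is only a minimal sketch of that
bookkeeping. The symbol prefixes, the rule that unresolved hex addresses are
generated code, and the sample counts are all assumptions made for
illustration, not output from a real run.

```python
from collections import Counter

# Hypothetical classification rules; adjust them to the symbols that
# actually appear in your own profile.
TRANSLATION_PREFIXES = ("tb_gen_code", "tcg_gen_code", "tcg_optimize")
HELPER_PREFIXES = ("helper_", "float32_", "float64_")


def classify(symbol):
    """Map one symbol name to a coarse bucket."""
    if symbol.startswith("0x"):
        # Assumption: addresses the profiler could not resolve belong
        # to JIT (TCG generated) code.
        return "generated"
    if symbol.startswith(TRANSLATION_PREFIXES):
        return "translation"
    if symbol.startswith(HELPER_PREFIXES):
        return "helpers"
    return "other"


def breakdown(samples):
    """Return {bucket: fraction of all samples} for (symbol, count) pairs."""
    totals = Counter()
    for symbol, count in samples:
        totals[classify(symbol)] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {bucket: n / grand_total for bucket, n in totals.items()}


# Made-up sample counts, for illustration only.
samples = [
    ("0x7f3a1c002340", 600),
    ("float64_mul", 200),
    ("helper_divsd", 50),
    ("tb_gen_code", 100),
    ("qemu_mutex_lock", 50),
]
result = breakdown(samples)
for bucket, share in sorted(result.items()):
    print(f"{bucket:12s} {share:.1%}")
```

A large "helpers" share would point at helper and softfloat cost rather than
at the quality of the translated code itself.
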
It really depends on the type of code you are executing, but yes, most of
the time should be spent in TCG-generated code. However, if you are
running a lot of FP-heavy code, you'll find it spends a lot of time in
helper routines calling the internal softfloat code.

I posted some patches a few months ago that enabled output to help the
Linux "perf" tool track this. I haven't got time to rework them at the
moment, but they might give you a head start on instrumentation:

  https://patches.linaro.org/27229/

>
> Thanks
> Chaos
>
> On 29.05.2014 13:04, Peter Maydell wrote:
>> No, we don't in general have any benchmarking of TCG codegen. I think
>> if we did do benchmarking we'd be interested in performance
>> benchmarking -- code expansion ratio doesn't seem like a very
>> interesting thing to measure to me.
>
> Hi,
>
> I have a plan to play with TCG performance benchmarking. And then try to
> implement some optimizations. So maybe there would be some suggestions on how
> to perform such benchmarking? What tests seems to be appropriate for this
> task? I think the benchmarking should reflect real TCG use cases. So what the
> most typical use cases for TCG are there? Seems that system and user modes
> may be different from this point.
>
> Appreciate any help.
>
> Thanks,
> Sergey.

-- 
Alex Bennée