Re: [Qemu-devel] profiling qemu

Laurent Desnogues Tue, 14 Feb 2012 06:48:25 -0800

2012/2/14 Lluís Vilanova <vilan...@ac.upc.edu>:
> Artyom Tarasenko writes:
[...]
>> Here it looks like "compute_all_sub" and "compute_all_sub_xcc" are
>> good candidates for optimizing: together they take the same amount of
>> time as cpu_sparc_exec. I guess both operations would be trivial in
>> x86_64 assembly. What would be the best strategy to make TCG take
>> advantage of running on an x86_64 host?
>
> A quick look into the code reveals that these two are called from a TCG helper
> (helper_compute_psr), so I see two approaches here applicable to the most
> frequently used "sub-operations" in helper_compute_psr:
>
> * Define new simpler helpers for those sub-operations that can be declared with
>  TCG_CALL_CONST and generate the new psr/xcc values in temporary registers. You
>  must make sure any other code will still be able to use the new psr/xcc
>  values.
>
> * Reimplement these sub-operations in pure TCG code.
>
>
> But first, make sure you run a proper benchmark to establish where the
> hotspots are in the sparc code for QEMU. The problem here is to establish what a
> proper benchmark is :)


Similar helpers are used in ARM translation, so I'm not surprised
they show up (flag-setting subtract instructions are typically used
in loops).

A good strategy is indeed to generate TCG code and keep the flags
(N, Z, C, etc.) in global temps, like the other CPU registers.  This
gains a few percent in speed.
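
To illustrate why these flag computations are cheap enough to emit inline,
here is a minimal plain-C sketch of the logic for a 32-bit flag-setting
subtract.  It is not QEMU code: the names icc_flags and subcc32 are
hypothetical, and it assumes the usual SPARC convention that C is set on
borrow (src1 < src2, unsigned).  Each flag reduces to one or two ALU
operations, which map directly onto TCG ops (sub, xor, and, shift, setcond).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag container.  In the TCG approach each field would be
 * its own global temp instead of a bit packed into a single PSR word. */
typedef struct {
    bool n; /* negative: bit 31 of the result */
    bool z; /* zero: result == 0 */
    bool v; /* signed overflow */
    bool c; /* carry, assumed here to mean borrow on subtract */
} icc_flags;

/* Compute src1 - src2 and the four condition flags eagerly. */
static icc_flags subcc32(uint32_t src1, uint32_t src2, uint32_t *dst)
{
    uint32_t res = src1 - src2;
    icc_flags f;

    f.n = (res >> 31) & 1;
    f.z = (res == 0);
    /* Overflow: operands differ in sign and the result's sign differs
     * from src1's sign. */
    f.v = (((src1 ^ src2) & (src1 ^ res)) >> 31) & 1;
    /* Borrow: unsigned src1 is smaller than src2. */
    f.c = src1 < src2;

    *dst = res;
    return f;
}
```

With the flags held in separate temps, a conditional branch that only tests
Z, for example, never has to reassemble or unpack a full PSR value.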

HTH,

Laurent
