On Wed, Jun 13, 2012 at 5:14 AM, 陳韋任 (Wei-Ren Chen)
<che...@iis.sinica.edu.tw> wrote:
> Hi all,
>
> I suspect that guest memory accesses (qemu_ld/qemu_st) account for the
> majority of the time spent in system mode, and I would like to know
> precisely how much (if possible). We have used tools like perf [1] before,
> but since the guest memory access logic is also embedded in the host
> binary, not only in helper functions, the results cannot be relied on.
> The current idea is to add helper functions before/after the guest memory
> access logic. Taking an ARM guest on an x86_64 host as an example, should
> I add the helper functions before/after tcg_gen_qemu_{ld,st} in
> target-arm/translate.c, or tcg_out_qemu_{ld,st} in tcg/i386/tcg-target.c?
> Or is there a better way to know how much time QEMU spends on handling
> guest memory accesses?

I'm afraid there's no easy way to measure that: any change you make
to generated code will completely change the timing given that the ld/st
fast path is only a few instructions long.

Another approach might be to run the program in user mode and then
in system mode (provided the guest OS is very light), and compare the
two run times.
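As a rough illustration of how the two run times could be read, here is a
minimal sketch. The function name and the example timings are hypothetical.
The result is only an upper bound on what system mode adds: the difference
also includes guest OS work, device emulation and interrupt handling, not
just softmmu address translation.

```python
def system_mode_extra_fraction(user_seconds, system_seconds):
    """Fraction of the system-mode run time not explained by the user-mode run.

    Hypothetical helper: both arguments are wall-clock times of the same
    workload, measured by whatever means you trust.
    """
    if user_seconds <= 0 or system_seconds <= 0:
        raise ValueError("run times must be positive")
    # Clamp at zero in case measurement noise makes system mode look faster.
    extra = max(system_seconds - user_seconds, 0.0)
    return extra / system_seconds


if __name__ == "__main__":
    # Made-up timings, for illustration only.
    print(system_mode_extra_fraction(10.0, 25.0))
```

The lighter the guest OS, the closer this figure gets to the cost of the
softmmu path alone.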

As a side note, it might be interesting to gather statistics about the hit
rate of the QEMU TLB. Another thing to consider is speeding up the
fast path; see YeongKyoon Lee's RFC patch:

http://www.mail-archive.com/qemu-devel@nongnu.org/msg91294.html
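To make the TLB hit-rate idea concrete, here is a toy model of a
direct-mapped software TLB with hit and miss counters. It is not QEMU code:
the class, the page size and the table size are assumptions chosen for
illustration, and in QEMU itself the counting would have to go in the slow
path so that it does not perturb the fast path.

```python
# Toy model of a direct-mapped software TLB with hit/miss counters.
# PAGE_BITS and TLB_BITS are illustrative values, not taken from QEMU.
PAGE_BITS = 12            # assume 4 KiB pages
TLB_BITS = 8              # assume 256 entries
TLB_SIZE = 1 << TLB_BITS


class ToyTLB:
    def __init__(self):
        self.tags = [None] * TLB_SIZE   # one page number per slot
        self.hits = 0
        self.misses = 0

    def access(self, vaddr):
        """Look up vaddr; refill the slot on a miss. Returns True on a hit."""
        page = vaddr >> PAGE_BITS
        index = page & (TLB_SIZE - 1)
        if self.tags[index] == page:
            self.hits += 1
            return True
        self.tags[index] = page         # models the slow-path refill
        self.misses += 1
        return False

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def replay(addresses):
    """Feed a trace of guest virtual addresses through a fresh ToyTLB."""
    tlb = ToyTLB()
    for vaddr in addresses:
        tlb.access(vaddr)
    return tlb


if __name__ == "__main__":
    # Three accesses to one page, one to a second page, then one address
    # that maps to the same slot as the first page and evicts it.
    trace = [0x1000, 0x1004, 0x1008, 0x2000, 0x1000 + (TLB_SIZE << PAGE_BITS)]
    tlb = replay(trace)
    print(tlb.hits, tlb.misses, tlb.hit_rate())
```

A high hit rate would suggest the time goes into the fast path itself, while
a low one would point at the refill path instead.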


Laurent
