Hi all,

I suspect that guest memory access (qemu_ld/qemu_st) accounts for the majority of the time spent in system mode, and I would like to know precisely how much, if possible. We have tried tools like perf [1], but the guest memory access logic is also embedded inline in the host binary, not only in helper functions, so the result cannot be relied upon.

The current idea is to add helper functions before and after the guest memory access logic. Taking an ARM guest on an x86_64 host as an example, should I add the helper functions before/after tcg_gen_qemu_{ld,st} in target-arm/translate.c, or before/after tcg_out_qemu_{ld,st} in tcg/i386/tcg-target.c? Or is there a better way to find out how much time QEMU spends handling guest memory access?
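To make the idea concrete, here is a minimal sketch of the kind of helper pair I have in mind. The names mem_prof_begin/mem_prof_end and the counters are hypothetical, not existing QEMU functions, and the sketch assumes a POSIX host with clock_gettime(). It only shows the accounting, not how the calls would be emitted around the load/store code.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical counters; nothing here is an existing QEMU API. */
static uint64_t mem_prof_total_ns;  /* time accumulated between begin/end */
static uint64_t mem_prof_count;     /* number of completed begin/end pairs */
static uint64_t mem_prof_start_ns;  /* timestamp taken by the last begin */

/* Read a monotonic clock and convert it to nanoseconds. */
static uint64_t mem_prof_now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Would be called just before the guest memory access logic. */
void mem_prof_begin(void)
{
    mem_prof_start_ns = mem_prof_now_ns();
}

/* Would be called just after the guest memory access logic. */
void mem_prof_end(void)
{
    mem_prof_total_ns += mem_prof_now_ns() - mem_prof_start_ns;
    mem_prof_count++;
}
```

One concern with this approach is that the two clock reads may cost more than the fast path of a single guest load/store, so the measured total would include a large share of instrumentation overhead.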
Any suggestions or comments are welcome. Thanks!

Regards,
chenwj

[1] https://perf.wiki.kernel.org/index.php/Main_Page

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj