Reflect recent changes to the API (inline ops) and new plugins.

Signed-off-by: Pierrick Bouvier <pierrick.bouv...@linaro.org>
---
 docs/devel/tcg-plugins.rst | 101 +++++++++++++++++++++++--------------
 1 file changed, 63 insertions(+), 38 deletions(-)

diff --git a/docs/devel/tcg-plugins.rst b/docs/devel/tcg-plugins.rst
index 954623f9bf1..5d2ebb92977 100644
--- a/docs/devel/tcg-plugins.rst
+++ b/docs/devel/tcg-plugins.rst
@@ -29,8 +29,8 @@ Once built a program can be run with multiple plugins loaded each with
 their own arguments::
 
   $QEMU $OTHER_QEMU_ARGS \
-      -plugin contrib/plugin/libhowvec.so,inline=on,count=hint \
-      -plugin contrib/plugin/libhotblocks.so
+      -plugin contrib/plugins/libhowvec.so,inline=on,count=hint \
+      -plugin contrib/plugins/libhotblocks.so
 
 Arguments are plugin specific and can be used to modify their
 behaviour. In this case the howvec plugin is being asked to use inline
@@ -41,6 +41,14 @@ Linux user-mode emulation also evaluates the environment variable
 
   QEMU_PLUGIN="file=contrib/plugins/libhowvec.so,inline=on,count=hint" $QEMU
 
+QEMU plugins avoid writing directly to stdout/stderr, and use the log provided
+by the API (see function ``qemu_plugin_outs``).
+To show output, you may use this additional parameter::
+
+  $QEMU $OTHER_QEMU_ARGS \
+    -d plugin \
+    -plugin contrib/plugins/libhowvec.so,inline=on,count=hint
+
 Writing plugins
 ---------------
 
@@ -93,11 +101,14 @@ translation event the plugin has an option to enumerate the
 instructions in a block of instructions and optionally register
 callbacks to some or all instructions when they are executed.
 
-There is also a facility to add an inline event where code to
-increment a counter can be directly inlined with the translation.
-Currently only a simple increment is supported. This is not atomic so
-can miss counts. If you want absolute precision you should use a
-callback which can then ensure atomicity itself.
+There is also a facility to add inline instructions doing various operations,
+like adding or storing an immediate value. It is also possible to execute a
+callback conditionally, with condition being evaluated inline. All those inline
+operations are associated with a ``scoreboard``, which is a thread-local storage
+automatically expanded when new cores/threads are created and that can be
+accessed/modified in a thread-safe way without any lock needed. Combining inline
+operations and conditional callbacks offers a more efficient way to instrument
+binaries, compared to classic callbacks.
 
 Finally when QEMU exits all the registered *atexit* callbacks are
 invoked.
@@ -117,9 +128,9 @@ However the following assumptions can be made:
 Translation Blocks
 ++++++++++++++++++
 
-All code will go through a translation phase although not all
-translations will be necessarily be executed. You need to instrument
-actual executions to track what is happening.
+All code will go through a translation phase although not all translations will
+necessarily be executed. You need to instrument actual executions to track what
+is happening.
 
 It is quite normal to see the same address translated multiple times.
 If you want to track the code in system emulation you should examine
@@ -135,13 +146,12 @@ change control flow mid-block.
 Instructions
 ++++++++++++
 
-Instruction instrumentation runs before the instruction executes. You
-can be can be sure the instruction will be dispatched, but you can't
-be sure it will complete. Generally this will be because of a
-synchronous exception (e.g. SIGILL) triggered by the instruction
-attempting to execute. If you want to be sure you will need to
-instrument the next instruction as well. See the ``execlog.c`` plugin
-for examples of how to track this and finalise details after execution.
+Instruction instrumentation runs before the instruction executes. You can be
+sure the instruction will be dispatched, but you can't be sure it will complete.
+Generally this will be because of a synchronous exception (e.g. SIGILL)
+triggered by the instruction attempting to execute. If you want to be sure you
+will need to instrument the next instruction as well. See the ``execlog.c``
+plugin for examples of how to track this and finalise details after execution.
 
 Memory Accesses
 +++++++++++++++
@@ -200,12 +210,12 @@ encouraged to contribute your own plugins plugins upstream. There is a
 basic plugins that are used to test and exercise the API during the
 ``make check-tcg`` target in ``tests\plugins``.
 
-- tests/plugins/empty.c
+- tests/plugin/empty.c
 
 Purely a test plugin for measuring the overhead of the plugins system
 itself. Does no instrumentation.
 
-- tests/plugins/bb.c
+- tests/plugin/bb.c
 
 A very basic plugin which will measure execution in course terms as
 each basic block is executed. By default the results are shown once
@@ -220,14 +230,13 @@ Behaviour can be tweaked with the following arguments:
 
   * inline=true|false
 
-  Use faster inline addition of a single counter. Not per-cpu and not
-  thread safe.
+  Use faster inline addition of a single counter.
 
   * idle=true|false
 
   Dump the current execution stats whenever the guest vCPU idles
 
-- tests/plugins/insn.c
+- tests/plugin/insn.c
 
 This is a basic instruction level instrumentation which can count the
 number of instructions executed on each core/thread::
@@ -250,8 +259,7 @@ Behaviour can be tweaked with the following arguments:
 
   * inline=true|false
 
-  Use faster inline addition of a single counter. Not per-cpu and not
-  thread safe.
+  Use faster inline addition of a single counter.
 
   * sizes=true|false
 
@@ -267,18 +275,18 @@ Behaviour can be tweaked with the following arguments:
       -d plugin ./tests/tcg/aarch64-linux-user/sha512-vector
   ...
   0x40069c, 'bl #0x4002b0', 10 hits, 1093 match hits, Δ+1257 since last match, 98 avg insns/match
-    0x4006ac, 'bl #0x403690', 10 hits, 1094 match hits, Δ+47 since last match, 98 avg insns/match
-    0x4037fc, 'bl #0x4002b0', 18 hits, 1095 match hits, Δ+22 since last match, 98 avg insns/match
-    0x400720, 'bl #0x403690', 10 hits, 1096 match hits, Δ+58 since last match, 98 avg insns/match
-    0x4037fc, 'bl #0x4002b0', 19 hits, 1097 match hits, Δ+22 since last match, 98 avg insns/match
-    0x400730, 'bl #0x403690', 10 hits, 1098 match hits, Δ+33 since last match, 98 avg insns/match
-    0x4037ac, 'bl #0x4002b0', 12 hits, 1099 match hits, Δ+20 since last match, 98 avg insns/match
+  0x4006ac, 'bl #0x403690', 10 hits, 1094 match hits, Δ+47 since last match, 98 avg insns/match
+  0x4037fc, 'bl #0x4002b0', 18 hits, 1095 match hits, Δ+22 since last match, 98 avg insns/match
+  0x400720, 'bl #0x403690', 10 hits, 1096 match hits, Δ+58 since last match, 98 avg insns/match
+  0x4037fc, 'bl #0x4002b0', 19 hits, 1097 match hits, Δ+22 since last match, 98 avg insns/match
+  0x400730, 'bl #0x403690', 10 hits, 1098 match hits, Δ+33 since last match, 98 avg insns/match
+  0x4037ac, 'bl #0x4002b0', 12 hits, 1099 match hits, Δ+20 since last match, 98 avg insns/match
   ...
 
 For more detailed execution tracing see the ``execlog`` plugin for
 other options.
 
-- tests/plugins/mem.c
+- tests/plugin/mem.c
 
 Basic instruction level memory instrumentation::
 
@@ -291,8 +299,7 @@ Behaviour can be tweaked with the following arguments:
 
   * inline=true|false
 
-  Use faster inline addition of a single counter. Not per-cpu and not
-  thread safe.
+  Use faster inline addition of a single counter.
 
   * callback=true|false
 
@@ -302,7 +309,7 @@ Behaviour can be tweaked with the following arguments:
 
   Count IO accesses (only for system emulation)
 
-- tests/plugins/syscall.c
+- tests/plugin/syscall.c
 
 A basic syscall tracing plugin. This only works for user-mode. By
 default it will give a summary of syscall stats at the end of the
@@ -332,6 +339,11 @@ run::
   160 1 0
   135 1 0
 
+- tests/plugin/inline.c
+
+This plugin is used for testing all inline operations, conditional callbacks and
+scoreboard. It prints a per-cpu summary of all events.
+
 - contrib/plugins/hotblocks.c
 
 The hotblocks plugin allows you to examine the where hot paths of
@@ -342,9 +354,6 @@ with linux-user execution as system emulation tends to generate
 re-translations as blocks from different programs get swapped in and
 out of system memory.
 
-If your program is single-threaded you can use the ``inline`` option for
-slightly faster (but not thread safe) counters.
-
 Example::
 
   $ qemu-aarch64 \
@@ -462,7 +471,6 @@ people at roughly where execution diverges. The only argument you need
 for the plugin is a path for the socket the two instances will
 communicate over::
 
-
   $ qemu-system-sparc -monitor none -parallel none \
     -net none -M SS-20 -m 256 -kernel day11/zImage.elf \
     -plugin ./contrib/plugins/liblockstep.so,sockpath=lockstep-sparc.sock \
@@ -664,6 +672,23 @@ The plugin will log the reason of exit, for example::
 
   0xd4 reached, exiting
 
+- contrib/plugins/ips.c
+
+This plugin can limit the number of Instructions Per Second that are executed::
+
+  # get number of instructions
+  $ num_insn=$(./build/qemu-x86_64 -plugin ./build/tests/plugin/libinsn.so -d plugin /bin/true |& grep total | sed -e 's/.*: //')
+  # limit speed to execute in 10 seconds
+  $ time ./build/qemu-x86_64 -plugin ./build/contrib/plugins/libips.so,ips=$(($num_insn/10)) /bin/true
+  real 10.000s
+
+Options:
+
+  * ips=N
+
+  Maximum number of instructions per cpu that can be executed in one second.
+  The plugin will sleep when the given number of instructions is reached.
+
 Plugin API
 ==========
 
-- 
2.39.2
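
Note on the scoreboard paragraph added by this patch: the lock-free property it describes can be illustrated with a small stand-alone model. This is only a sketch of the idea (one slot per vCPU, each vCPU touching only its own slot, a reduction at exit). It is not QEMU code, and every name in it (``model_scoreboard``, ``model_inline_add``, and so on) is hypothetical and not part of the plugin API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scoreboard: one counter slot per vCPU. */
typedef struct {
    uint64_t *slots;
    size_t n_vcpus;
} model_scoreboard;

/* Allocate zeroed slots; returns 0 on success, -1 on allocation failure. */
static int model_scoreboard_init(model_scoreboard *sb, size_t n_vcpus)
{
    sb->slots = calloc(n_vcpus, sizeof(*sb->slots));
    sb->n_vcpus = sb->slots ? n_vcpus : 0;
    return sb->slots ? 0 : -1;
}

/*
 * Models an inline "add immediate" operation: only the slot of the
 * executing vCPU is modified, so no lock is needed as long as each
 * vCPU writes to its own index.
 */
static void model_inline_add(model_scoreboard *sb, size_t vcpu, uint64_t imm)
{
    sb->slots[vcpu] += imm;
}

/* Models the reduction a plugin would typically perform at exit. */
static uint64_t model_scoreboard_sum(const model_scoreboard *sb)
{
    uint64_t total = 0;
    for (size_t i = 0; i < sb->n_vcpus; i++) {
        total += sb->slots[i];
    }
    return total;
}

static void model_scoreboard_free(model_scoreboard *sb)
{
    free(sb->slots);
    sb->slots = NULL;
    sb->n_vcpus = 0;
}
```

In this model a per-vCPU summary is simply the content of each slot, and a global total is the sum over all slots. The real API additionally grows the storage when new vCPUs appear, which the sketch does not attempt to reproduce.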