tcg: add test for qemu_plugin_set_pc API

Florian Hofhammer Thu, 26 Feb 2026 00:31:56 -0800

On 25/02/2026 18:30, Pierrick Bouvier wrote:
> On 2/25/26 8:21 AM, Florian Hofhammer wrote:
>> On 24/02/2026 16:52, Florian Hofhammer wrote:
>>> The test executes a non-existent syscall, which the syscall plugin
>>> intercepts and redirects to a clean exit.
>>> Due to architecture-specific quirks, the architecture-specific Makefiles
>>> require setting specific compiler and linker flags in some cases.
>>>
>>> Signed-off-by: Florian Hofhammer <[email protected]>
>>> ---
>>>   tests/tcg/arm/Makefile.target                 |  6 +++++
>>>   tests/tcg/hexagon/Makefile.target             |  7 +++++
>>>   tests/tcg/mips/Makefile.target                |  6 ++++-
>>>   tests/tcg/mips64/Makefile.target              | 15 +++++++++++
>>>   tests/tcg/mips64el/Makefile.target            | 15 +++++++++++
>>>   tests/tcg/mipsel/Makefile.target              | 15 +++++++++++
>>>   tests/tcg/multiarch/Makefile.target           | 22 ++++++++++++++--
>>>   .../{ => plugin}/check-plugin-output.sh       |  0
>>>   .../{ => plugin}/test-plugin-mem-access.c     |  0
>>>   .../plugin/test-plugin-skip-syscalls.c        | 26 +++++++++++++++++++
>>>   tests/tcg/plugins/syscall.c                   |  6 +++++
>>>   tests/tcg/sparc64/Makefile.target             | 16 ++++++++++++
>>>   12 files changed, 131 insertions(+), 3 deletions(-)
>>>   create mode 100644 tests/tcg/mips64/Makefile.target
>>>   create mode 100644 tests/tcg/mips64el/Makefile.target
>>>   create mode 100644 tests/tcg/mipsel/Makefile.target
>>>   rename tests/tcg/multiarch/{ => plugin}/check-plugin-output.sh (100%)
>>>   rename tests/tcg/multiarch/{ => plugin}/test-plugin-mem-access.c (100%)
>>>   create mode 100644 tests/tcg/multiarch/plugin/test-plugin-skip-syscalls.c
>>>   create mode 100644 tests/tcg/sparc64/Makefile.target
>>>
>>> diff --git a/tests/tcg/multiarch/plugin/test-plugin-skip-syscalls.c b/tests/tcg/multiarch/plugin/test-plugin-skip-syscalls.c
>>> new file mode 100644
>>> index 0000000000..1f5cbc3851
>>> --- /dev/null
>>> +++ b/tests/tcg/multiarch/plugin/test-plugin-skip-syscalls.c
>>> @@ -0,0 +1,26 @@
>>> +/*
>>> + * SPDX-License-Identifier: GPL-2.0-or-later
>>> + *
>>> + * This test attempts to execute an invalid syscall. The syscall test plugin
>>> + * should intercept this.
>>> + */
>>> +#include <stdint.h>
>>> +#include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <unistd.h>
>>> +
>>> +void exit_success(void) __attribute__((section(".redirect"), noinline,
>>> +                                       noreturn, used));
>>> +
>>> +void exit_success(void) {
>>> +    _exit(EXIT_SUCCESS);
>>> +}
>>> +
>>> +int main(int argc, char *argv[]) {
>>> +    long ret = syscall(0xc0deUL);
>>> +    if (ret != 0L) {
>>> +        perror("");
>>> +    }
>>> +    /* We should never get here */
>>> +    return EXIT_FAILURE;
>>> +}
>>
>> I'm running into an issue for all four variants of MIPS if I don't
>> hardcode the section but pass the function address as a syscall argument
>> and then use that as jump target in the plugin: according to the ABI,
>> the t9 register has to contain the address of the function being called.
>> The function prologue then calculates the gp register value (global
>> pointer / context pointer) based on t9, and derives the new values of t9
>> for any callees from gp again. As I'm currently just updating the pc
>> with the new API function, t9 is out of sync with the code after control
>> flow redirection and the binary crashes.
>>
>> I think it is fair to expect a user of the API to be aware of such
>> pitfalls (or we can document it), but I'd of course still like to make
>> the tests pass. The simplest solution (theoretically) is to also set the
>> t9 register in the plugin callback before calling qemu_plugin_set_pc.
>> However, the MIPS targets do not actually expose any registers to
>> plugins, i.e., qemu_plugin_get_registers returns an empty GArray.
>>
> 
> A lot of things can go wrong when jumping to a different context than the 
> current one, that's why setjmp/longjmp exist. You can always add a comment 
> for this "This function only changes pc and does not guarantee other 
> registers representing context will have a proper value when updating it".
> 
> A reason why I pushed for using value labels, was to stay in the same 
> context, and avoid the kind of issues you ran into.
> 
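For illustration only (not part of the patch): a minimal sketch of the
setjmp/longjmp pattern mentioned above, which saves an execution context
and later returns to it instead of only changing the pc. The names
saved_context, bail_out and run_with_recovery are hypothetical.

```c
#include <setjmp.h>

/* Hypothetical buffer holding the execution context saved by setjmp(). */
static jmp_buf saved_context;

/* Hypothetical stand-in for code that abandons the current path. */
static void bail_out(void)
{
    /* Restores the context saved by setjmp(); setjmp() then returns 1. */
    longjmp(saved_context, 1);
}

/* Returns 1 if control came back through longjmp(), 0 otherwise. */
static int run_with_recovery(int should_jump)
{
    if (setjmp(saved_context) != 0) {
        /* Reached via longjmp(), running in the saved context again. */
        return 1;
    }
    if (should_jump) {
        bail_out();
    }
    return 0;
}
```

Calling run_with_recovery(1) takes the longjmp path, and
run_with_recovery(0) returns normally. A bare pc update has no
equivalent of the saved context.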
>> Given this behavior, I see two solutions:
>> 1) skipping the test on MIPS, or
>> 2) making the test code a bit more contrived to use labels within the
>>     same function while preventing the compiler from optimizing the
>>     labels away (which it does even with -O0). I've got a prototype for
>>     this, but the test code looks a bit contrived then:
>>
>>     int main(int argc, char *argv[]) {
>>         int retvals[] = {EXIT_SUCCESS, EXIT_FAILURE};
>>         int retval_idx = 1;
>>
>>         long ret = syscall(0xc0deUL, &&good);
>>         if (ret < 0) {
>>             perror("");
>>             goto bad;
>>         } else if (ret == 0xdeadbeefUL) {
>>             /*
>>              * Should never arrive here but we need this nevertheless to prevent
>>              * the compiler from optimizing away the label. Otherwise, the compiler
>>              * silently rewrites the label value used in the syscall to another
>>              * address (typically pointing to right after the function prologue).
>>              */
>>             printf("Check what's wrong, we should never arrive here!");
>>             assert(((uintptr_t)&&good == (uintptr_t)ret));
>>             /* We should absolutely never arrive here, the assert should trigger */
>>             goto good;
>>         }
>>
>>     bad:
>>         retval_idx = 1;
>>         goto exit;
>>     good:
>>         retval_idx = 0;
>>     exit:
>>         return retvals[retval_idx];
>>     }
>>
>> Maybe I'm just missing something obvious, I'd be happy to get some
>> feedback on this. Thanks in advance!
>>
> 
> Can you post the exact code you had where labels are optimized away?
> On which arch was it?


This happens across architectures; I've verified it on x86, aarch64,
and riscv64. Below are the C code and the generated assembly, compiled
with gcc -O0 -o test.s -S test.c (adding -fno-dce and -fno-tree-dce does
not change anything, and -O0 should already imply them anyway):

test.c:
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {

        syscall(4096, &&good);
        return EXIT_FAILURE;
    good:
        return EXIT_SUCCESS;

    }

test.s:
        .file   "test.c"
        .text
        .globl  main
        .type   main, @function
    main:
    .LFB6:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $16, %rsp
        movl    %edi, -4(%rbp)
        movq    %rsi, -16(%rbp)
    .L2:
        leaq    .L2(%rip), %rax
        movq    %rax, %rsi
        movl    $4096, %edi
        movl    $0, %eax
        call    syscall@PLT
        movl    $1, %eax
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    .LFE6:
        .size   main, .-main
        .ident  "GCC: (GNU) 15.2.1 20260209"
        .section        .note.GNU-stack,"",@progbits

As you can see, the "good" label and the second return are optimized
away, and the compiler replaces the syscall argument &&good with the
address of the .L2 assembly label, which sits right after the function
prologue. The result is an infinite loop: every time the syscall is
made, the PC is redirected to just after the prologue and the syscall
is made again.
The assembly output for aarch64 and riscv64 is analogous.
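
For contrast, a minimal sketch (not part of the patch) of a case where
the compiler can see that a label is a jump target. It assumes GNU C
"labels as values" as supported by GCC and Clang, and the helper name
pick_exit_code is hypothetical. Because the label addresses feed a
computed goto, the compiler has to treat both labels as reachable, which
is not the case when &&good is only passed to syscall().

```c
#include <stdlib.h>

/*
 * Hypothetical helper: selects an exit code by jumping to a label
 * through a computed goto (GNU C extension).
 */
static int pick_exit_code(int take_good)
{
    /* Both label addresses are possible targets of the goto below. */
    void *targets[] = { &&bad, &&good };

    goto *targets[take_good ? 1 : 0];
bad:
    return EXIT_FAILURE;
good:
    return EXIT_SUCCESS;
}
```

Here pick_exit_code(1) returns EXIT_SUCCESS and pick_exit_code(0)
returns EXIT_FAILURE, so both return statements have to survive in the
generated code.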

> I tried something similar but didn't see the second one disappear.
> 
> I wonder if it's related to compiler detecting g_assert_not_reached() is a 
> "noreturn" function, thus it doesn't expect to go past it, and eliminates 
> dead code. You can try with assert(0) or moving g_assert_not_reached() to 
> another function "crash()" instead, that is not marked as noreturn.

As I mentioned above, even turning off dead code elimination didn't
resolve the issue. But this was a good pointer nevertheless: just
wrapping the exit in a separate function, as below, solves the issue.
The compiler had detected that the label comes after a return and
deemed it unreachable (it obviously doesn't know about the semantics of
the syscall). Below is the modified code, which works across
architectures and which I'll use in the end:

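    #include <stdlib.h>
    #include <unistd.h>

    /*
     * Not marked noreturn: the compiler must assume the call returns,
     * so the code after it (including the label) stays reachable.
     */
    void exit_failure(void) {
        _exit(EXIT_FAILURE);
    }

    void exit_success(void) {
        _exit(EXIT_SUCCESS);
    }

    int main(int argc, char *argv[]) {
        /* The plugin is expected to redirect the pc to &&good. */
        syscall(4096, &&good);
        exit_failure();
    good:
        exit_success();
    }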

> 
>> Best regards,
>> Florian
> 
> Regards,
> Pierrick


Re: [PATCH v4 4/7] tests/tcg: add test for qemu_plugin_set_pc API
