Re: 'make check-tcg' fails with an assert in qemu_plugin_vcpu_init_hook

2023-06-21 Thread Peter Maydell
On Wed, 21 Jun 2023 at 11:00, Alex Bennée  wrote:
>
>
> Peter Maydell  writes:
>
> > On Wed, 21 Jun 2023 at 09:05, Alex Bennée  wrote:
> >>   - I suspect the plugin core stuff could be build once (or maybe twice,
> >> system and user)
> >
> > It is already build-once, that's why it goes wrong...
>
> I thought it was the other way around:
>
>   specific_ss.add(when: 'CONFIG_PLUGIN', if_true: [files(
> 'loader.c',
> 'core.c',
> 'api.c',
>   ), declare_dependency(link_args: plugin_ldflags)])
>
> but if we built it for linux-user and softmmu this could be fixed (until
> the next breakage anyway). cpus-common.c is the common code that sets
> this once.

Oh, right, I got it the wrong way around.

> >>   - we need to have some guard rails somehow to make sure things don't
> >> go out of sync
> >
> > We do, this is the poison.h stuff. CONFIG_USER_ONLY is a
> > special case which we don't poison because there would be
> > too much refactoring required...
>
> I guess a great big honking comment at the top of CPUState telling
> people not to do that or pushing softmmu and user specific bits of
> CPUState into their own de-referenced structures.

It's not specific to CPUState, though. The thing you must not
do is use CONFIG_USER_ONLY (or CONFIG_SOFTMMU, now) to ifdef
out any struct field anywhere in any struct that's visible to
compiled-once code, or otherwise do something that changes
the ABI of a global or of a type passed around between functions.

-- PMM



Re: 'make check-tcg' fails with an assert in qemu_plugin_vcpu_init_hook

2023-06-21 Thread Alex Bennée


Peter Maydell  writes:

> On Wed, 21 Jun 2023 at 09:05, Alex Bennée  wrote:
>>
>>
>> Peter Maydell  writes:
>>
>> > On Tue, 20 Jun 2023 at 17:56, Peter Maydell  
>> > wrote:
>> >>
>> >> $ make -C build/x86 check-tcg
>> >> make: Entering directory 
>> >> '/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86'
>> >> [...]
>> >>   TESTmunmap-pthread on arm
>> >> **
>> >> ERROR:../../plugins/core.c:221:qemu_plugin_vcpu_init_hook: assertion
>> >> failed: (success)
>> >> **
>> >> ERROR:../../accel/tcg/cpu-exec.c:1024:cpu_exec_setjmp: assertion
>> >> failed: (cpu == current_cpu)
>> >
>> > git bisect blames commit d7ee93e2435970:
>> >
>> > cputlb: Restrict SavedIOTLB to system emulation
>> >
>> > I think that commit is not correct, because it means that
>> > the size of 'struct CPUState' and also the offset of fields
>> > like 'cpu_index' will be different for files which are
>> > compile-per-target-for-usermode and files which are
>> > compile-once-only. The assert happens here because the
>> > code which sets up cpu_index is build-once, but the code
>> > in qemu_plugin_vcpu_init_hook() which reads cpu_index is
>> > build-per-target and now they don't agree about where in
>> > the struct the field is...
>>
>> Hmm two things from that imply:
>>
>>   - I suspect the plugin core stuff could be build once (or maybe twice,
>> system and user)
>
> It is already build-once, that's why it goes wrong...

I thought it was the other way around:

  specific_ss.add(when: 'CONFIG_PLUGIN', if_true: [files(
'loader.c',
'core.c',
'api.c',
  ), declare_dependency(link_args: plugin_ldflags)])

but if we built it for linux-user and softmmu this could be fixed (until
the next breakage anyway). cpus-common.c is the common code that sets
this once.

>>   - we need to have some guard rails somehow to make sure things don't
>> go out of sync
>
> We do, this is the poison.h stuff. CONFIG_USER_ONLY is a
> special case which we don't poison because there would be
> too much refactoring required...

I guess a great big honking comment at the top of CPUState telling
people not to do that or pushing softmmu and user specific bits of
CPUState into their own de-referenced structures.

>
> -- PMM


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: 'make check-tcg' fails with an assert in qemu_plugin_vcpu_init_hook

2023-06-21 Thread Philippe Mathieu-Daudé

On 21/6/23 10:53, Peter Maydell wrote:

On Wed, 21 Jun 2023 at 09:05, Alex Bennée  wrote:



Peter Maydell  writes:


On Tue, 20 Jun 2023 at 17:56, Peter Maydell  wrote:


$ make -C build/x86 check-tcg
make: Entering directory '/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86'
[...]
   TESTmunmap-pthread on arm
**
ERROR:../../plugins/core.c:221:qemu_plugin_vcpu_init_hook: assertion
failed: (success)
**
ERROR:../../accel/tcg/cpu-exec.c:1024:cpu_exec_setjmp: assertion
failed: (cpu == current_cpu)


git bisect blames commit d7ee93e2435970:

 cputlb: Restrict SavedIOTLB to system emulation

I think that commit is not correct, because it means that
the size of 'struct CPUState' and also the offset of fields
like 'cpu_index' will be different for files which are
compile-per-target-for-usermode and files which are
compile-once-only. The assert happens here because the
code which sets up cpu_index is build-once, but the code
in qemu_plugin_vcpu_init_hook() which reads cpu_index is
build-per-target and now they don't agree about where in
the struct the field is...


Hmm two things from that imply:

   - I suspect the plugin core stuff could be build once (or maybe twice,
 system and user)


It is already build-once, that's why it goes wrong...


   - we need to have some guard rails somehow to make sure things don't
 go out of sync


We do, this is the poison.h stuff. CONFIG_USER_ONLY is a
special case which we don't poison because there would be
too much refactoring required...


Just FYI the goal of the series including this commit is to remove
"exec/hwaddr.h" from user emulation.




Re: 'make check-tcg' fails with an assert in qemu_plugin_vcpu_init_hook

2023-06-21 Thread Peter Maydell
On Wed, 21 Jun 2023 at 09:05, Alex Bennée  wrote:
>
>
> Peter Maydell  writes:
>
> > On Tue, 20 Jun 2023 at 17:56, Peter Maydell  
> > wrote:
> >>
> >> $ make -C build/x86 check-tcg
> >> make: Entering directory 
> >> '/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86'
> >> [...]
> >>   TESTmunmap-pthread on arm
> >> **
> >> ERROR:../../plugins/core.c:221:qemu_plugin_vcpu_init_hook: assertion
> >> failed: (success)
> >> **
> >> ERROR:../../accel/tcg/cpu-exec.c:1024:cpu_exec_setjmp: assertion
> >> failed: (cpu == current_cpu)
> >
> > git bisect blames commit d7ee93e2435970:
> >
> > cputlb: Restrict SavedIOTLB to system emulation
> >
> > I think that commit is not correct, because it means that
> > the size of 'struct CPUState' and also the offset of fields
> > like 'cpu_index' will be different for files which are
> > compile-per-target-for-usermode and files which are
> > compile-once-only. The assert happens here because the
> > code which sets up cpu_index is build-once, but the code
> > in qemu_plugin_vcpu_init_hook() which reads cpu_index is
> > build-per-target and now they don't agree about where in
> > the struct the field is...
>
> Hmm two things from that imply:
>
>   - I suspect the plugin core stuff could be build once (or maybe twice,
> system and user)

It is already build-once, that's why it goes wrong...

>   - we need to have some guard rails somehow to make sure things don't
> go out of sync

We do, this is the poison.h stuff. CONFIG_USER_ONLY is a
special case which we don't poison because there would be
too much refactoring required...

-- PMM



Re: 'make check-tcg' fails with an assert in qemu_plugin_vcpu_init_hook

2023-06-21 Thread Alex Bennée


Peter Maydell  writes:

> On Tue, 20 Jun 2023 at 17:56, Peter Maydell  wrote:
>>
>> $ make -C build/x86 check-tcg
>> make: Entering directory 
>> '/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86'
>> [...]
>>   TESTmunmap-pthread on arm
>> **
>> ERROR:../../plugins/core.c:221:qemu_plugin_vcpu_init_hook: assertion
>> failed: (success)
>> **
>> ERROR:../../accel/tcg/cpu-exec.c:1024:cpu_exec_setjmp: assertion
>> failed: (cpu == current_cpu)
>
> git bisect blames commit d7ee93e2435970:
>
> cputlb: Restrict SavedIOTLB to system emulation
>
> I think that commit is not correct, because it means that
> the size of 'struct CPUState' and also the offset of fields
> like 'cpu_index' will be different for files which are
> compile-per-target-for-usermode and files which are
> compile-once-only. The assert happens here because the
> code which sets up cpu_index is build-once, but the code
> in qemu_plugin_vcpu_init_hook() which reads cpu_index is
> build-per-target and now they don't agree about where in
> the struct the field is...

Hmm two things from that imply:

  - I suspect the plugin core stuff could be build once (or maybe twice,
system and user)

  - we need to have some guard rails somehow to make sure things don't
go out of sync

>
> Reverting the commit fixes the bug.
>
> thanks
> -- PMM


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: 'make check-tcg' fails with an assert in qemu_plugin_vcpu_init_hook

2023-06-20 Thread Peter Maydell
On Tue, 20 Jun 2023 at 17:56, Peter Maydell  wrote:
>
> $ make -C build/x86 check-tcg
> make: Entering directory 
> '/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86'
> [...]
>   TESTmunmap-pthread on arm
> **
> ERROR:../../plugins/core.c:221:qemu_plugin_vcpu_init_hook: assertion
> failed: (success)
> **
> ERROR:../../accel/tcg/cpu-exec.c:1024:cpu_exec_setjmp: assertion
> failed: (cpu == current_cpu)

git bisect blames commit d7ee93e2435970:

cputlb: Restrict SavedIOTLB to system emulation

I think that commit is not correct, because it means that
the size of 'struct CPUState' and also the offset of fields
like 'cpu_index' will be different for files which are
compile-per-target-for-usermode and files which are
compile-once-only. The assert happens here because the
code which sets up cpu_index is build-once, but the code
in qemu_plugin_vcpu_init_hook() which reads cpu_index is
build-per-target and now they don't agree about where in
the struct the field is...

Reverting the commit fixes the bug.

thanks
-- PMM



Re: 'make check-tcg' fails with an assert in qemu_plugin_vcpu_init_hook

2023-06-20 Thread Richard Henderson

On 6/20/23 18:56, Peter Maydell wrote:

ERROR:../../accel/tcg/cpu-exec.c:1024:cpu_exec_setjmp: assertion
failed: (cpu == current_cpu)

...

The assertion in cpu-exec.c is interesting too and may or
may not be relevant.


FWIW, the last time I saw this the stack had been clobbered and the saved value of "cpu" 
was garbage.  There is very very little in cpu_exec_setjmp() by design.



r~



'make check-tcg' fails with an assert in qemu_plugin_vcpu_init_hook

2023-06-20 Thread Peter Maydell
$ make -C build/x86 check-tcg
make: Entering directory '/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86'
[...]
  TESTmunmap-pthread on arm
**
ERROR:../../plugins/core.c:221:qemu_plugin_vcpu_init_hook: assertion
failed: (success)
**
ERROR:../../accel/tcg/cpu-exec.c:1024:cpu_exec_setjmp: assertion
failed: (cpu == current_cpu)

Here's the backtrace:

#0  __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140737332028096) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737332028096) at
./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737332028096,
signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x76fc1476 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4  0x76fa77f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x77497b57 in g_assertion_message (domain=,
file=, line=,
func=0x55800d50 <__func__.3> "qemu_plugin_vcpu_init_hook",
message=) at ../../../glib/gtestutils.c:3253
#6  0x774f170f in g_assertion_message_expr (domain=0x0,
file=0x55800ccf "../../plugins/core.c", line=221,
func=0x55800d50 <__func__.3> "qemu_plugin_vcpu_init_hook",
expr=) at ../../../glib/gtestutils.c:3279
#7  0x556e5747 in qemu_plugin_vcpu_init_hook
(cpu=0x559d0450) at ../../plugins/core.c:221
#8  0x556a9cc3 in cpu_exec_realizefn (cpu=0x559d0450,
errp=0x7fffc630) at ../../cpu.c:153
#9  0x555a44ef in arm_cpu_realizefn (dev=0x559d0450,
errp=0x7fffc780) at ../../target/arm/cpu.c:1673
#10 0x5572ef2e in device_set_realized (obj=0x559d0450,
value=true, errp=0x7fffc9b8) at ../../hw/core/qdev.c:510
#11 0x5573931b in property_set_bool (obj=0x559d0450,
v=0x559c0d40, name=0x5580ef41 "realized",
opaque=0x55924870,
errp=0x7fffc9b8) at ../../qom/object.c:2285
#12 0x55737212 in object_property_set (obj=0x559d0450,
name=0x5580ef41 "realized", v=0x559c0d40, errp=0x7fffc9b8)
at ../../qom/object.c:1420
#13 0x5573b861 in object_property_set_qobject
(obj=0x559d0450, name=0x5580ef41 "realized",
value=0x5592bc90, errp=0x7fffc9b8)
at ../../qom/qom-qobject.c:28
#14 0x55737591 in object_property_set_bool
(obj=0x559d0450, name=0x5580ef41 "realized", value=true,
errp=0x7fffc9b8)
at ../../qom/object.c:1489
#15 0x5572e6bc in qdev_realize (dev=0x559d0450, bus=0x0,
errp=0x7fffc9b8) at ../../hw/core/qdev.c:292
#16 0x5559a65c in cpu_create (typename=0x5591c5c0
"max-arm-cpu") at ../../hw/core/cpu-common.c:61
#17 0x556f1712 in cpu_copy (env=0x55953d80) at
../../linux-user/main.c:231
#18 0x55711c4e in do_fork (env=0x55953d80, flags=4001536,
newsp=1073734008, parent_tidptr=1073735528, newtls=1073736832,
child_tidptr=1073735528) at ../../linux-user/syscall.c:6672
#19 0x5571cdea in do_syscall1 (cpu_env=0x55953d80,
num=120, arg1=4001536, arg2=1073734008, arg3=1073735528,
arg4=1073736832,
arg5=1073735528, arg6=1082129932, arg7=0, arg8=0) at
../../linux-user/syscall.c:10869
#20 0x557243f2 in do_syscall (cpu_env=0x55953d80, num=120,
arg1=4001536, arg2=1073734008, arg3=1073735528, arg4=1073736832,
arg5=1073735528, arg6=1082129932, arg7=0, arg8=0) at
../../linux-user/syscall.c:13610
#21 0x555a1616 in cpu_loop (env=0x55953d80) at
../../linux-user/arm/cpu_loop.c:434
#22 0x556f2ee0 in main (argc=2, argv=0x7fffde68,
envp=0x7fffde80) at ../../linux-user/main.c:973

AFAICT this is happening because we try to insert an entry
into the plugin.cpu_ht hash table whose key is cpu->cpu_index.
But in this "new thread" codepath, the new thread's
cpu_index is 0, which is the same as the old thread's
cpu_index...

The assertion in cpu-exec.c is interesting too and may or
may not be relevant.

thanks
-- PMM