On 7/28/23 06:29, Claudio Fontana wrote:
On 7/27/23 19:41, Richard Henderson wrote:
On 7/21/23 02:08, Claudio Fontana wrote:
Thread 3 "qemu-system-s39" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff53516c0 (LWP 215975)]
(gdb) bt
#0 0x00007ffff730dabc in __pthread_kill_implementation () at /lib64/libc.so.6
#1 0x00007ffff72bc266 in raise () at /lib64/libc.so.6
#2 0x00007ffff72a4897 in abort () at /lib64/libc.so.6
#3 0x00007ffff76f0eee in () at /lib64/libglib-2.0.so.0
#4 0x00007ffff775649a in g_assertion_message_expr () at /lib64/libglib-2.0.so.0
#5 0x0000555555b96134 in page_unlock__debug (pd=0x7ffee8680440) at
../accel/tcg/tb-maint.c:348
#6 0x0000555555b962a9 in page_unlock (pd=0x7ffee8680440) at
../accel/tcg/tb-maint.c:397
#7 0x0000555555b96580 in tb_unlock_pages (tb=0x7fffefffeb00) at
../accel/tcg/tb-maint.c:483
#8 0x0000555555b94698 in cpu_exec_longjmp_cleanup (cpu=0x555556566a30) at
../accel/tcg/cpu-exec.c:556
https://patchew.org/QEMU/20230726201330.357175-1-richard.hender...@linaro.org/
r~
Hi Richard,
I applied your patch, however I still encounter an assert:
ERROR:../accel/tcg/tb-maint.c:367:assert_no_pages_locked: assertion failed:
(g_hash_table_size(ht_pages_locked_debug) == 0)
Bail out! ERROR:../accel/tcg/tb-maint.c:367:assert_no_pages_locked: assertion
failed: (g_hash_table_size(ht_pages_locked_debug) == 0)
Ok, this is a different problem. And tricky...
Thread 6 "qemu-system-s39" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffeef5fe6c0 (LWP 116343)]
0x00007ffff730dabc in __pthread_kill_implementation () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff730dabc in __pthread_kill_implementation () at /lib64/libc.so.6
#1 0x00007ffff72bc266 in raise () at /lib64/libc.so.6
#2 0x00007ffff72a4897 in abort () at /lib64/libc.so.6
#3 0x00007ffff76f0eee in () at /lib64/libglib-2.0.so.0
#4 0x00007ffff775649a in g_assertion_message_expr () at /lib64/libglib-2.0.so.0
#5 0x0000555555b96f82 in assert_no_pages_locked () at
../accel/tcg/tb-maint.c:367
#6 0x0000555555b976cc in page_collection_lock (start=6674, last=6674) at
../accel/tcg/tb-maint.c:614
#7 0x0000555555b9877c in tb_invalidate_phys_range (start=27336872,
last=27336879) at ../accel/tcg/tb-maint.c:1197
#8 0x0000555555b6b25e in invalidate_and_set_dirty (mr=0x5555563f6e90,
addr=27336872, length=8) at ../softmmu/physmem.c:2542
#9 0x0000555555b6d72d in address_space_stq_internal
(as=0x5555566b7350, addr=27336872, val=2930044561408, attrs=...,
result=0x0, endian=DEVICE_NATIVE_ENDIAN)
at /root/git/qemu/memory_ldst.c.inc:495
#10 0x0000555555b6d7aa in address_space_stq (as=0x5555566b7350, addr=27336872,
val=2930044561408, attrs=..., result=0x0)
at /root/git/qemu/memory_ldst.c.inc:510
#11 0x0000555555a9fff6 in stq_phys (as=0x5555566b7350, addr=27336872,
val=2930044561408)
at /root/git/qemu/include/exec/memory_ldst_phys.h.inc:55
#12 0x0000555555aa0630 in s390_cpu_tlb_fill
(cs=0x555556663c80, address=2930044559360, size=1,
access_type=MMU_INST_FETCH, mmu_idx=0, probe=false, retaddr=0)
at ../target/s390x/tcg/excp_helper.c:194
#13 0x0000555555ba8a89 in probe_access_internal
(env=0x555556666460, addr=2930044559360, fault_size=1,
access_type=MMU_INST_FETCH, mmu_idx=0, nonfault=false, phost=0x7ffeef5fcfd0,
pfull=0x7ffeef5fcfc8, retaddr=0, check_mem_cbs=false) at ../accel/tcg/cputlb.c:1530
#14 0x0000555555ba90f0 in get_page_addr_code_hostp (env=0x555556666460,
addr=2930044559360, hostp=0x7ffeef5fd2f0)
at ../accel/tcg/cputlb.c:1695
#15 0x0000555555ba122d in translator_access (env=0x555556666460,
db=0x7ffeef5fd2c0, pc=2930044559360, len=4)
at ../accel/tcg/translator.c:257
#16 0x0000555555ba15e2 in translator_ldl (env=0x555556666460,
db=0x7ffeef5fd2c0, pc=2930044559360) at ../accel/tcg/translator.c:351
#16: load for translation
#15: translation for next page
#12: tlb_fill for next page
#11: store, updating access bit on the PTE
#8: invalidate the page table page, which was also marked code?!?
#5: assert no pages locked -- we never expected to invalidate in this context.
It's the page containing both code and a page table entry that concerns me. It seems like
a kernel bug, though obviously we shouldn't crash. I'm not sure what to do about it.
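
To make the failing invariant concrete, here is a toy model of the frame sequence above. It is only a sketch and not QEMU code: every name in it (toy_page_lock, toy_can_invalidate, and so on) is hypothetical, and it assumes, as the assertion in frame #5 implies, that a page lock is already held by the translating thread when the store to the PTE triggers the invalidate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not QEMU's implementation. */
#define TOY_MAX_LOCKED 16

/* Pages currently locked by this (single, modelled) thread. */
static unsigned long toy_locked_pages[TOY_MAX_LOCKED];
static size_t toy_n_locked;

/* Record that the thread now holds the lock for one page. */
static void toy_page_lock(unsigned long page)
{
    assert(toy_n_locked < TOY_MAX_LOCKED);
    toy_locked_pages[toy_n_locked++] = page;
}

/* Drop the lock for a page; returns false if it was not held. */
static bool toy_page_unlock(unsigned long page)
{
    for (size_t i = 0; i < toy_n_locked; i++) {
        if (toy_locked_pages[i] == page) {
            /* Fill the hole with the last entry; order is irrelevant. */
            toy_locked_pages[i] = toy_locked_pages[--toy_n_locked];
            return true;
        }
    }
    return false;
}

/*
 * Models the check in frame #5: a range invalidate may only begin
 * when the thread holds no page locks at all.
 */
static bool toy_can_invalidate(void)
{
    return toy_n_locked == 0;
}

/*
 * Models frames #16 down to #5: translation holds a page lock (assumed),
 * the TLB fill stores to a PTE, and that store asks for an invalidate.
 * Returns whether the invalidate would have been permitted.
 */
static bool toy_translate_with_pte_store(unsigned long code_page)
{
    toy_page_lock(code_page);           /* translation in progress */
    bool ok = toy_can_invalidate();     /* PTE store -> invalidate */
    toy_page_unlock(code_page);         /* translation finished */
    return ok;
}
```

In this model toy_translate_with_pte_store() always reports that the invalidate is not permitted, which is the situation the assertion catches. The sketch says nothing about how the real code ought to resolve it.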
r~