On 7/28/23 06:29, Claudio Fontana wrote:
On 7/27/23 19:41, Richard Henderson wrote:
On 7/21/23 02:08, Claudio Fontana wrote:
Thread 3 "qemu-system-s39" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff53516c0 (LWP 215975)]
(gdb) bt
#0  0x00007ffff730dabc in __pthread_kill_implementation () at /lib64/libc.so.6
#1  0x00007ffff72bc266 in raise () at /lib64/libc.so.6
#2  0x00007ffff72a4897 in abort () at /lib64/libc.so.6
#3  0x00007ffff76f0eee in  () at /lib64/libglib-2.0.so.0
#4  0x00007ffff775649a in g_assertion_message_expr () at /lib64/libglib-2.0.so.0
#5  0x0000555555b96134 in page_unlock__debug (pd=0x7ffee8680440) at 
../accel/tcg/tb-maint.c:348
#6  0x0000555555b962a9 in page_unlock (pd=0x7ffee8680440) at 
../accel/tcg/tb-maint.c:397
#7  0x0000555555b96580 in tb_unlock_pages (tb=0x7fffefffeb00) at 
../accel/tcg/tb-maint.c:483
#8  0x0000555555b94698 in cpu_exec_longjmp_cleanup (cpu=0x555556566a30) at 
../accel/tcg/cpu-exec.c:556


https://patchew.org/QEMU/20230726201330.357175-1-richard.hender...@linaro.org/


r~

Hi Richard,

I applied your patch, however I still encounter an assert:

ERROR:../accel/tcg/tb-maint.c:367:assert_no_pages_locked: assertion failed: 
(g_hash_table_size(ht_pages_locked_debug) == 0)
Bail out! ERROR:../accel/tcg/tb-maint.c:367:assert_no_pages_locked: assertion 
failed: (g_hash_table_size(ht_pages_locked_debug) == 0)


Ok, this is a different problem.  And tricky...




Thread 6 "qemu-system-s39" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffeef5fe6c0 (LWP 116343)]
0x00007ffff730dabc in __pthread_kill_implementation () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff730dabc in __pthread_kill_implementation () at /lib64/libc.so.6
#1  0x00007ffff72bc266 in raise () at /lib64/libc.so.6
#2  0x00007ffff72a4897 in abort () at /lib64/libc.so.6
#3  0x00007ffff76f0eee in  () at /lib64/libglib-2.0.so.0
#4  0x00007ffff775649a in g_assertion_message_expr () at /lib64/libglib-2.0.so.0
#5  0x0000555555b96f82 in assert_no_pages_locked () at 
../accel/tcg/tb-maint.c:367
#6  0x0000555555b976cc in page_collection_lock (start=6674, last=6674) at 
../accel/tcg/tb-maint.c:614
#7  0x0000555555b9877c in tb_invalidate_phys_range (start=27336872, 
last=27336879) at ../accel/tcg/tb-maint.c:1197
#8  0x0000555555b6b25e in invalidate_and_set_dirty (mr=0x5555563f6e90, 
addr=27336872, length=8) at ../softmmu/physmem.c:2542
#9  0x0000555555b6d72d in address_space_stq_internal
     (as=0x5555566b7350, addr=27336872, val=2930044561408, attrs=..., 
result=0x0, endian=DEVICE_NATIVE_ENDIAN)
     at /root/git/qemu/memory_ldst.c.inc:495
#10 0x0000555555b6d7aa in address_space_stq (as=0x5555566b7350, addr=27336872, 
val=2930044561408, attrs=..., result=0x0)
     at /root/git/qemu/memory_ldst.c.inc:510
#11 0x0000555555a9fff6 in stq_phys (as=0x5555566b7350, addr=27336872, 
val=2930044561408)
     at /root/git/qemu/include/exec/memory_ldst_phys.h.inc:55
#12 0x0000555555aa0630 in s390_cpu_tlb_fill
     (cs=0x555556663c80, address=2930044559360, size=1, 
access_type=MMU_INST_FETCH, mmu_idx=0, probe=false, retaddr=0)
     at ../target/s390x/tcg/excp_helper.c:194
#13 0x0000555555ba8a89 in probe_access_internal
     (env=0x555556666460, addr=2930044559360, fault_size=1, 
access_type=MMU_INST_FETCH, mmu_idx=0, nonfault=false, phost=0x7ffeef5fcfd0, 
pfull=0x7ffeef5fcfc8, retaddr=0, check_mem_cbs=false) at ../accel/tcg/cputlb.c:1530
#14 0x0000555555ba90f0 in get_page_addr_code_hostp (env=0x555556666460, 
addr=2930044559360, hostp=0x7ffeef5fd2f0)
     at ../accel/tcg/cputlb.c:1695
#15 0x0000555555ba122d in translator_access (env=0x555556666460, 
db=0x7ffeef5fd2c0, pc=2930044559360, len=4)
     at ../accel/tcg/translator.c:257
#16 0x0000555555ba15e2 in translator_ldl (env=0x555556666460, 
db=0x7ffeef5fd2c0, pc=2930044559360) at ../accel/tcg/translator.c:351

#16: load for translation,
#15: translation for next page
#12: tlb_fill for next page
#11: store, updating access bit on the PTE
#8: invalidate the page table page, which was also marked code?!?
#5: assert no pages locked -- we never expected to invalidate in this context.

It's the page containing both code and a page table entry that concerns me. It seems like a kernel bug, though obviously we shouldn't crash. I'm not sure what to do about it.
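
To make the sequence in frames #16 through #5 concrete, here is a toy model of the invariant that trips. It is only a sketch and not QEMU's code: every name (ToyLockSet, toy_page_lock, toy_translate_with_fault, and so on) is hypothetical, and it assumes that translation already holds at least one page lock when the tlb_fill store lands on a page that also holds translated code. Where the debug build asserts and aborts, the model returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. This is NOT QEMU's implementation; all names are
 * hypothetical and exist purely to illustrate the failing invariant. */
#define TOY_MAX_LOCKED 8

typedef struct {
    unsigned long locked[TOY_MAX_LOCKED]; /* page indexes locked by this thread */
    size_t n_locked;
} ToyLockSet;

/* Record that this thread holds the lock for 'page'. */
static bool toy_page_lock(ToyLockSet *s, unsigned long page)
{
    if (s->n_locked == TOY_MAX_LOCKED) {
        return false;
    }
    s->locked[s->n_locked++] = page;
    return true;
}

/* Forget the lock for 'page'; returns false if it was not held. */
static bool toy_page_unlock(ToyLockSet *s, unsigned long page)
{
    for (size_t i = 0; i < s->n_locked; i++) {
        if (s->locked[i] == page) {
            s->locked[i] = s->locked[--s->n_locked];
            return true;
        }
    }
    return false;
}

/* Stand-in for the precondition checked in frame #5: taking a page
 * collection lock is only allowed when no page lock is held. */
static bool toy_collection_lock_allowed(const ToyLockSet *s)
{
    return s->n_locked == 0;
}

/* Stand-in for frames #16..#5. Translation holds a page lock (assumed),
 * the fault path stores to guest memory, and if that store needs to
 * invalidate translated code, the precondition above is checked. */
static bool toy_translate_with_fault(ToyLockSet *s, unsigned long code_page,
                                     bool store_needs_invalidate)
{
    bool ok = true;

    toy_page_lock(s, code_page);
    if (store_needs_invalidate) {
        ok = toy_collection_lock_allowed(s);
    }
    toy_page_unlock(s, code_page);
    return ok;
}
```

In this model, calling toy_translate_with_fault() with store_needs_invalidate set to false succeeds, while passing true reports the violation. The invalidated page does not have to be the page being translated: any lock still held at that point is enough to fail the check.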


r~
