Alex Bennée <alex.ben...@linaro.org> writes:

> Pranith Kumar <bobby.pr...@gmail.com> writes:
>
>> Hi Alex,
>>
>> On Tue, Jan 12, 2016 at 12:29 PM, Alex Bennée <alex.ben...@linaro.org>
>> wrote:
>>
>>> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks
>>
>> I built this branch and ran an arm64 guest. It seems to be failing
>> similarly to what I reported earlier:
>>
>> #0 0x00007ffff2211cc9 in __GI_raise (sig=sig@entry=6) at
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> #1 0x00007ffff22150d8 in __GI_abort () at abort.c:89
>> #2 0x000055555572014c in qemu_ram_addr_from_host_nofail
>> (ptr=0xffffffc000187863) at /home/pranith/devops/code/qemu/cputlb.c:357
>> #3 0x00005555557209dd in get_page_addr_code (env1=0x555556702058,
>> addr=18446743798833248356) at /home/pranith/devops/code/qemu/cputlb.c:568
>> #4 0x00005555556db98c in tb_find_physical (cpu=0x5555566f9dd0,
>> pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
>> /home/pranith/devops/code/qemu/cpu-exec.c:224
>> #5 0x00005555556dbaf4 in tb_find_slow (cpu=0x5555566f9dd0,
>> pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
>> /home/pranith/devops/code/qemu/cpu-exec.c:268
>> #6 0x00005555556dbc77 in tb_find_fast (cpu=0x5555566f9dd0) at
>> /home/pranith/devops/code/qemu/cpu-exec.c:311
>> #7 0x00005555556dc0f1 in cpu_arm_exec (cpu=0x5555566f9dd0) at
>> /home/pranith/devops/code/qemu/cpu-exec.c:492
>> #8 0x00005555557050ee in tcg_cpu_exec (cpu=0x5555566f9dd0) at
>> /home/pranith/devops/code/qemu/cpus.c:1486
>> #9 0x00005555557051af in tcg_exec_all (cpu=0x5555566f9dd0) at
>> /home/pranith/devops/code/qemu/cpus.c:1515
>> #10 0x0000555555704800 in qemu_tcg_cpu_thread_fn (arg=0x5555566f9dd0) at
>> /home/pranith/devops/code/qemu/cpus.c:1187
>> #11 0x00007ffff25a8182 in start_thread (arg=0x7fffd20c8700) at
>> pthread_create.c:312
>> #12 0x00007ffff22d547d in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> <snip>

Having seen a backtrace of a crash while the other thread was flushing
the TLB entries, I sprinkled a bunch of:

  g_assert(cpu == current_cpu);

in all the public functions in cputlb that take a CPU. There are a bunch
of cases that don't defer actions across CPUs, and these need to be
fixed up. I suspect they don't hit in the arm case because the TLB
flushing pattern is different. On aarch64, in my backtrace, it was
triggered by tlbi_aa64_vae1is_write:

  7 Thread 0x7ffe777fe700 (LWP 32705) "worker" sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
  6 Thread 0x7ffe77fff700 (LWP 32704) "worker" sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
  5 Thread 0x7fff8d9d0700 (LWP 32703) "CPU 1/TCG" 0x000055555572cc18 in memcpy (__len=8, __src=<synthetic pointer>, __dest=<optimised out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
* 4 Thread 0x7fff8e1d1700 (LWP 32702) "CPU 0/TCG" memset () at ../sysdeps/x86_64/memset.S:94
  3 Thread 0x7fff8f1cb700 (LWP 32701) "worker" sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
  2 Thread 0x7fffe45c8700 (LWP 32700) "qemu-system-aar" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  1 Thread 0x7ffff7f98c00 (LWP 32696) "qemu-system-aar" 0x00007ffff0ba01ef in __GI_ppoll (fds=0x5555575cb5b0, nfds=8, timeout=<optimised out>, timeout@entry=0x7fffffffdf60, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:56

#0 memset () at ../sysdeps/x86_64/memset.S:94
#1 0x0000555555728bee in memset (__len=32768, __ch=0, __dest=0x555556632568) at /usr/include/x86_64-linux-gnu/bits/string3.h:84
#2 v_tlb_flush_by_mmuidx (argp=0x7fff8e1d0430, cpu=0x555556632380) at /home/alex/lsrc/qemu/qemu.git/cputlb.c:136
#3 tlb_flush_page_by_mmuidx (cpu=cpu@entry=0x555556632380, addr=addr@entry=547976253440) at /home/alex/lsrc/qemu/qemu.git/cputlb.c:243
#4 0x00005555557fcb4a in tlbi_aa64_vae1is_write (env=<optimised out>, ri=<optimised out>, value=<optimised out>) at /home/alex/lsrc/qemu/qemu.git/target-arm/helper.c:2757
#5 0x00007fffa441dac5 in code_gen_buffer ()
#6 0x00005555556eef4b in cpu_tb_exec (tb_ptr=<optimised out>, cpu=0x5555565eddd0) at /home/alex/lsrc/qemu/qemu.git/cpu-exec.c:157
#7 cpu_arm_exec (cpu=cpu@entry=0x5555565eddd0) at /home/alex/lsrc/qemu/qemu.git/cpu-exec.c:520
#8 0x00005555557108e8 in tcg_cpu_exec (cpu=0x5555565eddd0) at /home/alex/lsrc/qemu/qemu.git/cpus.c:1486
#9 tcg_exec_all (cpu=0x5555565eddd0) at /home/alex/lsrc/qemu/qemu.git/cpus.c:1515
#10 qemu_tcg_cpu_thread_fn (arg=0x5555565eddd0) at /home/alex/lsrc/qemu/qemu.git/cpus.c:1187
#11 0x00007ffff0e80182 in start_thread (arg=0x7fff8e1d1700) at pthread_create.c:312
#12 0x00007ffff0bad47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

[Switching to thread 5 (Thread 0x7fff8d9d0700 (LWP 32703))]
#0 0x000055555572cc18 in memcpy (__len=8, __src=<synthetic pointer>, __dest=<optimised out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
51 return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));

#0 0x000055555572cc18 in memcpy (__len=8, __src=<synthetic pointer>, __dest=<optimised out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
#1 stq_he_p (v=<optimised out>, ptr=<optimised out>) at /home/alex/lsrc/qemu/qemu.git/include/qemu/bswap.h:292
#2 stq_le_p (v=547973099520, ptr=<optimised out>) at /home/alex/lsrc/qemu/qemu.git/include/qemu/bswap.h:327
#3 helper_le_stq_mmu (env=0x55555663a608, addr=18446743801961580216, val=547973099520, oi=<optimised out>, retaddr=140735948385557) at /home/alex/lsrc/qemu/qemu.git/softmmu_template.h:455
#4 0x00007fffa435ed17 in code_gen_buffer ()
#5 0x00005555556eef4b in cpu_tb_exec (tb_ptr=<optimised out>, cpu=0x555556632380) at /home/alex/lsrc/qemu/qemu.git/cpu-exec.c:157
#6 cpu_arm_exec (cpu=cpu@entry=0x555556632380) at /home/alex/lsrc/qemu/qemu.git/cpu-exec.c:520
#7 0x00005555557108e8 in tcg_cpu_exec (cpu=0x555556632380) at /home/alex/lsrc/qemu/qemu.git/cpus.c:1486
#8 tcg_exec_all (cpu=0x555556632380) at /home/alex/lsrc/qemu/qemu.git/cpus.c:1515
#9 qemu_tcg_cpu_thread_fn (arg=0x555556632380) at /home/alex/lsrc/qemu/qemu.git/cpus.c:1187
#10 0x00007ffff0e80182 in start_thread (arg=0x7fff8d9d0700) at pthread_create.c:312
#11 0x00007ffff0bad47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
A debugging session is active.

Needless to say, anything messing with structures used by the other
threads needs to take great care or doom will occur ;-)

I'll look at fixing them up in my tree while Fred finishes his re-base.

--
Alex Bennée
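
To illustrate what deferring an action across CPUs could look like, here is a minimal sketch. It is not the cputlb code: VCpu, tlb_flush_request() and process_deferred() are hypothetical names, and a single pending flag stands in for whatever work queue a real fix would use. The point is only that the TLB is written by its owning thread alone, which is the property the g_assert(cpu == current_cpu) checks enforce.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define TLB_SIZE 256

/* Hypothetical stand-in for a vCPU; this is not QEMU's CPUState. */
typedef struct VCpu {
    pthread_mutex_t lock;         /* protects flush_pending only */
    bool flush_pending;           /* set by any thread, consumed by the owner */
    unsigned long tlb[TLB_SIZE];  /* written only by the owning thread */
    int flushes_done;
} VCpu;

/* The vCPU the calling thread is running, in the spirit of current_cpu. */
static _Thread_local VCpu *current_vcpu;

/* Does the actual flush; must only run on the owning thread. */
static void tlb_flush_local(VCpu *cpu)
{
    assert(cpu == current_vcpu);
    memset(cpu->tlb, 0, sizeof(cpu->tlb));
    cpu->flushes_done++;
}

/* Safe from any thread: flush now if we own the TLB, otherwise defer. */
static void tlb_flush_request(VCpu *cpu)
{
    if (cpu == current_vcpu) {
        tlb_flush_local(cpu);
        return;
    }
    pthread_mutex_lock(&cpu->lock);
    cpu->flush_pending = true;
    pthread_mutex_unlock(&cpu->lock);
}

/* Run by the owning thread at a safe point in its execution loop. */
static void process_deferred(VCpu *cpu)
{
    bool pending;

    assert(cpu == current_vcpu);
    pthread_mutex_lock(&cpu->lock);
    pending = cpu->flush_pending;
    cpu->flush_pending = false;
    pthread_mutex_unlock(&cpu->lock);
    if (pending) {
        tlb_flush_local(cpu);
    }
}

static void *vcpu_thread(void *arg)
{
    VCpu *cpu = arg;

    current_vcpu = cpu;
    process_deferred(cpu);
    return NULL;
}

/* Returns 0 if the cross-CPU flush was deferred and then done by its owner. */
int demo_cross_cpu_flush(void)
{
    VCpu cpu0, cpu1;
    pthread_t t;
    int ok;

    memset(&cpu0, 0, sizeof(cpu0));
    memset(&cpu1, 0, sizeof(cpu1));
    pthread_mutex_init(&cpu0.lock, NULL);
    pthread_mutex_init(&cpu1.lock, NULL);
    cpu1.tlb[0] = 0xdead;          /* set up before cpu1's thread exists */

    current_vcpu = &cpu0;          /* this thread plays the role of CPU 0 */
    tlb_flush_request(&cpu1);      /* cross-CPU: must not touch cpu1.tlb */
    if (cpu1.tlb[0] != 0xdead) {
        return 1;
    }
    if (pthread_create(&t, NULL, vcpu_thread, &cpu1) != 0) {
        return 2;
    }
    pthread_join(t, NULL);         /* cpu1's thread drained the request */

    ok = (cpu1.tlb[0] == 0 && cpu1.flushes_done == 1);
    pthread_mutex_destroy(&cpu0.lock);
    pthread_mutex_destroy(&cpu1.lock);
    return ok ? 0 : 3;
}
```

Calling demo_cross_cpu_flush() from a main() (built with -pthread) should return 0: the request made on behalf of CPU 0 leaves CPU 1's TLB alone, and the memset only happens once CPU 1's own thread picks the request up.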