Hi Shashi,

Thank you very much for letting me know. I changed the virt machine to version 6.1 and the error disappeared. However, the guest OS now experiences severe delays while booting and starting; the delays take minutes, mostly at the two points shown in the host-side backtraces below.
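(Such backtraces can be captured on the host by attaching gdb to the running QEMU process, roughly like this; this is only a sketch, and the binary name and a single QEMU instance are assumptions:

$ gdb -p "$(pidof qemu-system-aarch64)" -batch -ex 'thread apply all bt'

That dumps the stack of every QEMU thread, which also shows which vCPU threads are waiting on the global mutex.)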
The first is here:

#0 0x00007f1d0932554d in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007f1d09320e9b in _L_lock_883 () at /lib64/libpthread.so.0
#2 0x00007f1d09320d68 in pthread_mutex_lock () at /lib64/libpthread.so.0
#3 0x0000560bf51637b3 in qemu_mutex_lock_impl (mutex=0x560bf5e05820 <qemu_global_mutex>, file=0x560bf56db84b "../util/main-loop.c", line=252) at ../util/qemu-thread-posix.c:79
#4 0x0000560bf4d65403 in qemu_mutex_lock_iothread_impl (file=0x560bf56db84b "../util/main-loop.c", line=252) at ../softmmu/cpus.c:491
#5 0x0000560bf516faa5 in os_host_main_loop_wait (timeout=2367975) at ../util/main-loop.c:252
#6 0x0000560bf516fbb0 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:530
#7 0x0000560bf4ddc186 in qemu_main_loop () at ../softmmu/runstate.c:725
#8 0x0000560bf473ae42 in main (argc=63, argv=0x7ffc5920eba8, envp=0x7ffc5920eda8) at ../softmmu/main.c:50

and the second here:

#0 0x00007f1d0903cd8f in ppoll () at /lib64/libc.so.6
#1 0x0000560bf512e2d0 in qemu_poll_ns (fds=0x560bf70f12b0, nfds=5, timeout=350259000000) at ../util/qemu-timer.c:348
#2 0x0000560bf516fa8c in os_host_main_loop_wait (timeout=350259000000) at ../util/main-loop.c:249
#3 0x0000560bf516fbb0 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:530
#4 0x0000560bf4ddc186 in qemu_main_loop () at ../softmmu/runstate.c:725
#5 0x0000560bf473ae42 in main (argc=63, argv=0x7ffc5920eba8, envp=0x7ffc5920eda8) at ../softmmu/main.c:50

Eventually, the guest hangs at the second backtrace above.

Best regards,
Andrey

On 5/13/21 7:45 PM, Shashi Mallela wrote:
> Hi Andrey,
>
> To clarify, the patch series
>
> https://lists.gnu.org/archive/html/qemu-arm/2021-04/msg00944.html
> "GICv3 LPI and ITS feature implementation"
>
> is applicable to virt machine 6.1 onwards, i.e. the ITS TCG
> functionality is not available for version 6.0, which is being tried
> here.
>
> Thanks
> Shashi
>
> On May 13 2021, at 12:35 pm, Andrey Shinkevich
> <andrey.shinkev...@huawei.com> wrote:
>
> Dear colleagues,
>
> Thank you all very much for your responses. Let me reply with one
> message.
>
> I configured QEMU for an AArch64 guest:
> $ ./configure --target-list=aarch64-softmmu
>
> When I start QEMU with GICv3 on an x86 host:
> qemu-system-aarch64 -machine virt-6.0,accel=tcg,gic-version=3
>
> QEMU reports this error from hw/pci/msix.c:
> error_setg(errp, "MSI-X is not supported by interrupt controller");
>
> Probably, the variable 'msi_nonbroken' would be initialized in
> hw/intc/arm_gicv3_its_common.c:
> gicv3_its_init_mmio(..)
>
> I guess that it works with KVM acceleration only rather than with TCG.
>
> The error persists after applying the series:
> https://lists.gnu.org/archive/html/qemu-arm/2021-04/msg00944.html
> "GICv3 LPI and ITS feature implementation"
> (special thanks for referring me to that)
>
> Please advise how that error can be fixed. Should MSI-X support be
> implemented additionally for GICv3?
>
> Once that works, I would like to test QEMU with the maximum number of
> cores to get the best MTTCG performance.
> We will probably get only a modest percentage of performance
> improvement with the BQL series applied, won't we? I will test it as
> well.
>
> Best regards,
> Andrey Shinkevich
>
>
> On 5/12/21 6:43 PM, Alex Bennée wrote:
> >
> > Andrey Shinkevich <andrey.shinkev...@huawei.com> writes:
> >
> >> Dear colleagues,
> >>
> >> I am looking for ways to accelerate MTTCG for an ARM guest on an
> >> x86-64 host.
> >> The maximum number of CPUs for MTTCG that uses GICv2 is limited
> >> to 8:
> >>
> >> include/hw/intc/arm_gic_common.h:#define GIC_NCPU 8
> >>
> >> Version 3 of the Generic Interrupt Controller (GICv3) is not
> >> supported in QEMU for some reason unknown to me. It would allow
> >> increasing the CPU limit and accelerating MTTCG performance on a
> >> multi-core hypervisor.
> >
> > It is supported, you just need to select it.
> >
> >> I have an idea to implement the Interrupt Translation Service (ITS)
> >> for use by MTTCG for the ARM architecture.
> >
> > There is some work to support ITS under TCG already posted:
> >
> > Subject: [PATCH v3 0/8] GICv3 LPI and ITS feature implementation
> > Date: Thu, 29 Apr 2021 19:41:53 -0400
> > Message-Id: <20210429234201.125565-1-shashi.mall...@linaro.org>
> >
> > please do review and test.
> >
> >> Do you find that idea useful and feasible?
> >> If yes, how much time do you estimate such a project would take a
> >> single developer to complete?
> >> If not, what are the reasons for not implementing GICv3 for MTTCG
> >> in QEMU?
> >
> > As far as MTTCG performance is concerned, there is a degree of
> > diminishing returns to be expected, as the synchronisation cost
> > between threads will eventually outweigh the gains of additional
> > threads.
> >
> > There are a number of parts that could improve this performance. The
> > first would be picking up the BQL reduction series from your
> > FutureWei colleagues who worked on the problem when they were Linaro
> > assignees:
> >
> > Subject: [PATCH v2 0/7] accel/tcg: remove implied BQL from cpu_handle_interrupt/exception path
> > Date: Wed, 19 Aug 2020 14:28:49 -0400
> > Message-Id: <20200819182856.4893-1-robert.fo...@linaro.org>
> >
> > There was also a longer series moving towards per-CPU locks:
> >
> > Subject: [PATCH v10 00/73] per-CPU locks
> > Date: Wed, 17 Jun 2020 17:01:18 -0400
> > Message-Id: <20200617210231.4393-1-robert.fo...@linaro.org>
> >
> > I believe the initial measurements showed that the BQL cost started
> > to edge up with GIC interactions. We did discuss approaches for this,
> > and I think one idea was to use non-BQL locking for the GIC. You
> > would need to revert:
> >
> > Subject: [PATCH-for-5.2] exec: Remove MemoryRegion::global_locking field
> > Date: Thu, 6 Aug 2020 17:07:26 +0200
> > Message-Id: <20200806150726.962-1-phi...@redhat.com>
> >
> > and then implement more finely tuned locking in the GIC emulation
> > itself. However, I think the BQL and per-CPU locks are lower-hanging
> > fruit to tackle first.
> >
> >>
> >> Best regards,
> >> Andrey Shinkevich
> >
> >
>
> Sent from Mailspring
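To illustrate the finer-grained GIC locking Alex describes above: if the MemoryRegion::global_locking removal were reverted, a device model could take its MMIO region out from under the BQL and guard its state with its own mutex. The following is only a rough sketch under that assumption; the device, its register layout and the single coarse lock are invented for illustration, and the real GICv3 code would need far more care (lock ordering against the BQL, per-redistributor state, and so on):

#include "qemu/osdep.h"
#include "qemu/thread.h"
#include "exec/memory.h"

/* Hypothetical interrupt-controller state with its own lock. */
typedef struct DemoIntcState {
    MemoryRegion iomem;
    QemuMutex lock;          /* protects regs[] instead of the BQL */
    uint32_t regs[64];
} DemoIntcState;

static uint64_t demo_intc_read(void *opaque, hwaddr addr, unsigned size)
{
    DemoIntcState *s = opaque;
    uint64_t val;

    qemu_mutex_lock(&s->lock);   /* fine-grained lock, not the BQL */
    val = s->regs[addr >> 2];
    qemu_mutex_unlock(&s->lock);
    return val;
}

static void demo_intc_write(void *opaque, hwaddr addr, uint64_t data,
                            unsigned size)
{
    DemoIntcState *s = opaque;

    qemu_mutex_lock(&s->lock);
    s->regs[addr >> 2] = data;
    qemu_mutex_unlock(&s->lock);
}

static const MemoryRegionOps demo_intc_ops = {
    .read = demo_intc_read,
    .write = demo_intc_write,
    .endianness = DEVICE_NATIVE_ENDIAN,
};

static void demo_intc_init_mmio(DemoIntcState *s, Object *owner)
{
    qemu_mutex_init(&s->lock);
    memory_region_init_io(&s->iomem, owner, &demo_intc_ops, s,
                          "demo-intc", sizeof(s->regs));
    /*
     * Only available before the "exec: Remove MemoryRegion::global_locking
     * field" patch (or after reverting it): let MMIO dispatch for this
     * region run without taking the BQL.
     */
    memory_region_clear_global_locking(&s->iomem);
}

A single per-device mutex keeps the sketch simple; whether that is sufficient, or whether per-IRQ or per-CPU locks are needed, is exactly the tuning work Alex refers to, which is why the BQL-reduction and per-CPU-locks series look like the easier first steps.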