On Wed, Jun 21, 2023 at 09:38:08PM +0200, Juan Quintela wrote:
> Peter Xu <pet...@redhat.com> wrote:
> > On Fri, Jun 09, 2023 at 12:49:13AM +0200, Juan Quintela wrote:
> >> It failed on aarch64 tcg, lets see if that is still the case.
> >>
> >> Signed-off-by: Juan Quintela <quint...@redhat.com>
> >
> > According to the history:
> >
> > https://lore.kernel.org/all/20190305180635.GA3803@work-vm/
> >
> > It's never enabled, and not sure whether Yury followed it up.  Juan: have
> > you tried it out on aarch64 before enabling it again?  I assume we rely on
> > the previous patch but that doesn't even sound like aarch64 specific.  I
> > worry it'll just keep failing on aarch64.
>
> Hi
>
> I am resending this series.
>
> I hard tested this time.  x86_64 host.
> Two build directories:
> - x86_64 (I just build qemu-system-x86_64, kvm)
> - aarch64 (I just build qemu-system-aarch64, tcg)
>
> Everything is run as:
>
> while true; do $command || break; done
>
> And run this:
> - x86_64:
>   * make check (nit: you can't run two make checks on the same
>     directory)
>   * 4 ./test/qtest/migration-test
>   * 2 ./test/qtest/migration-test -p ./tests/qtest/migration-test -p
>     /x86_64/migration/multifd/tcp/plain/cancel
>   * 2 ./test/qtest/migration-test -p ./tests/qtest/migration-test -p
>     /x86_64/migration/ignore_shared
>
> - aarch64:
>   The same with s/x86_64/aarch64/
>
> And left it running for 6 hours.  No errors.
> Machine has enough RAM for running this (128GB) and 18 cores (intel
> i9900K).
> Load of the machine while running this tests is around 50 (I really hope
> that our CI hosts have less load).
>
> A run master with the same configuration.  In less than 10 minutes I get
> the dreaded:
>
> # starting QEMU: exec ./qemu-system-aarch64 -qtest
> unix:/tmp/qtest-3264370.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-3264370.qmp,id=char0 -mon chardev=char0,mode=control
> -display none -accel kvm -accel tcg -machine virt,gic-version=max -name
> target,debug-threads=on -m 150M -serial
> file:/tmp/migration-test-1A1461/dest_serial -incoming defer -cpu max -kernel
> /tmp/migration-test-1A1461/bootsect    -accel qtest
> Broken pipe
> ../../../../../mnt/code/qemu/multifd/tests/qtest/libqtest.c:195: kill_qemu()
> detected QEMU death from signal 6 (Aborted) (core dumped)
> Aborted (core dumped)
> $
>
> On multifd+cancel.
>
> I have no been able to ever get ignore_shared to fail on my machine.
> But I didn't tested aarch64 TCG in the past so hard, and in x86_64 it
> has always worked for me.

Thanks a lot, Juan.  Do you mean master is broken with
QEMU_TEST_FLAKY_TESTS=1?  And with the whole series applied, we cannot
trigger the issue in the few-hour test even with it set?

Shall we wait another 1-2 days to see whether Yury comments (before you
repost)?  Otherwise I agree: if it survives your few-hour test, we should
give it a try.  At least according to Dave's comment it was failing easily
before, but it is not failing now on the test bed.  Maybe the bug is still
there and just hidden, but in that case I also agree that enabling it in
the repo is the simplest way to reproduce the failure again, if we ever
want to enable it one day.

-- 
Peter Xu