Here is a log from the CI, but I don't think it has much information: https://gitlab.com/qemu-project/qemu/-/jobs/5020899550
Is it possible to detect the crash? Timeouts are hard to diagnose, so it
would be better for the test to detect a terminated child process and
print an error.

Stefan

On Tue, 12 Sept 2023 at 10:08, Laurent Vivier <lviv...@redhat.com> wrote:
>
> On 9/12/23 15:42, Daniel P. Berrangé wrote:
> > On Tue, Sep 12, 2023 at 09:33:10AM -0400, Stefan Hajnoczi wrote:
> >> The test still fails intermittently with a 60 second timeout in the
> >> GitLab CI environment. Raise the timeout to 120 seconds.
> >>
> >> 576/839 ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix:
> >> assertion failed (resp == expect): ("st0: index=0,type=stream,connection
> >> error\r\n" == "st0:
> >> index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") ERROR
> >> 576/839 qemu:qtest+qtest-sh4 / qtest-sh4/netdev-socket
> >> ERROR 62.85s killed by signal 6 SIGABRT
> >> >>> MALLOC_PERTURB_=249 QTEST_QEMU_BINARY=./qemu-system-sh4
> >> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
> >> G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh
> >> QTEST_QEMU_IMG=./qemu-img
> >> /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/netdev-socket
> >> --tap -k
> >> ――――――――――――――――――――――――――――――――――――― ✀
> >> ―――――――――――――――――――――――――――――――――――――
> >> stderr:
> >> **
> >> ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion
> >> failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n"
> >> == "st0:
> >> index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n")
> >> (test program exited with status code -6)
> >>
> >> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1881
> >> Fixes: 417296c8d858 ("tests/qtest/netdev-socket: Raise connection timeout
> >> to 60 seconds")
> >> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> >
> > That bumped the timeout from 5 seconds to 60 seconds to
> > cope with intermittent failures, which was a 12x
> > increase. I'm concerned that it would still be failing
> > in largely the same way after that, and possibly we are
> > instead hitting a race condition causing setup to fail,
> > which masquerades as a timeout.
> >
> >> ---
> >> tests/qtest/netdev-socket.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c
> >> index 8eed54801f..b2501d72a1 100644
> >> --- a/tests/qtest/netdev-socket.c
> >> +++ b/tests/qtest/netdev-socket.c
> >> @@ -16,7 +16,7 @@
> >> #include "qapi/qobject-input-visitor.h"
> >> #include "qapi/qapi-visit-sockets.h"
> >>
> >> -#define CONNECTION_TIMEOUT 60
> >> +#define CONNECTION_TIMEOUT 120
> >>
> >> #define EXPECT_STATE(q, e, t) \
> >> do { \
> >
> > I'll add
> >
> > Reviewed-by: Daniel P. Berrangé <berra...@redhat.com>
> >
> > but with the caveat that I'm only 50/50 on whether this is actually
> > the right fix. Doesn't hurt to try it, but if 120 seconds still shows
> > failures I'd say we're hitting a functional race not a timeout.
>
> It can also happen if the first QEMU (server) crashes. Do we have some traces
> from this side?
>
> Reviewed-by: Laurent Vivier <lviv...@redhat.com>
>
> Thanks,
> Laurent
>
>
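
To illustrate the suggestion at the top of this message (detect a terminated child instead of waiting out the timeout), here is a minimal sketch in plain POSIX C. It does not use any libqtest API: `child_terminated()`, `wait_for_state()`, `never_ready()` and the `WAIT_*` values are hypothetical names invented for this example, and `never_ready()` only stands in for the real state check the test performs. Note that `waitpid()` reaps the child, so a real integration would have to coordinate with whatever code normally waits for the QEMU process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define POLL_MS 100

enum { WAIT_OK = 0, WAIT_TIMEOUT = -1, WAIT_CHILD_DIED = -2 };

/* Hypothetical helper: true if the child has terminated; 'why' says how. */
static bool child_terminated(pid_t pid, char *why, size_t len)
{
    int status;
    pid_t ret = waitpid(pid, &status, WNOHANG);  /* non-blocking check */

    assert(why != NULL && len > 0);
    if (ret == 0) {
        return false;                            /* still running */
    }
    if (ret < 0) {
        snprintf(why, len, "waitpid failed");
    } else if (WIFSIGNALED(status)) {
        snprintf(why, len, "child killed by signal %d", WTERMSIG(status));
    } else if (WIFEXITED(status)) {
        snprintf(why, len, "child exited with status %d", WEXITSTATUS(status));
    } else {
        snprintf(why, len, "child changed state unexpectedly");
    }
    return true;
}

/* Stand-in for the real state check; it never reports success. */
static bool never_ready(void *opaque)
{
    (void)opaque;
    return false;
}

/*
 * Poll cond() every POLL_MS milliseconds. Stop early if the child dies,
 * so that a crash is reported as a crash rather than as a timeout.
 */
static int wait_for_state(pid_t pid, bool (*cond)(void *), void *opaque,
                          int timeout_ms, char *why, size_t len)
{
    struct timespec delay = { .tv_sec = 0, .tv_nsec = POLL_MS * 1000000L };

    for (int elapsed = 0; ; elapsed += POLL_MS) {
        if (cond(opaque)) {
            return WAIT_OK;
        }
        if (child_terminated(pid, why, len)) {
            return WAIT_CHILD_DIED;
        }
        if (elapsed >= timeout_ms) {
            snprintf(why, len, "timed out after %d ms", timeout_ms);
            return WAIT_TIMEOUT;
        }
        nanosleep(&delay, NULL);
    }
}
```

A caller would pass the PID of the server QEMU and print `why` on any non-zero return, which separates "server crashed" from "connection never came up" in the CI log.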