On Wed, Sep 7, 2022 at 10:40 AM Peter Maydell <peter.mayd...@linaro.org>
wrote:

> On Wed, 7 Sept 2022 at 16:39, Patrick Venture <vent...@google.com> wrote:
> >
> > # Start of nvme tests
> > # Start of pci-device tests
> > # Start of pci-device-tests tests
> > # starting QEMU: exec ./qemu-system-aarch64 -qtest
> > unix:/tmp/qtest-1431.sock -qtest-log /dev/null -chardev
> > socket,path=/tmp/qtest-1431.qmp,id=char0 -mon chardev=char0,mode=control
> > -display none -M virt, -cpu max -drive
> > id=drv0,if=none,file=null-co://,file.read-zeroes=on,format=raw -object
> > memory-backend-ram,id=pmr0,share=on,size=8 -device
> > nvme,addr=04.0,drive=drv0,serial=foo -accel qtest
> >
> > # ERROR:../../src/qemu/tests/qtest/libqtest.c:338:qtest_init_without_qmp_handshake:
> > assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
> > stderr:
> > double free or corruption (out)
> > socket_accept failed: Resource temporarily unavailable
> > **
> > ERROR:../../src/qemu/tests/qtest/libqtest.c:338:qtest_init_without_qmp_handshake:
> > assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
> > ../../src/qemu/tests/qtest/libqtest.c:165: kill_qemu() detected QEMU
> > death from signal 6 (Aborted) (core dumped)
> >
> > I'm not seeing this reliably, and we haven't done a lot of digging yet,
> > such as enabling sanitizers, so I'll reply back to this thread with details
> > as I have them.
> >
> > Has anyone seen this before or something like it?
>
> Have a look in the source at what exactly the assertion
> failure in libqtest.c is checking for -- IIRC it's a pretty
> basic "did we open a socket fd" one. I think sometimes I
> used to see something like this if there's an old stale socket
> lying around in the test directory and the randomly generated
> socket filename happens to clash with it.
>

Thanks for the debugging tip! I can't reproduce it anymore: I saw it two or
three times and now not at all, so more than likely it's exactly what
you're describing.
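
For anyone following along, here is a minimal Python sketch of that failure
mode. It is not libqtest's actual code, just an illustration of the general
behaviour: binding a Unix socket to a path where a leftover socket file still
exists fails with EADDRINUSE, so no usable listening fd comes out of it. The
helper names and the socket filename are made up for the example.

```python
import errno
import os
import socket
import tempfile


def try_bind(path):
    """Bind a Unix stream socket to path.

    Returns None on success, or the errno value if bind() fails.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        # Closing the socket does NOT remove the file it created on disk.
        sock.close()


def demo():
    """Show a clean bind, a clash with a stale file, and a bind after cleanup."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Hypothetical filename, chosen only for this illustration.
        path = os.path.join(tmpdir, "qtest-stale.sock")
        first = try_bind(path)   # succeeds and leaves the socket file behind
        second = try_bind(path)  # fails: the stale file is still there
        os.unlink(path)          # remove the stale file
        third = try_bind(path)   # succeeds again
        return first, second, third
```

The second bind is the interesting one: nothing is listening on the path
anymore, but the file alone is enough to make bind() fail.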


>
> Everything after that is probably follow-on errors from the
> tests not being terribly clean about error handling.
>
> Are you running 'make check' with a -j option for parallel?
> (This is supposed to work, and it's the standard way I run
> 'make check', so if it's flaky we need to fix it, but it
> would be interesting to know if the issue repros at -j1.)
>

It isn't reproducing reliably, and I haven't actually seen it since the first
few instances (it was also unrelated to the patches in flight), so I'll have
to hold off on further debugging until we hit it again. I'll let you know
when we do, but for now it's flaky enough that it's hard to catch.
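
If it does come back, a rough way to put a number on the flakiness would be
to rerun the failing test in a loop and count the failures. This is only a
sketch: count_failures is a made-up helper, and the command passed to it
would have to be whatever invocation actually triggers the failure, which I
haven't pinned down.

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run cmd (a list of argv strings) `runs` times.

    Returns how many of those runs exited with a non-zero status.
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Example with a stand-in command that always fails; replace it with the
# real test invocation when trying this out.
always_fails = [sys.executable, "-c", "import sys; sys.exit(1)"]
```

Calling count_failures(always_fails, 3) reports three failures; with the real
test command the count over a few hundred runs gives a rough failure rate.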


>
> -- PMM
>
