Bug#909417: [Pkg-libvirt-maintainers] Bug#909417: gtk-vnc: FTBFS randomly (vncconnectiontest fails with "assertion failed")
Hi,

On Thu, Mar 21, 2019 at 12:33:58AM +0100, Steinar H. Gunderson wrote:
> On Sat, Mar 09, 2019 at 07:26:52PM +0100, Santiago Vila wrote:
> > Whoever wants to reproduce this (and possibly debug it), *please*
> > contact me privately and I will gladly provide ssh access to a machine
> > where it happens very often.
>
> I've looked briefly into this.
>
> First, to reproduce this reliably, simply restrict it to one core
> (taskset -c 0 /build/gtk3/src/vncconnectiontest). The reason will be fairly
> clear from the text below.
>
> As far as I can see, this test simulates a broken VNC server (to test the
> client's robustness). It says it's got a 100x100 true color display, but then
> goes on and starts sending a color map.
>
> As soon as the client receives the information about the color map,
> it realizes something is wrong, and a race starts. Now the client wants to
> close the connection at the same time as the fake server wants to keep
> sending the cmap data. If you've got two cores, the server just keeps on
> sending data happily; by the time it's noticed that the client is gone,
> the test has passed and all is fine. But if you've only got one, then the
> first byte of the cmap causes a context switch over to the client, which then
> gets ample time to read the data and close the socket before the server gets
> to send the next byte. The server thus gets EPIPE, and test_send_u16()
> breaks.
>
> I believe the right fix is to send every byte after the first “set color map”
> byte using a non-asserting send. When we've done something invalid, we'd
> better be ready for sending data to fail.
>
> Please try the attached diff; it fixes the problem for me. I can NMU if the
> maintainers want.

I wouldn't mind an NMU.
 -- Guido
Bug#909417: gtk-vnc: FTBFS randomly (vncconnectiontest fails with "assertion failed")
On Sat, Mar 09, 2019 at 07:26:52PM +0100, Santiago Vila wrote:
> Whoever wants to reproduce this (and possibly debug it), *please*
> contact me privately and I will gladly provide ssh access to a machine
> where it happens very often.

I've looked briefly into this.

First, to reproduce this reliably, simply restrict it to one core
(taskset -c 0 /build/gtk3/src/vncconnectiontest). The reason will be fairly
clear from the text below.

As far as I can see, this test simulates a broken VNC server (to test the
client's robustness). It says it's got a 100x100 true color display, but then
goes on and starts sending a color map.

As soon as the client receives the information about the color map,
it realizes something is wrong, and a race starts. Now the client wants to
close the connection at the same time as the fake server wants to keep
sending the cmap data. If you've got two cores, the server just keeps on
sending data happily; by the time it's noticed that the client is gone,
the test has passed and all is fine. But if you've only got one, then the
first byte of the cmap causes a context switch over to the client, which then
gets ample time to read the data and close the socket before the server gets
to send the next byte. The server thus gets EPIPE, and test_send_u16()
breaks.

I believe the right fix is to send every byte after the first “set color map”
byte using a non-asserting send. When we've done something invalid, we'd
better be ready for sending data to fail.

Please try the attached diff; it fixes the problem for me. I can NMU if the
maintainers want.
/* Steinar */
-- 
Homepage: https://www.sesse.net/

Index: gtk-vnc-0.9.0/src/vncconnectiontest.c
===================================================================
--- gtk-vnc-0.9.0.orig/src/vncconnectiontest.c
+++ gtk-vnc-0.9.0/src/vncconnectiontest.c
@@ -56,12 +56,23 @@ static void test_send_u8(GOutputStream *
     g_assert(g_output_stream_write_all(os, &v, 1, NULL, NULL, NULL));
 }
 
+static void send_u8(GOutputStream *os, guint8 v)
+{
+    g_output_stream_write_all(os, &v, 1, NULL, NULL, NULL);
+}
+
 static void test_send_u16(GOutputStream *os, guint16 v)
 {
     v = GUINT16_TO_BE(v);
     g_assert(g_output_stream_write_all(os, &v, 2, NULL, NULL, NULL));
 }
 
+static void send_u16(GOutputStream *os, guint16 v)
+{
+    v = GUINT16_TO_BE(v);
+    g_output_stream_write_all(os, &v, 2, NULL, NULL, NULL);
+}
+
 static void test_send_u32(GOutputStream *os, guint32 v)
 {
     v = GUINT32_TO_BE(v);
@@ -429,18 +440,18 @@ static void test_unexpected_cmap_server(
     test_recv_u16(is, 100);
     test_recv_u16(is, 100);
 
-    /* set color map */
+    /* set color map -- after this, the client may close the connection at any time */
     test_send_u8(os, 1);
     /* pad */
-    test_send_u8(os, 0);
+    send_u8(os, 0);
     /* first color, ncolors */
-    test_send_u16(os, 0);
-    test_send_u16(os, 1);
+    send_u16(os, 0);
+    send_u16(os, 1);
 
     /* r,g,b */
-    test_send_u16(os, 128);
-    test_send_u16(os, 128);
-    test_send_u16(os, 128);
+    send_u16(os, 128);
+    send_u16(os, 128);
+    send_u16(os, 128);
 }
 
 
@@ -505,11 +516,13 @@ static void test_overflow_cmap_server(GI
     test_send_u16(os, 65535);
     test_send_u16(os, 2);
 
+    /* after this, the client may close the connection at any time */
+
     /* r,g,b */
     for (int i = 0 ; i < 2; i++) {
-        test_send_u16(os, i);
-        test_send_u16(os, i);
-        test_send_u16(os, i);
+        send_u16(os, i);
+        send_u16(os, i);
+        send_u16(os, i);
     }
 }
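For illustration, the failure mode described above (a send that fails with EPIPE once the peer has already closed its end) can be seen outside gtk-vnc with a minimal sketch. It uses plain POSIX sockets rather than GIO so that it stays self-contained, and it assumes Linux behaviour for AF_UNIX stream sockets. The names try_send_u16, send_while_peer_open and send_after_peer_closed are hypothetical and do not appear in the gtk-vnc sources; closing the "client" end before the send simply forces the ordering that the single-core scheduling tends to produce.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: best-effort send of a big-endian 16-bit value.
 * Returns 0 on success, or an errno value on failure. MSG_NOSIGNAL
 * keeps a closed peer from raising SIGPIPE, so the error comes back
 * as a return value instead of killing the process. */
static int try_send_u16(int fd, uint16_t v)
{
    uint16_t be = htons(v);
    ssize_t n = send(fd, &be, sizeof be, MSG_NOSIGNAL);

    if (n < 0)
        return errno;
    return n == (ssize_t)sizeof be ? 0 : EIO;
}

/* Two-core-like ordering: the peer is still open when we send. */
static int send_while_peer_open(void)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    int err = try_send_u16(sv[0], 128);
    close(sv[0]);
    close(sv[1]);
    return err;
}

/* Single-core-like ordering: the "client" end closes first, then the
 * "server" end tries to send its next field. */
static int send_after_peer_closed(void)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    close(sv[1]);                       /* client goes away first */
    int err = try_send_u16(sv[0], 128);
    close(sv[0]);
    return err;
}
```

A caller that wraps the send in an assertion aborts in the second case, which corresponds to the test_send_u16() failure in the build log. A caller that has deliberately sent invalid data can instead ignore the return value, which is the approach the diff above takes with send_u8() and send_u16().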
Bug#909417: gtk-vnc: FTBFS randomly (vncconnectiontest fails with "assertion failed")
On Sat, Mar 09, 2019 at 07:21:12PM +0100, Paul Gevers wrote:
> Hi Chris,
>
> On 09-03-2019 16:01, Chris Lamb wrote:
> >>> I'm reporting this as serious because it happens on the buildds:
> >>>
> >>> https://buildd.debian.org/status/fetch.php?pkg=gtk-vnc&arch=amd64&ver=0.6.0-3&stamp=1486745781&raw=0
> >
> > Curiously, I can't reproduce this (at the Cambridge BSP).
>
> The original report being about being random, how often did you try the
> build?

Whoever wants to reproduce this (and possibly debug it), *please* contact
me privately and I will gladly provide ssh access to a machine where it
happens very often.

Thanks.
Bug#909417: gtk-vnc: FTBFS randomly (vncconnectiontest fails with "assertion failed")
Hi Paul,

> > Curiously, I can't reproduce this (at the Cambridge BSP).
>
> The original report being about being random, how often did you try the
> build?

Four or five times? Not exhaustive, but...

Regards,

-- 
      ,''`.
     : :'  :     Chris Lamb
     `. `'`      la...@debian.org  chris-lamb.co.uk
       `-
Bug#909417: gtk-vnc: FTBFS randomly (vncconnectiontest fails with "assertion failed")
Hi Chris,

On 09-03-2019 16:01, Chris Lamb wrote:
>>> I'm reporting this as serious because it happens on the buildds:
>>>
>>> https://buildd.debian.org/status/fetch.php?pkg=gtk-vnc&arch=amd64&ver=0.6.0-3&stamp=1486745781&raw=0
>
> Curiously, I can't reproduce this (at the Cambridge BSP).

The original report being about being random, how often did you try the
build?

Paul
Bug#909417: gtk-vnc: FTBFS randomly (vncconnectiontest fails with "assertion failed")
Hi Paul et al.,

> > I'm reporting this as serious because it happens on the buildds:
> >
> > https://buildd.debian.org/status/fetch.php?pkg=gtk-vnc&arch=amd64&ver=0.6.0-3&stamp=1486745781&raw=0

Curiously, I can't reproduce this (at the Cambridge BSP).

Regards,

-- 
      ,''`.
     : :'  :     Chris Lamb
     `. `'`      la...@debian.org  chris-lamb.co.uk
       `-
Bug#909417: gtk-vnc: FTBFS randomly (vncconnectiontest fails with "assertion failed")
Hi Libvirt maintainers,

On Sun, 23 Sep 2018 10:35:52 + Santiago Vila wrote:
> I'm reporting this as serious because it happens on the buildds:
>
> https://buildd.debian.org/status/fetch.php?pkg=gtk-vnc&arch=amd64&ver=0.6.0-3&stamp=1486745781&raw=0

Any progress on this? We are nearing the full freeze.

Paul
Bug#909417: gtk-vnc: FTBFS randomly (vncconnectiontest fails with "assertion failed")
Package: src:gtk-vnc
Version: 0.6.0-3
Severity: serious
Tags: ftbfs

Dear maintainer:

I tried to build this package in buster but it failed:

[...]
 debian/rules build-arch
dh build-arch --with autoreconf
   dh_update_autotools_config -a
   dh_autoreconf -a
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, 'build-aux'.
libtoolize: copying file 'build-aux/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
libtoolize: copying file 'm4/libtool.m4'
libtoolize: copying file 'm4/ltoptions.m4'
libtoolize: copying file 'm4/ltsugar.m4'
libtoolize: copying file 'm4/ltversion.m4'
libtoolize: copying file 'm4/lt~obsolete.m4'
configure.ac:14: installing 'build-aux/compile'
configure.ac:11: installing 'build-aux/missing'

[... snipped ...]

# FAIL: 1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: vncconnectiontest
=======================

/conn/validation/rre: OK
/conn/validation/copyrect: OK
/conn/validation/hextile: OK
/conn/validation/unexpectedcmap: **
ERROR:../../../src/vncconnectiontest.c:62:test_send_u16: assertion failed: (g_output_stream_write_all(os, &v, 2, NULL, NULL, NULL))
FAIL vncconnectiontest (exit status: 134)

Testsuite summary for gtk-vnc 0.9.0
# TOTAL: 1
# PASS: 0
# SKIP: 0
# XFAIL: 0
# FAIL: 1
# XPASS: 0
# ERROR: 0
See src/test-suite.log
Please report to https://gitlab.gnome.org/GNOME/gtk-vnc/issues
make[6]: *** [Makefile:1697: test-suite.log] Error 1
make[6]: Leaving directory '/<<PKGBUILDDIR>>/build/gtk2/src'
make[5]: *** [Makefile:1805: check-TESTS] Error 2
make[5]: Leaving directory '/<<PKGBUILDDIR>>/build/gtk2/src'
make[4]: *** [Makefile:1878: check-am] Error 2
make[4]: Leaving directory '/<<PKGBUILDDIR>>/build/gtk2/src'
make[3]: *** [Makefile:1880: check] Error 2
make[3]: Leaving directory '/<<PKGBUILDDIR>>/build/gtk2/src'
make[2]: *** [Makefile:624: check-recursive] Error 1
make[2]: Leaving directory '/<<PKGBUILDDIR>>/build/gtk2'
dh_auto_test: cd build/gtk2 && make -j1 check VERBOSE=1 returned exit code 2
make[1]: *** [debian/rules:33: override_dh_auto_test] Error 2
make[1]: Leaving directory '/<<PKGBUILDDIR>>'
make: *** [debian/rules:6: build-arch] Error 2
dpkg-buildpackage: error: debian/rules build-arch subprocess returned exit status 2

I'm reporting this as serious because it happens on the buildds:

https://buildd.debian.org/status/fetch.php?pkg=gtk-vnc&arch=amd64&ver=0.6.0-3&stamp=1486745781&raw=0

The above error was from stretch, but it's the same error I get on buster
and sid, which means the bug has not been fixed yet.

If you could not reproduce this on a single-CPU machine using sbuild (as I do),
where this failure seems to be particularly easy to trigger, please say so
and I will gladly offer ssh access to a machine where this happens
(contact me privately for details).

Thanks.