Bug#512366: eboard hangs up and use 100%CPU on PowerPC
Patrik Fimml once wrote:

> Okay, I think I found the real culprit. In various places, network.cc
> limits output to non-control characters by comparing with 0x20. On x86,
> char is signed and 0xFF will be (-1), thus being treated as a control
> character. On ppc, chars are unsigned AFAIK, and 0xFF will be (255), and
> passed on to the rest of the code.

Ah, I should have thought about it. Yes, I confirm that char is unsigned
on PowerPC, because handling signed chars takes a few more instructions
on PowerPC than unsigned ones (though the difference shouldn't be
measurable in our case).

It means that other architectures are also broken and will need to be
recompiled (ARM comes to mind).

> 154 if (buffer.front()>=32)
> 219 if (c>=0x20)
> 294 if (c>=0x20)
>
> My fix for the time being would be to duplicate behaviour as on x86,
> using signed chars everywhere (as I suspect that other bugs might arise
> otherwise).
>
> A patch to the source package is attached, would you please try if that
> fixes the problem?

I think you should simply cast "c" to a signed byte, just to make it more
obvious how silly this hack is :o)

More seriously, it seems to work now, thank you.

The only weird thing I noticed is when playing offline against a
computer: sometimes the computer played illegal moves (at least that is
what eboard claimed before letting me play twice in a row), and sometimes
the computer was allowed to play two moves in a row. Judging by the
quality of the code we examined, these are probably just other unrelated
bugs that also affect x86 systems.

Also, I saw many very bad mistakes in the French translation, including
one case for Bughouse where "partner" was translated as "opponent", one
big grammar error in the submenu, and many typos. Since you are going to
upload a new version anyway, I think it would be important to take the
opportunity to improve the translation as well.

I have completed the translation of every string in the .po file except
for "Helper program not found", which I am still not sure how to
translate into French. I'll ask for a review on the Debian French
translation mailing list, and send the new .po file directly to you (with
a copy for the current French translator of eboard). If you don't want to
wait, I'll send you right away what I have, which can't be worse than the
previous translation.

I also spotted an error in the Japanese version.

Simon Valiquette
Bug#512366: eboard hangs up and use 100%CPU on PowerPC
On Thu, Jan 22, 2009 at 02:16:18PM -0500, Simon Valiquette wrote:
> By the way, I noticed that the compiler warned that eboard was linked
> with a number of apparently unnecessary libraries like libpango. For
> Lenny, it is best to keep it like it is now, but it would be a good idea
> to look at it for Lenny+1.

Pango is part of GTK+ and automatically pulled in by pkg-config.

Okay, I think I found the real culprit. In various places, network.cc
limits output to non-control characters by comparing with 0x20. On x86,
char is signed and 0xFF will be (-1), thus being treated as a control
character. On ppc, chars are unsigned AFAIK, and 0xFF will be (255), and
passed on to the rest of the code.

154 if (buffer.front()>=32)
219 if (c>=0x20)
294 if (c>=0x20)

My fix for the time being would be to duplicate behaviour as on x86,
using signed chars everywhere (as I suspect that other bugs might arise
otherwise).

A patch to the source package is attached, would you please try if that
fixes the problem?

Kind regards,
Patrik

--- eboard-1.1.1/debian/rules~ 2009-01-22 21:14:25.0 +0100
+++ eboard-1.1.1/debian/rules 2009-01-22 21:15:05.0 +0100
@@ -8,7 +8,8 @@
 configure: configure-stamp
 configure-stamp: patch-stamp
 	dh_testdir
-	./configure --prefix=/usr --data-prefix=/usr/share/games --man-prefix=/usr/share/man
+	./configure --prefix=/usr --data-prefix=/usr/share/games \
+		--man-prefix=/usr/share/man --extra-flags=-fsigned-char
 	touch configure-stamp
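The signedness issue described above can be reproduced in isolation. The following is a minimal sketch, not code from eboard: the function names are hypothetical, and the second function is only one possible way to get the x86 result on every platform without relying on -fsigned-char.

```cpp
#include <cassert>
#include <type_traits>

// Whether plain `char` is signed is decided by the platform ABI:
// true on x86 Linux, false on PowerPC and ARM Linux with default GCC flags.
constexpr bool kPlainCharIsSigned = std::is_signed<char>::value;

// Same shape as the checks quoted from network.cc (hypothetical name).
// With signed char, byte 0xFF is -1 and fails the test; with unsigned
// char it is 255 and passes.
bool passes_plain_check(char c) { return c >= 0x20; }

// Hypothetical portable variant: look at the byte as unsigned and state
// the upper bound explicitly, which gives the x86 result everywhere.
bool passes_portable_check(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    return u >= 0x20 && u < 0x80;
}

// Small self-check; the plain variant's answer for 0xFF depends on the ABI.
void self_check_signedness() {
    const char high = static_cast<char>(0xFF);
    assert(passes_plain_check(high) == !kPlainCharIsSigned);
    assert(!passes_portable_check(high));
    assert(passes_portable_check('A'));
    assert(!passes_portable_check('\n'));
}
```

Building with -fsigned-char, as the patch does, makes the first function behave like the second for bytes 0x80 to 0xFF without touching the source.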
Bug#512366: eboard hangs up and use 100%CPU on PowerPC
Patrik Fimml once wrote:

>> Okay, I found out that it was looping inside NText::formatLine exactly
>> between the following lines in the file ntext.cc:
>>
>> 320 while(k-j > 0) {
>> 321   fit = false;
>> 322
>> 323   // try full-fit for for unwrapped of last chunk of wrapping
>> 324
>> 325   if (j==0 && sl->Width >= 0) {
>> 326     w = sl->Width;
>> 327   } else {
>> 328     if (!g_utf8_validate(tp+j,k-j,NULL)) continue;
>>
>> Obviously, "g_utf8_validate()" always returns false, so the execution
>> flow always moves back to the start of the "while" loop.
>>
>> Here is the backtrace in case you would still want to see it. On Etch,
>> there were also 2 strange (but innocent-looking) characters appearing
>> just after login. I can't remember for sure if it was "ÿû" exactly, but
>> I do remember that it was some accented letters.
>
> Do you know where they appeared? After your login name, or after the
> "password:" prompt, or anywhere in between?

They didn't appear until I pressed enter for the password. They appeared
on a separate line, probably just after the password line, but I am not
100% sure of the exact line.

> My guess is that the characters used to make a terminal not display its
> input (i.e. if you were entering your password over a telnet FICS
> session) are upsetting eboard.

That would make sense, since on an x86 computer the characters are
replaced by big round dots in the GUI and there is no output in the text
console.

> Actually, the guide I linked has got it wrong, you need to set
> DEB_BUILD_OPTIONS="nostrip noopt", separated by a space, not a comma.
> Please re-compile eboard without optimizations for further debugging,
> and get a detailed backtrace (bt full).

Since my default shell is tcsh, the syntax for me is:

  setenv DEB_BUILD_OPTIONS 'nostrip noopt'

I am a little surprised that it did not strip the symbols but just
ignored the rest of the line without complaining (or I missed it).

> Please dump the raw byte contents of the trouble-causing buffer.
> Assuming you break in g_utf8_validate again, you could do:
>
> (gdb) p /x *...@10

That gives me this, with 0xff and 0xfc being ÿ and ü:

  {0xff, 0xfc, 0x0, 0xd8, 0x10, 0x25, 0x99, 0x10, 0x10, 0x39}

I also give you a little more. I guess that sl->Width=-1 means that the
length is unknown?

(gdb) p /x *sl
$13 = {_vptr.NLine = 0x10120b18, Text = 0x10390bd0, NBytes = 0x2,
  Color = 0xff, Width = 0x, Timestamp = 0x4978ab46}
(gdb) p /x *fl
$15 = { = {_vptr.NLine = 0xbfd23950, Text = 0x100a4064,
    NBytes = 0x101da800, Color = 0x28, Width = 0xbfd23980,
    Timestamp = 0x101daa18}, Src = 0xfb3b810, Off = 0x0, X = 0xbfd23970,
  Y = 0x100a4124, H = 0x101da808, valid = 0xbf,
  stamp = {0xd2, 0x39, 0x80, 0x10, 0x1c, 0xc, 0x28, 0x10, 0x1d, 0xa8, 0x0, 0xf}}

And here is the full backtrace before entering the function.

328 if (!g_utf8_validate(tp+j,k-j,NULL)) continue;
(gdb) bt full
#0  NText::formatLine (this=0x101da800, i=40) at ntext.cc:328
        j = 0
        k = 2
        l = -1076741840
        w = -1076741856
        color = 16777215
        fit = false
        sl = (NLine *) 0x10395c68
        fl = (FLine *) 0xbfd23930
        tp = 0x10390bd0 "ÿü"
        elw = 705
#1  0x100a3678 in NText::append (this=0x101da800, text=0xbfd23f34 "ÿü",
    len=2, color=16777215) at ntext.cc:262
        i = -1076741760
        nl = (NLine *) 0x10395c68
        p = 0x0
#2  0x100f60d4 in Text::append (this=0x101da800, msg=0xbfd23f34 "ÿü",
    color=16777215, imp=IM_NORMAL) at text.cc:132
No locals.
#3  0x100f3728 in TextSet::append (this=0x101d9ca8, msg=0xbfd23f34 "ÿü",
    color=16777215, imp=IM_NORMAL) at text.cc:181
        ti = {_M_node = 0x101df748}
#4  0x100c4f34 in FicsProtocol::doOutput (this=0x10259880,
    msg=0xbfd23f34 "ÿü", channel=-1, personal=false, msgcolor=16777215)
    at proto_fics.cc:824
No locals.
#5  0x100c767c in FicsProtocol::parser2 (this=0x10259880,
    T=0xbfd23f34 "ÿü") at proto_fics.cc:346
No locals.
#6  0x100c79fc in FicsProtocol::parser1 (this=0x10259880,
    T=0xbfd23f34 "ÿü") at proto_fics.cc:312
        pstring = "¿Ò>0\000\000\000\002¿Ò\020\035º\220¿Ò;0\017Ã\001\204\000\001f\000\000\000\000\000\017Ã\001°¿Ò;x\017úÐL\000\000\000\000¿Ò;P\017¾&L\000\000\020\000\020\0301°¿Ò;x\017Ã\001°\017úë \020\0301°¿Ò;p\017Ã\021\200\0206\020P\000\000V\000\020\030é\200\020\035¹ð\017úë \020\0301°¿Ò; \017Ã2\220\020\0301°\020\035¹ð¿Ò; ¿Ò>0\017\214\200\\¿Ò>0¿Ò; ¿Ò>0\017û&p\017Ã1À"...

Here, I wondered whether the characters I previously saw in Etch were
"úë" instead of "ÿü", but maybe they have always been "ÿü".

#7  0x100c7a54 in FicsProtocol::receiveString (this=0x10259880,
    netstring=0xbfd23f34 "ÿü") at proto_fics.cc:247
No locals.
#8  0x10088020 in MainWindow::readAvailable (this=0x10187470, handle=7)
    at mainwindow.cc:1687
        net = (class NetConnection *) 0x1035e718
        line = "ÿü", '\0'
        gotinput = 1
        loopc = 0
#9  0x1008ee30 in netconn_read_notify (data=0x1035e8a0, source=7, cond=GDK_INPUT_READ
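The thread does not identify the two bytes beyond the guess that they are the characters used to suppress password echo. Their values do match Telnet command bytes as defined in RFC 854: 0xFF is IAC, 0xFB is WILL and 0xFC is WONT. Assuming that reading is right, a client could drop such sequences before the text reaches the display code. The sketch below is an illustration only; `strip_telnet_negotiation` is a hypothetical helper, not an eboard function, and it deliberately ignores subnegotiation (IAC SB ... IAC SE).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Telnet command bytes from RFC 854.
constexpr unsigned char kIAC  = 0xFF;
constexpr unsigned char kDONT = 0xFE;
constexpr unsigned char kWILL = 0xFB;  // WILL=0xFB, WONT=0xFC, DO=0xFD, DONT=0xFE

// Hypothetical helper: returns `in` with Telnet commands removed.
// IAC WILL/WONT/DO/DONT <option> is three bytes, IAC IAC is an escaped
// literal 0xFF, any other IAC <cmd> is treated as a two-byte command.
std::string strip_telnet_negotiation(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        if (b != kIAC) {
            out.push_back(in[i]);
            continue;
        }
        if (i + 1 >= in.size()) break;  // lone IAC at the end: drop it
        const unsigned char cmd = static_cast<unsigned char>(in[i + 1]);
        if (cmd == kIAC) {
            out.push_back(in[i]);       // escaped 0xFF data byte
            i += 1;
        } else if (cmd >= kWILL && cmd <= kDONT) {
            i += 2;                     // skip command and option byte
        } else {
            i += 1;                     // skip two-byte command
        }
    }
    return out;
}

void demo_strip() {
    // "ÿü" as in the gdb dump: IAC WONT with no option byte left.
    assert(strip_telnet_negotiation(std::string("\xFF\xFC")).empty());
    // IAC WILL ECHO embedded in ordinary text.
    assert(strip_telnet_negotiation(std::string("a\xFF\xFB\x01" "b")) == "ab");
}
```

Filtering at the protocol level would avoid depending on how plain char compares with 0x20, at the cost of more code than the compiler flag.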
Bug#512366: eboard hangs up and use 100%CPU on PowerPC
>>> If you need any more information or want me to try a patch, just ask
>>> me.
>>
>> Please try to get a backtrace with debugging symbols after the freeze
>> occurs and you hit CTRL-C.
>
> Okay, I found out that it was looping inside NText::formatLine exactly
> between the following lines in the file ntext.cc:
>
> 320 while(k-j > 0) {
> 321   fit = false;
> 322
> 323   // try full-fit for for unwrapped of last chunk of wrapping
> 324
> 325   if (j==0 && sl->Width >= 0) {
> 326     w = sl->Width;
> 327   } else {
> 328     if (!g_utf8_validate(tp+j,k-j,NULL)) continue;
>
> Obviously, "g_utf8_validate()" always returns false, so the execution
> flow always moves back to the start of the "while" loop.
>
> Is it just me being very tired, or will it always test exactly the
> same string?

Yes, it will.

> The only sane explanation that comes to my mind is that on x86, unless
> there is something wrong, the data is almost always valid Unicode on the
> first try (which seems reasonable).
>
> It looks like a cut&paste error because just after, there is a "for"
> loop with almost exactly the same code, except that the code makes more
> sense and doesn't always test the same data.
>
> It would also mean that on big-endian systems, there is maybe another
> bug somewhere that made this bug show up.

Yeah. The code is weird enough, but we shouldn't even get that far.

> Here is the backtrace in case you would still want to see it. On Etch,
> there were also 2 strange (but innocent-looking) characters appearing
> just after login. I can't remember for sure if it was "ÿû" exactly, but
> I do remember that it was some accented letters.

Do you know where they appeared? After your login name, or after the
"password:" prompt, or anywhere in between?

My guess is that the characters used to make a terminal not display its
input (i.e. if you were entering your password over a telnet FICS
session) are upsetting eboard.

Actually, the guide I linked has got it wrong, you need to set
DEB_BUILD_OPTIONS="nostrip noopt", separated by a space, not a comma.
Please re-compile eboard without optimizations for further debugging, and
get a detailed backtrace (bt full).

Please dump the raw byte contents of the trouble-causing buffer. Assuming
you break in g_utf8_validate again, you could do:

(gdb) p /x *...@10

Besides, a dump of the network traffic would probably be helpful. With
netcat, you could do

$ nc -l -p 5000 -c 'nc -o eboard-tcp-log freechess.org 5000'

and let eboard connect to localhost:5000 to get a hex dump in
eboard-tcp-log.

Kind regards,
Patrik
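The hang discussed above comes from a `continue` that leaves both `j` and `k` unchanged, so the same bytes are validated forever. The sketch below shows one way to guarantee progress on every iteration. It is not the eboard fix (the thread settled on -fsigned-char so that the bad bytes never reach this code). All names are hypothetical, and `valid_prefix_len` is a stand-in that accepts only 7-bit ASCII; it plays the role that the `end` out-parameter of g_utf8_validate() could play in real code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a UTF-8 validator: returns how many leading
// bytes of [p, p+n) are acceptable. Here "acceptable" means 7-bit ASCII.
std::size_t valid_prefix_len(const char* p, std::size_t n) {
    std::size_t i = 0;
    while (i < n && static_cast<unsigned char>(p[i]) < 0x80) ++i;
    return i;
}

struct FormatResult {
    std::size_t laid_out;  // bytes that would be handed to the layout code
    std::size_t skipped;   // invalid bytes that were dropped
};

// Every path through the loop body advances j, so the loop terminates
// after at most text.size() iterations.
FormatResult format_line_sketch(const std::string& text) {
    FormatResult r{0, 0};
    std::size_t j = 0;
    const std::size_t k = text.size();
    while (j < k) {
        const std::size_t ok = valid_prefix_len(text.data() + j, k - j);
        if (ok == 0) {
            ++j;          // skip one bad byte instead of retrying it
            ++r.skipped;
            continue;
        }
        r.laid_out += ok; // real code would measure and wrap this chunk
        j += ok;
    }
    return r;
}

void demo_format_line() {
    // The two bytes from the backtrace: both are skipped, the loop ends.
    const FormatResult r = format_line_sketch(std::string("\xFF\xFC"));
    assert(r.laid_out == 0 && r.skipped == 2);
}
```

Whether invalid bytes should be skipped, replaced, or rejected outright is a design decision the thread leaves open; the only point here is that the loop state must change before `continue`.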
Bug#512366: eboard hangs up and use 100%CPU on PowerPC
Patrik Fimml once wrote:

>> More exactly, it starts as usual, but when connecting to a chess
>> server, it hangs and consumes 100% CPU once the user has entered a
>> user name.
>> [...]
>> If I press 'return' here, the screen draws an additional '>' on the
>> next line, and the window hangs (it cannot be closed except by
>> pressing CTRL-C twice in a console or by killing it).
>
> Are you using a custom login script (~/.eboard/scripts/autofics.pl)
> that does weird stuff, maybe? If yes, please try without the script.

No. Also, before sending the bug report, I uninstalled and purged the
package, renamed the ~/.eboard/ folder, reinstalled eboard and tested it
again just to make sure it was not something like that.

>> If you need any more information or want me to try a patch, just ask
>> me.
>
> Please try to get a backtrace with debugging symbols after the freeze
> occurs and you hit CTRL-C.

Okay, I found out that it was looping inside NText::formatLine exactly
between the following lines in the file ntext.cc:

320 while(k-j > 0) {
321   fit = false;
322
323   // try full-fit for for unwrapped of last chunk of wrapping
324
325   if (j==0 && sl->Width >= 0) {
326     w = sl->Width;
327   } else {
328     if (!g_utf8_validate(tp+j,k-j,NULL)) continue;

Obviously, "g_utf8_validate()" always returns false, so the execution
flow always moves back to the start of the "while" loop.

Is it just me being very tired, or will it always test exactly the same
string?

The only sane explanation that comes to my mind is that on x86, unless
there is something wrong, the data is almost always valid Unicode on the
first try (which seems reasonable).

It looks like a cut&paste error because just after, there is a "for"
loop with almost exactly the same code, except that the code makes more
sense and doesn't always test the same data.

It would also mean that on big-endian systems, there is maybe another bug
somewhere that made this bug show up.

Here is the backtrace in case you would still want to see it. On Etch,
there were also 2 strange (but innocent-looking) characters appearing
just after login. I can't remember for sure if it was "ÿû" exactly, but I
do remember that it was some accented letters.

(gdb) bt
#0  0x0f7ab194 in IA__g_utf8_validate (str=0x102b4048 "ÿû", max_len=2, end=0x0)
    at /build/buildd/glib2.0-2.16.6/glib/gutf8.c:1754
#1  0x1005a4c4 in NText::formatLine (this=0x1016ca10, i=34) at ntext.cc:328
#2  0x1005c4b0 in NText::append (this=0x1016ca10, text=0xbff95f48 "ÿû",
    len=2, color=16777215) at ntext.cc:262
#3  0x1008d15c in Text::append (this=0x1016ca10, msg=0xbff95f48 "ÿû",
    color=16777215, imp=IM_NORMAL) at text.cc:132
#4  0x1008bc64 in TextSet::append (this=, msg=0xbff95f48 "ÿû",
    color=16777215, imp=IM_NORMAL) at text.cc:181
#5  0x1006c78c in FicsProtocol::doOutput (this=, msg=0xbff95f48 "ÿû",
    channel=, personal=, msgcolor=) at proto_fics.cc:824
#6  0x10073aa0 in FicsProtocol::parser1 (this=0x1025b3c0, T=0xbff95f48 "ÿû")
    at proto_fics.cc:312
#7  0x100493b8 in MainWindow::readAvailable (this=, handle=)
    at mainwindow.cc:1694
#8  0x1004fce8 in netconn_read_notify (data=, source=, cond=)
    at network.cc:133
#9  0x0faa351c in gdk_io_invoke (source=, condition=, data=)
    at /build/buildd/gtk+2.0-2.12.11/gdk/gdkevents.c:986
#10 0x0f7b2e68 in g_io_unix_dispatch (source=0x102b0df8, callback=0xfaa34a0 ,
    user_data=0x101c88f8) at /build/buildd/glib2.0-2.16.6/glib/giounix.c:162
#11 0x0f76c72c in IA__g_main_context_dispatch (context=0x1010d330)
    at /build/buildd/glib2.0-2.16.6/glib/gmain.c:2012
#12 0x0f770ec8 in g_main_context_iterate (context=0x1010d330, block=1,
    dispatch=1, self=) at /build/buildd/glib2.0-2.16.6/glib/gmain.c:2645
#13 0x0f771604 in IA__g_main_loop_run (loop=0x10213440)
    at /build/buildd/glib2.0-2.16.6/glib/gmain.c:2853
#14 0x0fce9bc4 in IA__gtk_main ()
    at /build/buildd/gtk+2.0-2.12.11/gtk/gtkmain.c:1163
#15 0x10045d44 in main (argc=1, argv=0xbff96be4) at main.cc:108

> Can anybody else reproduce this on PPC?

I used to have a Sparc computer. That would actually have been very handy
to test my theory about endianness and confirm the bug, but
unfortunately I don't have it anymore.

I am also a little surprised that nobody reported it before.

Simon Valiquette
Bug#512366: eboard hangs up and use 100%CPU on PowerPC
On Tue, Jan 20, 2009 at 02:19:59AM -0500, Simon Valiquette wrote:
> After upgrading from Etch to Lenny, eboard stopped working on PowerPC.
> Since it works fine on x86, my guess is that there is maybe an
> endianness issue somewhere in the networking code of eboard.
>
> More exactly, it starts as usual, but when connecting to a chess
> server, it hangs and consumes 100% CPU once the user has entered a
> user name.
> [...]
> If I press 'return' here, the screen draws an additional '>' on the
> next line, and the window hangs (it cannot be closed except by pressing
> CTRL-C twice in a console or by killing it).

Are you using a custom login script (~/.eboard/scripts/autofics.pl) that
does weird stuff, maybe? If yes, please try without the script.

> If you need any more information or want me to try a patch, just ask
> me.

Please try to get a backtrace with debugging symbols after the freeze
occurs and you hit CTRL-C. The wiki has a fairly comprehensive guide [1];
if you need further assistance, just contact me. Thanks in advance!

[1] http://wiki.debian.org/HowToGetABacktrace

Can anybody else reproduce this on PPC?

Kind regards,
Patrik
Bug#512366: eboard hangs up and use 100%CPU on PowerPC
Package: eboard
Version: 1.1.1-2
Severity: grave

After upgrading from Etch to Lenny, eboard stopped working on PowerPC.
Since it works fine on x86, my guess is that there is maybe an endianness
issue somewhere in the networking code of eboard.

More exactly, it starts as usual, but when connecting to a chess server,
it hangs and consumes 100% CPU once the user has entered a user name.

Here is an example of what I get:

|Server location: freechess.org Server version : 1.25.17
|
| If you are not a registered player, enter guest or a unique ID.
| (If your return key does not work, use cntrl-J)
|
|login:
|> guest
|
|Logging you in as "GuestDJMT"; you may use this name to play unrated games.
|(After logging in, do "help register" for more info on how to register.)
|
|Press return to enter the server as "GuestDJMT":

If I press 'return' here, the screen draws an additional '>' on the next
line, and the window hangs (it cannot be closed except by pressing CTRL-C
twice in a console or by killing it).

Note that using xkill will close the window, but the process will still
be consuming 100% CPU in the background.

I tried to run eboard with the options -log and -debug, but it didn't
give much more information except confirming that it hangs somewhere
between when the username is asked and when the password is asked.

If you need any more information or want me to try a patch, just ask me.

Thank you,

Simon Valiquette

-- System Information:
Debian Release: 5.0
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: powerpc (ppc)

Kernel: Linux 2.6.26-1-vserver-powerpc (SMP w/1 CPU core)
Locale: LANG=fr_CA, LC_CTYPE=fr_CA (charmap=ISO-8859-1)
Shell: /bin/sh linked to /bin/dash

Versions of packages eboard depends on:
ii  libatk1.0-0    1.22.0-1     The ATK accessibility toolkit
ii  libc6          2.7-18       GNU C Library: Shared libraries
ii  libcairo2      1.6.4-7      The Cairo 2D vector graphics libra
ii  libgcc1        1:4.3.2-1.1  GCC support library
ii  libglib2.0-0   2.16.6-1     The GLib library of C routines
ii  libgtk2.0-0    2.12.11-4    The GTK+ graphical user interface
ii  libpango1.0-0  1.20.5-3     Layout and rendering of internatio
ii  libpng12-0     1.2.27-2     PNG library - runtime
ii  libstdc++6     4.3.2-1.1    The GNU Standard C++ Library v3

Versions of packages eboard recommends:
ii  sox            14.0.1-2+b1  Swiss army knife of sound processi
ii  xfonts-75dpi   1:1.0.0-4    75 dpi fonts for X

Versions of packages eboard suggests:
ii  crafty               19.15-1   State-of-the-art chess engine, com
ii  eboard-extras-pack1  2-2       additional piece sets and sounds f
ii  gnuchess             5.07-4.1  Plays a game of chess, either agai

-- no debconf information