Patches item #3420484, was opened at 2011-10-07 23:10
Message generated for change (Comment added) made by eighthave
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=478072&aid=3420484&group_id=55736
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: puredata
Group: bugfix
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Marvin Humphrey (creamygoodness)
>Assigned to: Miller Puckette (millerpuckette)
Summary: Invalid memory access in s_utf8.c

Initial Comment:
After hearing on IRC that segfaults could be produced while editing text in an Object box under gdb, I tried running the Vanilla main git branch under Valgrind:

    valgrind ./src/pd

Valgrind reported many "invalid read" memory errors, which on inspection all arose from the same root cause: these strings are not NUL-terminated, but the UTF-8 handling code in s_utf8.c assumes NUL-termination and malfunctions in its absence.

The attached patch suffices to eliminate the read errors. However, it does not address all the problems in s_utf8.c; other functions will require similar treatment.

I have attempted to make the new code as compatible with the old code as possible, for instance preserving its dubious algorithm for handling missing or excess continuation bytes. Hopefully I got everything right, though it is hard to be certain with bit-twiddling code like this in the absence of thorough unit tests.

----------------------------------------------------------------------

>Comment By: Hans-Christoph Steiner (eighthave)
Date: 2011-10-09 21:39

Message:
I accepted this into Pd-extended.

----------------------------------------------------------------------

Comment By: Marvin Humphrey (creamygoodness)
Date: 2011-10-09 16:04

Message:
I have uploaded a new version of the patch for s_utf8.c, and also a new version of the benchmarking app. Highlights:

1. The benchmarking app now handles four different functions.
2. All changed functions have been sped up since the last patch, especially u8_charnum().
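[To make the failure mode concrete, here is a minimal sketch contrasting a scan that depends on NUL termination with one bounded by an explicit byte length. This is illustrative code, not the actual s_utf8.c or the attached patch; the function names u8_offset_nul() and u8_offset_bounded() are hypothetical.]

    /* A continuation byte in UTF-8 has the bit pattern 10xxxxxx; any other
       byte starts a new character (the same idea as isutf() in s_utf8.c). */
    static int u8_is_lead(unsigned char c)
    {
        return (c & 0xC0) != 0x80;
    }

    /* Character index -> byte offset, relying on a terminating NUL.
       On a buffer that is NOT NUL-terminated, str[offs] reads past the end,
       which is exactly the kind of invalid read Valgrind reports. */
    int u8_offset_nul(const char *str, int charnum)
    {
        int offs = 0;
        while (charnum > 0 && str[offs]) {
            offs++;
            while (str[offs] && !u8_is_lead((unsigned char)str[offs]))
                offs++;
            charnum--;
        }
        return offs;
    }

    /* Same conversion, but bounded by an explicit byte length, so it can
       never read beyond the caller's buffer. */
    int u8_offset_bounded(const char *str, int charnum, int len)
    {
        int offs = 0;
        while (charnum > 0 && offs < len) {
            offs++;
            while (offs < len && !u8_is_lead((unsigned char)str[offs]))
                offs++;
            charnum--;
        }
        return offs;
    }

[For example, in "h\xC3\xA9llo" ('e-acute' occupies two bytes), character 3 begins at byte offset 4; the two functions agree when the buffer happens to be terminated, but only the bounded version is safe when it is not.]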
    u8_offset:  16% faster
    u8_charnum: 22% faster
    u8_inc:     10% faster

----------------------------------------------------------------------

Comment By: Hans-Christoph Steiner (eighthave)
Date: 2011-10-09 12:54

Message:
This sounds great; I'll include it in Pd-extended and we'll see how it does there. The only problem is that there are two patches included in this tracker and they seem to have the same name. Could you delete the older one, so there is only one patch?

----------------------------------------------------------------------

Comment By: Marvin Humphrey (creamygoodness)
Date: 2011-10-08 20:08

Message:
In response to speed concerns raised in an IRC conversation ("well, its a realtime engine that communicates thru the utf8 stuff, so its good to have it efficient"), I have prepared a benchmark program to test the performance of u8_offset(). It reads a file into RAM, then traverses it with u8_offset() for a user-specified number of iterations. Here are some results using the French Wikipedia page on Pure Data as source material.

Without patch:

    marvin@smokey:~/projects/pure-data $ cc -Wall -Wextra -std=gnu99 -O3 bench.c src/s_utf8.c -o bench
    marvin@smokey:~/projects/pure-data $ time ./bench pd.html 10000
    pd.html (44683 bytes) 10000 iterations

    real    0m0.732s
    user    0m0.716s
    sys     0m0.003s

Running the test 10 times produced a range of 0.724s to 0.733s, so the benchmark was pretty stable even on a Mac. :)

With patch:

    marvin@smokey:~/projects/pure-data $ cc -Wall -Wextra -std=gnu99 -O3 bench.c src/s_utf8.c -o bench
    marvin@smokey:~/projects/pure-data $ time ./bench pd.html 10000
    pd.html (44683 bytes) 10000 iterations

    real    0m0.618s
    user    0m0.612s
    sys     0m0.004s

So, in addition to eliminating the Valgrind warnings, the patched version is actually faster. I speculate that this is because the current version of u8_offset() needlessly traverses at least one extra byte when the header byte of the sequence is plain ASCII.
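[The actual bench.c was attached to the tracker; as a rough sketch of the approach described above (load the file into RAM, call the function under test in a tight loop, and time the run), a harness might look like the following. The stand-in u8_offset_standin() and the helper names here are mine, not the attachment's.]

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Stand-in for s_utf8.c's u8_offset(): character index -> byte offset. */
    static int u8_offset_standin(const char *str, int charnum)
    {
        int offs = 0;
        while (charnum > 0 && str[offs]) {
            offs++;
            /* skip continuation bytes (10xxxxxx) */
            while (((unsigned char)str[offs] & 0xC0) == 0x80)
                offs++;
            charnum--;
        }
        return offs;
    }

    /* Read an entire file into a NUL-terminated heap buffer; NULL on failure.
       The harness terminates the buffer deliberately, since the code under
       test expects it. */
    static char *slurp(const char *path, long *len_out)
    {
        FILE *f = fopen(path, "rb");
        if (!f) return NULL;
        fseek(f, 0, SEEK_END);
        long len = ftell(f);
        rewind(f);
        char *buf = malloc((size_t)len + 1);
        if (buf && fread(buf, 1, (size_t)len, f) != (size_t)len) {
            free(buf);
            buf = NULL;
        }
        if (buf) buf[len] = '\0';
        fclose(f);
        *len_out = len;
        return buf;
    }

    /* Traverse the buffer `iters` times; returns elapsed CPU seconds.
       The volatile accumulator keeps the optimizer from deleting the loop. */
    static double bench_u8_offset(const char *buf, int charnum, int iters)
    {
        volatile long total = 0;
        clock_t t0 = clock();
        for (int i = 0; i < iters; i++)
            total += u8_offset_standin(buf, charnum);
        (void)total;
        return (double)(clock() - t0) / CLOCKS_PER_SEC;
    }

[A main() would then just parse argv, call slurp(), print the byte and iteration counts, and report the timing, much as the transcripts above show.]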
For what it's worth, if we remove the NUL-termination check from the loop condition...

    - while (charnum > 0 && str[offs]) {
    + while (charnum > 0) {

...we get even better results:

    marvin@smokey:~/projects/pure-data $ time ./bench pd.html 10000
    pd.html (44683 bytes) 10000 iterations

    real    0m0.514s
    user    0m0.507s
    sys     0m0.004s

However, that could yield bogus results in the event that strlen(string) differs from the buffer length held in a separate integer. I would go that route in code which had been engineered from the start not to need NUL-termination, but I wouldn't advocate for it here.

As a further test, I tried an implementation based on u8_seqlen(), which has the advantage of being easier to understand and is the algorithm currently used by the Perl core, among others:

    while (charnum-- > 0 && str[offs]) {
        offs += u8_seqlen(str + offs);
    }

It performed substantially worse:

    marvin@smokey:~/projects/pure-data $ time ./bench pd.html 10000
    pd.html (44683 bytes) 10000 iterations

    real    0m1.906s
    user    0m1.891s
    sys     0m0.003s

In this case, I speculate that the lookup into the trailingBytesForUTF8 array is costly, even though it doubtless gets into the hottest CPU cache and stays there.

The benchmarks were run on a 2007 MacBook Pro with a 2.4 GHz Core Duo chipset running OS X 10.6 (Snow Leopard).

----------------------------------------------------------------------

_______________________________________________
Pd-dev mailing list
Pd-dev@iem.at
http://lists.puredata.info/listinfo/pd-dev