@Andrew: there is a question for you below.

As part of the comment style change I have been using "less" to look at a lot of our source code, and that revealed some corrupt characters (i.e., characters with the parity bit, meaning the high-order bit 0x80, set) in two of our files.
To fix those issues I have changed (as of revision 11274) all instances of 0xa0 (space with sign-bit set) to 0x20 (space) in drivers/tkwin.c, and I have completely removed all instances of 0x85 (ctrl-E with sign bit set) in drivers/wingcc.c. In all cases, the corrupted characters were in commentary or (in one case) in a menu string. These files were developed on Windows quite a few years ago, and I ascribe these corrupted characters to problematic editors or bad cvs commits back then.

I was obviously concerned about the possibility these corrupted characters might represent a general problem in our source code. Therefore, I implemented (revision 11280) the utils/parity_bit_check application in the build tree (source code in utils/parity_bit_check.c) which is built by "make parity_bit_check" in the build tree. This application finds the first stdin character with parity bit set and returns that character as a return code (or returns 0 if there are no characters with parity bit set).

That application is run by scripts/parity_bit_check.sh to check for such issues for all files in our source code except for those listed in scripts/parity_bit_check.exclude. Currently that file has the following patterns that are excluded from the check which I annotate here:

# Exclude these because they would not be part of fresh checkout
\.svn
~$

# exclude various image formats
\.jpg$
\.gif$
\.cgm$
\.map$
\.fnt$
lena

# Exclude UTF-8 files.  The latter part of this subset (from NEWS on) has
# recently been converted from latin1 to UTF-8 using iconv so that
# developer's names that occur in those files will be rendered
# correctly in the UTF-8 locale that is the norm these days.
x2[46]
xw2[46]
xthick2[46]
README.release
README.deltaT.dat
test_hebrew_diacritic.py
test_plplot_encodings.py
NEWS
ChangeLog
api2man.pl.in
api2text.pl
docbook/AUTHORS

# latin1 encoding.
(Andrew, is there any reason to keep this octave source file any more? The idea behind it is to approximately render latin1 characters from octave, but latin1 is extremely outdated now, and octave users would be much better off to use the default PLplot utf8 encoding for all user strings.)
__pl_pltext.m

# These files generated by some proprietary MS app which scattered
# some "smart" MS characters throughout.  I haven't bothered to
# demoronize these files since they will likely be replaced in the
# future in any case.
qsastime.html
qsastime.txt
qsastime.xml

# This php file was copied from some website by Werner and is used to give
# us a good-looking newsfeed on our website.  Some of the characters in
# this file have their parity bit set (again probably thanks to our MS
# friends).  It works in its present form on our website so I didn't
# bother to try and fix it.
simplepie.inc

There appear to be good reasons to exclude all the above files from the parity bit check.

Running the script then showed two README* documentation files that needed to be fixed up (by replacing MS quotes by ordinary quotes). After that fixup, running the script reveals no remaining corrupted characters in our files, and in particular, there are no corrupted characters in our language source files. That is extremely good news, considering the large scope this problem _could_ have had.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); PLplot scientific plotting software package (plplot.org); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________
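
For readers who want a concrete picture of the check described in the message above (find the first input character with the high bit set, or report 0 if there is none), here is a minimal sketch in C. It is not the actual utils/parity_bit_check.c source, which is not reproduced in this message; the function names first_high_bit_in_buffer and first_high_bit_in_stream are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: return the first byte in buf[0..len-1] whose
 * high bit (0x80) is set, or 0 if no such byte exists. */
int first_high_bit_in_buffer(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] & 0x80)
            return buf[i];
    }
    return 0;
}

/* Hypothetical helper: apply the same scan to a stream, reading it in
 * fixed-size chunks until end of file or the first offending byte. */
int first_high_bit_in_stream(FILE *in)
{
    unsigned char chunk[4096];
    size_t n;

    assert(in != NULL);
    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        int hit = first_high_bit_in_buffer(chunk, n);
        if (hit != 0)
            return hit;
    }
    return 0;
}
```

A main() that simply returns first_high_bit_in_stream(stdin) would give the exit-status behaviour described above. A byte with the high bit set is always in the range 0x80 to 0xff, so it fits in the 8-bit exit status available on POSIX systems, and a zero status then means the input was clean.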