On 3/9/2009 at 5:10 PM, Michael McCandless wrote:
> OK this is now fixed. Thanks Steve!

You've proven wrong the assertion that getting encoding right is a thankless task :).

Steve

> Steven A Rowe wrote:
> >
> > Hi Mike,
> >
> > On 3/9/2009 at 2:34 PM, Michael McCandless wrote:
> >> See changes at http://lucene.apache.org/java/2_4_1/changes/Changes.html
> >
> > Minor nit: the encoding of Christian Kohlschütter's name in the
> > 2.4.1 section of CHANGES.txt appears to be Latin-1, but
> > changes2html.pl assumes that CHANGES.txt is encoded as UTF-8, so the
> > resulting Changes.html has an improperly encoded "ü" (lowercase "u"
> > with an umlaut):
> >
> > 14. LUCENE-1186: Add Analyzer.close() to free internal ThreadLocal
> > resources.
> > (Christian Kohlsch�tter via Mike McCandless)
> >
> > For me, both in the web browser and in the excerpt from it that I've
> > pasted above, instead of a lowercase "u" with an umlaut, I see a
> > small white question mark on a black diamond background, indicating
> > an invalid UTF-8 byte sequence: byte 0xFC, marking the beginning of
> > a multi-byte sequence, but then no trailing bytes with the high bit
> > set.
> >
> > Anyway, I think the fix is simple: edit CHANGES.txt so that
> > "Kohlschütter" is properly encoded as UTF-8, as the remainder of the
> > file is, then regenerate Changes.html.
> >
> > Steve
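
For anyone who wants to check a file for this kind of problem, here is a minimal Python sketch of the fix described above: find lines that are not valid UTF-8 and re-encode them on the assumption that they are Latin-1. It is not part of changes2html.pl or the Lucene build. The function names `find_invalid_utf8_lines` and `repair_latin1_lines` are hypothetical, and the Latin-1 assumption is only a guess that happens to fit the byte 0xFC seen here. A line that mixes valid UTF-8 multi-byte characters with stray Latin-1 bytes would be mangled by this approach, so review the reported lines before rewriting anything.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines that fail UTF-8 decoding."""
    bad = []
    for number, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((number, line))
    return bad


def repair_latin1_lines(data: bytes) -> bytes:
    """Re-encode as UTF-8 every line that is not valid UTF-8.

    Assumption: such lines are Latin-1 (ISO-8859-1). Lines that already
    decode as UTF-8 are passed through unchanged.
    """
    fixed = []
    for line in data.splitlines(keepends=True):
        try:
            line.decode("utf-8")
            fixed.append(line)
        except UnicodeDecodeError:
            fixed.append(line.decode("latin-1").encode("utf-8"))
    return b"".join(fixed)


if __name__ == "__main__":
    # Sample bytes mimicking the CHANGES.txt entry: 0xFC is Latin-1 for "ü".
    sample = (
        b"14. LUCENE-1186: Add Analyzer.close() to free internal ThreadLocal\n"
        b"resources.\n"
        b"(Christian Kohlsch\xfctter via Mike McCandless)\n"
    )
    for number, line in find_invalid_utf8_lines(sample):
        print(f"line {number} is not valid UTF-8: {line!r}")
    print(repair_latin1_lines(sample).decode("utf-8"))
```

To apply it to a real file, read the file in binary mode (`open(path, "rb")`), pass the bytes to `repair_latin1_lines`, and write the result back in binary mode, then regenerate Changes.html.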
