Hi Mike,
On 3/9/2009 at 2:34 PM, Michael McCandless wrote:
> See changes at http://lucene.apache.org/java/2_4_1/changes/Changes.html
Minor nit: the encoding of Christian Kohlschütter's name in the 2.4.1 section
of CHANGES.txt appears to be Latin-1, but changes2html.pl assumes that
CHANGES.txt is encoded as UTF-8, so the resulting Changes.html has an
improperly encoded "ü" (lowercase "u" with an umlaut):
    14. LUCENE-1186: Add Analyzer.close() to free internal ThreadLocal
        resources.
        (Christian Kohlsch�tter via Mike McCandless)
For me, both in the web browser and in the excerpt from it that I've pasted
above, instead of a lowercase "u" with an umlaut, I see a small white question
mark on a black diamond background. That is the replacement character (U+FFFD),
which indicates an invalid UTF-8 byte sequence: byte 0xFC (Latin-1 "ü") looks
like the lead byte of a multi-byte sequence, but it is followed by a plain
ASCII "t" rather than by continuation bytes of the form 10xxxxxx.
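To make the failure concrete, here is a small Python sketch. It is my own
illustration, not anything from changes2html.pl or the Lucene build, and the
variable names are made up. It encodes the name as Latin-1, shows that a
strict UTF-8 decode rejects byte 0xFC, and shows that a lenient decode
substitutes U+FFFD, which is what the browser is rendering:

```python
# "Kohlschütter" as Latin-1: the "ü" is the single byte 0xFC at offset 7.
latin1_bytes = "Kohlschütter".encode("latin-1")

# A strict UTF-8 decode rejects 0xFC, since it is not a valid lead byte
# and the next byte ("t", 0x74) is not a continuation byte anyway.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
    bad_offset = None
except UnicodeDecodeError as err:
    strict_ok = False
    bad_offset = err.start  # offset of the offending byte

# A lenient decode substitutes U+FFFD for the bad byte.
lenient = latin1_bytes.decode("utf-8", errors="replace")

# The same name correctly encoded as UTF-8 uses two bytes for "ü".
utf8_bytes = "Kohlschütter".encode("utf-8")

print(strict_ok, bad_offset, lenient, utf8_bytes)
```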
Anyway, I think the fix is simple: edit CHANGES.txt so that "Kohlschütter" is
properly encoded as UTF-8, as the remainder of the file is, then regenerate
Changes.html.
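If hand-editing is inconvenient, something along these lines might do the
re-encoding. This is only a sketch under assumptions: the function name
to_utf8 is hypothetical, and it assumes that any line that fails to decode as
UTF-8 is Latin-1, which seems to be the case here but is worth checking by eye
before committing:

```python
def to_utf8(raw: bytes) -> bytes:
    """Re-encode mixed UTF-8/Latin-1 bytes as UTF-8, line by line.

    Assumption: a line that is not valid UTF-8 is Latin-1.
    """
    fixed = []
    for line in raw.splitlines(keepends=True):
        try:
            text = line.decode("utf-8")    # already fine, keep as-is
        except UnicodeDecodeError:
            text = line.decode("latin-1")  # assumed fallback encoding
        fixed.append(text.encode("utf-8"))
    return b"".join(fixed)


# One Latin-1 line (0xFC for "ü") followed by one line that is valid UTF-8.
sample = (b"(Christian Kohlsch\xfctter via Mike McCandless)\n"
          + "já UTF-8 é\n".encode("utf-8"))
repaired = to_utf8(sample)
print(repaired.decode("utf-8"))
```

Working per line rather than on the whole file means the lines that are
already valid UTF-8 pass through untouched, and only the offending line is
reinterpreted.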
Steve