lynx-dev Display of SGML Greek characters

Steve White Sun, 11 Aug 2002 22:05:03 -0700

This week I have written about the way that Lynx displays the SGML
Greek character entities.  These entities are meant to display
individual Greek letters in mathematical and other technical
documents.


Lynx, unfortunately, transliterates them into the Latin
equivalents, thus rendering them useless.

I proposed a solution, but it didn't work.  The idea was simply to
change the display of those characters to a textual representation
of the name of the character.  To do this, I tried modifying
character table files.

Character mapping in Lynx works like this.  It first maps each
SGML entity to a Unicode character.  That Unicode character is
then assigned a 7-bit character string for display.
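
The two-stage lookup described above can be sketched roughly as
follows.  This is only an illustration in Python, not Lynx's actual
C code: the table names, the function name, and the sample entries
are all hypothetical.

```python
# Hypothetical sketch of a two-stage entity lookup; these are not
# Lynx's real tables or identifiers.

# Stage 1: SGML entity name -> Unicode code point (ordinary Greek letters).
ENTITY_TO_CODEPOINT = {
    "alpha": 0x03B1,
    "beta": 0x03B2,
    "gamma": 0x03B3,
}

# Stage 2: Unicode code point -> 7-bit display string.
# Assumed entries that mimic a transliteration to Latin letters.
CODEPOINT_TO_ASCII = {
    0x03B1: "a",
    0x03B2: "b",
    0x03B3: "g",
}


def render_entity(name):
    """Return the 7-bit display string for an SGML entity name."""
    codepoint = ENTITY_TO_CODEPOINT[name]
    return CODEPOINT_TO_ASCII[codepoint]
```

With these assumed tables, render_entity("alpha") returns "a", which
a reader cannot tell apart from an ordinary Latin letter.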

As it is, Lynx assigns the Greek SGML characters to the Unicode
characters for the Greek alphabet.  This is reasonable for a
browser that is capable of rendering the glyphs for Greek
characters distinctly from Latin ones.  But Lynx can't typically
do that.

By itself, this shouldn't have been a problem.  Unicode provides a
separate block, Mathematical Alphanumeric Symbols (U+1D400-U+1D7FF),
that includes Greek Mathematical symbols.  It should have been
possible to map the SGML entities to these Unicode characters, then
specify 7-bit display strings for those characters.

But those Greek Mathematical Unicode characters lie above U+FFFF,
so they do not fit in 16 bits and require 32-bit representations.
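
To make the size difference concrete, here is a small Python sketch
that compares an ordinary Greek letter with its mathematical
counterpart.  The function name is made up for illustration; the
two code points are U+03B1 (GREEK SMALL LETTER ALPHA) and U+1D6FC
(MATHEMATICAL ITALIC SMALL ALPHA).

```python
GREEK_SMALL_ALPHA = 0x03B1         # ordinary Greek letter
MATH_ITALIC_SMALL_ALPHA = 0x1D6FC  # mathematical variant, above U+FFFF


def encoded_sizes(codepoint):
    """Return (fits_in_16_bits, utf8_bytes, utf16_code_units)."""
    ch = chr(codepoint)
    fits_in_16_bits = codepoint <= 0xFFFF
    utf8_bytes = len(ch.encode("utf-8"))
    # Each UTF-16 code unit is two bytes; "utf-16-be" adds no BOM.
    utf16_code_units = len(ch.encode("utf-16-be")) // 2
    return fits_in_16_bits, utf8_bytes, utf16_code_units
```

For U+03B1 this gives (True, 2, 1).  For U+1D6FC it gives
(False, 4, 2): four bytes in UTF-8 and a surrogate pair in UTF-16.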

Internally, Lynx must make some use of multi-byte UTF-8, because
one sees reasonable display of pages that use 8-bit character
encodings.  Doing that takes 16-bit Unicode characters.

However, the file 'entities.h' only permits 16-bit representations
of SGML entities.  Perhaps this could be fixed.  But is Lynx
arranged internally to handle Unicode characters beyond the 16-bit
range?
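
As a rough illustration of why a 16-bit field is the sticking
point, the Python sketch below packs a code point into an unsigned
16-bit slot.  It does not reflect how 'entities.h' actually stores
its values; the function names are hypothetical.

```python
import struct


def pack_entity_value(codepoint):
    """Pack a code point into an unsigned 16-bit field.

    Raises ValueError if the value does not fit.
    """
    try:
        return struct.pack(">H", codepoint)
    except struct.error as exc:
        raise ValueError(
            f"U+{codepoint:04X} does not fit in 16 bits"
        ) from exc


def truncate_to_16_bits(codepoint):
    """Show what silently dropping the high bits would produce."""
    return codepoint & 0xFFFF
```

Packing U+03B1 succeeds, while U+1D6FC raises ValueError.  Silently
truncating U+1D6FC instead yields 0xD6FC, which falls in the Hangul
Syllables block, so the entity would map to an unrelated character.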

I'm disappointed, and not thrilled at the prospect of a huge
amount of hacking to fix this one problem.

Perhaps one of you knows the code well enough to say whether the
problem is tractable.  Or maybe there's another solution I didn't
think of.  Or maybe it just doesn't matter that much.


