This week I have written about the way that Lynx displays Greek characters. These characters are meant to display individual Greek letters in mathematical and other technical documents.
Lynx, unfortunately, transliterates them into their Latin equivalents, which renders them useless. I proposed a solution, but it didn't work. The idea was simply to change the display of those characters to a textual representation of each character's name. To do this, I tried modifying the character table files.

Character mapping in Lynx works like this. It first maps each SGML entity to a Unicode character. That Unicode character is then assigned a 7-bit character string for display. As it stands, Lynx maps the Greek SGML entities to the Unicode characters for the Greek alphabet. This is reasonable for a browser that can render Greek glyphs distinctly from Latin ones, but Lynx typically can't do that.

By itself, this shouldn't have been a problem. Unicode provides a separate range (U+1D400-U+1D7FF, Mathematical Alphanumeric Symbols) that includes Greek mathematical symbols. It should have been possible to map the SGML entities to these Unicode characters, then specify 7-bit display strings for those characters.

However, those Greek mathematical characters lie beyond the 16-bit range, so they need a wider (in practice 32-bit) representation. Internally, Lynx must make some use of multi-byte UTF-8, because one sees reasonable display of pages that use 8-bit character encodings, and it takes a 16-bit Unicode character to do that. But the file 'entities.h' only permits 16-bit representations of SGML entities. Perhaps this could be fixed. But is Lynx arranged internally to handle Unicode characters beyond the 16-bit range?

I'm disappointed, and not thrilled at the prospect of a huge amount of hacking to fix this one problem. Perhaps one of you knows the code well enough to say whether the problem is tractable. Or maybe there's another solution I didn't think of. Or maybe it just doesn't matter that much.
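
For concreteness, here is a rough sketch in C of the two-stage mapping described above (entity name to Unicode character, then Unicode character to a 7-bit display string), and of why a 16-bit field gets in the way. This is not Lynx's code: the struct names, table contents, and function names are all made up for illustration, and the display strings are just the letter names I would have liked to see.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stage 1 (hypothetical): SGML entity name -> Unicode code point.
 * The code field is 32 bits wide so it can hold values above U+FFFF. */
struct entity_map {
    const char *name;
    uint32_t    code;
};

/* Stage 2 (hypothetical): Unicode code point -> 7-bit display string. */
struct display_map {
    uint32_t    code;
    const char *ascii;
};

static const struct entity_map entities[] = {
    { "alpha", 0x1D6FC },  /* MATHEMATICAL ITALIC SMALL ALPHA */
    { "beta",  0x1D6FD },  /* MATHEMATICAL ITALIC SMALL BETA  */
};

static const struct display_map displays[] = {
    { 0x1D6FC, "alpha" },
    { 0x1D6FD, "beta"  },
};

/* Look up the code point for an entity name; 0 means "unknown". */
uint32_t entity_code(const char *name)
{
    for (size_t i = 0; i < sizeof entities / sizeof entities[0]; i++)
        if (strcmp(entities[i].name, name) == 0)
            return entities[i].code;
    return 0;
}

/* Run both stages: entity name -> code point -> 7-bit string.
 * Returns NULL if either lookup fails. */
const char *display_string(const char *name)
{
    uint32_t code = entity_code(name);
    if (code == 0)
        return NULL;
    for (size_t i = 0; i < sizeof displays / sizeof displays[0]; i++)
        if (displays[i].code == code)
            return displays[i].ascii;
    return NULL;
}

/* A 16-bit field can only hold code points up to U+FFFF. */
int fits_in_16_bits(uint32_t code)
{
    return code <= 0xFFFF;
}

/* Number of bytes needed to encode a code point in UTF-8. */
int utf8_length(uint32_t code)
{
    assert(code <= 0x10FFFF);  /* upper limit of Unicode */
    if (code < 0x80)    return 1;
    if (code < 0x800)   return 2;
    if (code < 0x10000) return 3;
    return 4;
}
```

With these tables, display_string("alpha") yields the string "alpha", but only because the code field is 32 bits wide. The ordinary Greek alpha, U+03B1, fits in 16 bits and takes 2 bytes in UTF-8; U+1D6FC does not fit and takes 4 bytes. Stored in a 16-bit field it would be truncated to 0xD6FC, which is a different character altogether.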
