Elliotte Rusty Harold <elharo at metalab dot unc dot edu> wrote:
>> You can get Unicode-format mapping tables for TIS 620 and many other
>> encodings at http://crl.nmsu.edu/~mleisher/csets.html
>
> Thanks. Looking at that, it appears the mapping is imperfect. There
> are about 10 characters in TIS-620 that are mapped to the Unicode
> replacement character. This is from 1998 though. Has Unicode's Thai
> support improved any in later versions?

These 9 code positions (0xA0, 0xDB..0xDE, 0xFC..0xFF) appear to be
undefined in TIS 620.2533. Reference [3] below does show a "word
separator character" at 0xDC, which I interpret as U+200B ZERO WIDTH
SPACE, but the other positions are still undefined. So this may not be
a case of Unicode having to play catch-up with the Thai standard.

-Doug Ewell
 Fullerton, California

[1] http://www.langbox.com/codeset/tis620.html
[2] http://www.nectec.or.th/it-standards/std620/std620.htm
[3] http://www.tru64unix.compaq.com/docs/base_doc/DOCUMENTATION/V40E_HTML/SUPPDOCS/THAIDOC/THAICH2.HTM
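
For illustration only, here is a minimal Python sketch of the nine positions listed above and of why a mapping table would send them to U+FFFD. The name tis620_to_unicode is hypothetical. The sketch assumes the usual fixed-offset layout in which the defined bytes 0xA1..0xDA and 0xDF..0xFB correspond to U+0E01..U+0E3A and U+0E3F..U+0E5B. It treats 0x80..0x9F as unassigned, which is an assumption of the sketch and not something the post states, and it leaves 0xDC undefined, ignoring the vendor-specific word separator shown in reference [3].

```python
# Positions in 0xA0..0xFF that the post lists as undefined in TIS-620.
UNDEFINED = {0xA0, *range(0xDB, 0xDF), *range(0xFC, 0x100)}

REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def tis620_to_unicode(byte: int) -> str:
    """Map one TIS-620 byte to a Unicode character (hypothetical helper)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value")
    if byte < 0x80:
        # The lower half is plain ASCII.
        return chr(byte)
    if byte < 0xA0 or byte in UNDEFINED:
        # Assumption: 0x80..0x9F are treated as unassigned in this sketch.
        return REPLACEMENT
    # Defined Thai characters sit at a fixed offset from U+0E00.
    return chr(0x0E00 + (byte - 0xA0))


# List the undefined positions and count them.
print([f"0x{b:02X}" for b in sorted(UNDEFINED)])
print(len(UNDEFINED))
```

A table built this way sends exactly those nine upper-half positions to the replacement character, which is consistent with the "about 10" unmapped characters mentioned in the quoted question.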