Apologies if this is a repeat of a (much) earlier inquiry.

The mapping tables that are available as part of the Unicode Standard (http://www.unicode.org/Public/MAPPINGS/) are generally provided in a text format called "Format A." Each line in the file defines a mapping between a character in a legacy encoding and the Unicode equivalent, with fields separated by tabs or sequences of spaces, like this:

0xA0    0x00A0  #NO-BREAK SPACE
0xA1    0x00A1  #INVERTED EXCLAMATION MARK
0xA2    0x00A2  #CENT SIGN

The format supports DBCS as well:

0x8140  0x4E02  #CJK UNIFIED IDEOGRAPH
0x8141  0x4E04  #CJK UNIFIED IDEOGRAPH
0x8142  0x4E05  #CJK UNIFIED IDEOGRAPH

My questions are:

1. Is there a specification for this format anywhere, and if so, where?

2. Is there a "Format B" or similar? (I don't mean UCM, CharMapML, RFC 1345 format, etc., but something truly similar to and/or derivative of Format A.)

Please reply on-list only if you think the list at large would benefit from your reply. I'm hoping some of the Unicode elders might have some insight here.

--
Doug Ewell | Thornton, CO, US | ewellic.org
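
For readers unfamiliar with the layout, the line structure described in the post can be illustrated with a small parser. This is only a minimal sketch in Python, not a specification of Format A (whether one exists is the open question). It handles just the simple two-column form shown in the examples. The function name `parse_format_a` is hypothetical, and the treatment of `#` as a comment marker and the skipping of lines with fewer than two fields are assumptions.

```python
def parse_format_a(text):
    """Parse simple two-column Format A style lines.

    Returns a dict mapping legacy code (int) to Unicode code point (int).
    Assumptions (not from any spec): '#' starts a trailing comment, and
    lines with fewer than two fields are skipped.
    """
    mapping = {}
    for raw in text.splitlines():
        # Drop everything from '#' onward, then trim whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank or comment-only line
        # str.split() with no argument splits on tabs or runs of spaces.
        fields = line.split()
        if len(fields) < 2:
            continue  # assumption: no Unicode column, so no mapping
        legacy, unicode_value = fields[0], fields[1]
        # int(..., 16) accepts the leading "0x" prefix.
        mapping[int(legacy, 16)] = int(unicode_value, 16)
    return mapping


# Sample lines taken from the post, one tab-separated and one
# space-separated, covering a single-byte and a DBCS entry.
sample = (
    "0xA0\t0x00A0\t#NO-BREAK SPACE\n"
    "0x8140  0x4E02  #CJK UNIFIED IDEOGRAPH\n"
)
table = parse_format_a(sample)
```

Here `table` maps 0xA0 to U+00A0 and 0x8140 to U+4E02. A real mapping file may contain entries this sketch does not handle, so it should be treated as a reading aid only.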