> > There is authoritative source for the Big5 encoding, but don't believe
> > that do help
> >
> > http://www.cns11643.gov.tw/AIDB/encodings_en.do
> >
> > Skip the historical mess already done. we should focus on reality?
> >
> > brief events according time-line,
> >
> > * BIG5 created, mostly by ETEN company, some others but not important
> > now.
> > * CNS Standard like 11643, Taiwan Government authority building in
> > mean time ...
> > * Windows 3 showup, need Chinese ... pick not CNS but BIG5 ???
> > Code Page 950 born.
> > * ETEN company add "ETen-extension 0xF9D6-0xF9FE" to work with IBM5550
> > * Since Windows ME, CP950 add above mentioned 7 char. 0xF9D6-0xF9FE
> > ONLY ???
> > * Later Hong Kong add above 7 Char. plus some more symbol in
> > HKSCS-2004, and what you found is right.
> > * WHAT A MESS !
> >
> > Focus on reality,
> > only mentioned 7 Char. I need to build into pgsql sources to compliant
> > with CP950, since few years ago.
>
> Ok, so Windows codepage 950 has those 7 characters, but not the other
> ETEN extended chars. I think that's a good enough reason to add those 7
> chars; we have 'win950' as an alias for big5 anyway.
>
> I'll go add those characters.

Be very careful not to break the standard defined by Unicode. For
example, the glyph for 0xe7a281 (the UTF-8 encoding of U+7881) is
defined on page 43 of http://unicode.org/charts/PDF/U4E00.pdf. So we
need to make sure that the particular kanji character defined at Big5
0xf9d6 has the same glyph as the one Unicode defines for U+7881. The
same applies to the rest of the proposed mapping.

> {0xf9d6, 0xe7a281},
> {0xf9d7, 0xe98ab9},
> {0xf9d8, 0xe8a38f},
> {0xf9d9, 0xe5a2bb},
> {0xf9da, 0xe68192},
> {0xf9db, 0xe7b2a7},
> {0xf9dc, 0xe5abba},

--
Tatsuo Ishii
SRA OSS, Inc. Japan

-
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
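
As a rough aid for the check described above, here is a minimal Python
sketch that turns the UTF-8 value of each proposed pair into its Unicode
code point, so each one can be looked up in the Unicode code charts. The
names PROPOSED, utf8_int_to_codepoint and describe are made up for this
illustration and are not part of the PostgreSQL sources. The script only
does the byte arithmetic; comparing the glyphs still has to be done by
eye.

```python
# Proposed Big5 -> UTF-8 pairs, copied from the mapping quoted above.
PROPOSED = [
    (0xF9D6, 0xE7A281),
    (0xF9D7, 0xE98AB9),
    (0xF9D8, 0xE8A38F),
    (0xF9D9, 0xE5A2BB),
    (0xF9DA, 0xE68192),
    (0xF9DB, 0xE7B2A7),
    (0xF9DC, 0xE5ABBA),
]


def utf8_int_to_codepoint(value: int) -> int:
    """Decode a 3-byte UTF-8 sequence packed into an integer.

    Assumes every value in the table is exactly three bytes long,
    which holds for the seven pairs above.
    """
    raw = value.to_bytes(3, "big")
    text = raw.decode("utf-8")  # raises UnicodeDecodeError if malformed
    if len(text) != 1:
        raise ValueError(f"0x{value:06x} is not a single character")
    return ord(text)


def describe(pairs):
    """Return (big5, codepoint) tuples for each (big5, utf8) pair."""
    return [(big5, utf8_int_to_codepoint(utf8)) for big5, utf8 in pairs]


for big5, codepoint in describe(PROPOSED):
    print(f"Big5 0x{big5:04X} -> U+{codepoint:04X}")
```

The first line printed is "Big5 0xF9D6 -> U+7881", which agrees with the
example given above.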