Tatsuo Ishii wrote:
> Following item in HISTORY:
>
>      * Add support for 3 and 4-byte UTF8 characters (John Hansen)
>        Previously only one and two-byte UTF8 characters were supported.
>        This is particularly important for support for some Chinese
>        characters.
>
> is wrong since 3-byte UTF-8 characters are supported since UTF-8
> support has been added to PostgreSQL. Correct description would be:
>
>      * Add support for 4-byte UTF8 characters (John Hansen)
>        Previously only up to three-byte UTF8 characters were supported.
>        This is particularly important for support for some Chinese
>        characters.
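
To make the 3-byte versus 4-byte distinction concrete, here is a minimal Python sketch, not taken from the patch. It assumes only the length rules of standard UTF-8; the helper name utf8_len is hypothetical. Characters in the Basic Multilingual Plane (for example U+4E2D) need at most three bytes, while characters in the supplementary planes (for example U+20000 in CJK Extension B) need four.

```python
def utf8_len(cp: int) -> int:
    """Bytes needed to encode code point cp in UTF-8.

    Hypothetical helper for illustration only; not PostgreSQL code.
    """
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp < 0x80:
        return 1  # ASCII
    if cp < 0x800:
        return 2  # e.g. Latin letters with diacritics
    if cp < 0x10000:
        return 3  # rest of the BMP, including the most common CJK ideographs
    return 4      # supplementary planes, e.g. CJK Extension B


# Cross-check the helper against Python's own UTF-8 encoder.
for cp in (0x41, 0xE9, 0x4E2D, 0x20000):
    encoded = chr(cp).encode("utf-8")
    assert len(encoded) == utf8_len(cp)
    print(f"U+{cp:04X}: {len(encoded)} byte(s) {encoded.hex()}")
```

Under this reading, a server that accepts only up to three-byte sequences would reject every character outside the BMP, even though those sequences are valid UTF-8.
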
Release notes updated.

> In the meantime I wonder if we need to update UTF-8 <--> locale
> encoding maps. The author of the patches stated that "This is
> particularly important for support for some Chinese characters". I
> have no idea what encoding he is referring to, but I wonder if the
> latest Chinese encoding standard GB18030 needs 4-byte UTF-8 mappings.
> If yes, we surely need to update utf8_to_gb18030.map.
>
> Anybody familiar with GB18030/UTF-8?

Good question. The report we got in the past was that some UTF-8
characters, mostly Chinese, were being rejected even though they were
valid. I have no idea how they map to the GB* character sets.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073