[GENERAL] UTF-8, upper() and Chinese characters yielding blank result

Scott Eade Thu, 27 Jul 2006 08:04:50 -0700

While I could see various multibyte issues in the archives and in the TODO list, I couldn't spot this exact issue:


I am working with a database that uses UNICODE encoding.

I have a varchar column (col_x) that includes a mix of Chinese and regular ASCII characters.

On PostgreSQL 7.4.13 (on RHEL4) "select col_x, upper(col_x) from my_table" performs the desired upper() conversion - i.e. the ASCII characters are converted to upper case and the Chinese characters are left as is.
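
To make the desired result concrete, here is a minimal sketch in Python (not PostgreSQL itself) of the conversion described above. The sample string and the helper name expected_upper are hypothetical and used only for illustration; they are not taken from the actual table.

```python
# Hypothetical sample value: ASCII letters mixed with Chinese characters.
sample = "abc 中文 xyz"


def expected_upper(value: str) -> str:
    """Upper-case the cased letters; characters that have no upper-case
    form (such as Chinese characters) pass through unchanged."""
    return value.upper()


if __name__ == "__main__":
    # Show the original value next to its upper-cased form,
    # mirroring "select col_x, upper(col_x) from my_table".
    print(sample, "|", expected_upper(sample))
```

This only shows the outcome seen on 7.4.13; it says nothing about why 8.0.7 behaves differently.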

The problem appears on PostgreSQL 8.0.7 (on WinXP), where the upper() result is apparently blank (this is via pgAdmin III). Worse still, via JDBC I am getting:

java.sql.SQLException: Invalid character data was found. This is most likely caused by stored data containing characters that are invalid for the character set the database was created in. The most common example of this is storing 8bit data in a SQL_ASCII database.


Is this a bug or a change of behaviour between versions?

Is there some way I can get the 7.4.13 behaviour in 8.0.7?

TIA,

Scott

