From: Mark Matthews <[EMAIL PROTECTED]>
To: James Huang <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Unicode characters become question marks
Date: Wed, 02 Jun 2004 13:04:38 -0500
James Huang wrote:
> Victor,
>
> I'm positive the database is storing ?'s. You may test with these steps:
>
> 1) insert "\u7247\u4EEE\u540D" into a UTF8 table;
James,
Have you set your JDBC driver's character set to be UTF-8 using the
characterEncoding property?
> 2) Query and get it back into string s;
> 3) for each char c in s: System.out.println((int)c);
Here's what I get (converting the chars to ints to avoid any display
problems). At least on my end, with Connector/J 3.0.14 and MySQL 4.1.x,
what I put in is what I get back out, so my guess is that something
between the database and your display is munging the characters. Is
whatever you're using for output set to the correct encoding? My output:
As Java Unicode (int)chars:
7247
4eee
540d
Retrieved from database as (int)chars:
7247
4eee
540d
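To illustrate how the munging can happen on the output side, here is a
minimal sketch (not taken from your setup; the class and method names
EncodingDemo and hexBytes are made up for illustration). Encoding the
same string with a charset that can't represent these characters turns
each one into a '?' (0x3f), while UTF-8 keeps them intact:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: hex-dump the bytes a charset produces for s
    public static String hexBytes(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u7247\u4EEE\u540D";
        // ISO-8859-1 cannot map these chars, so each becomes '?'
        System.out.println(hexBytes(s, StandardCharsets.ISO_8859_1)); // 3f3f3f
        // UTF-8 encodes each char as three bytes
        System.out.println(hexBytes(s, StandardCharsets.UTF_8)); // e78987e4bbaee5908d
    }
}
```

If the string comes back from the database with the right code points
but still shows up as question marks, a conversion like the first one is
probably happening somewhere in the output path.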
(full disclosure, here's my testcase):
public void testFoo() throws Exception {
    // Ask Connector/J to use UTF-8 on this connection.
    // getConnectionWithProps() is a helper defined elsewhere in the test suite.
    Properties props = new Properties();
    props.setProperty("characterEncoding", "utf-8");
    Connection utf8Conn = getConnectionWithProps(props);
    Statement utf8Stmt = utf8Conn.createStatement();

    // Create a UTF-8 table and insert the three test characters
    utf8Stmt.executeUpdate("DROP TABLE IF EXISTS testFoo");
    utf8Stmt.executeUpdate("CREATE TABLE testFoo (field1 VARCHAR(32) "
        + "CHARACTER SET UTF8) CHARACTER SET UTF8");
    utf8Stmt.executeUpdate("INSERT INTO testFoo VALUES "
        + "('\u7247\u4EEE\u540D')");

    // Print the code points of the original Java string
    System.out.println("As Java Unicode (int)chars: ");
    String asUnicode = "\u7247\u4EEE\u540D";
    for (int i = 0; i < asUnicode.length(); i++) {
        System.out.println(Integer.toHexString((int) asUnicode.charAt(i)));
    }
    System.out.println();

    // Read the row back and print the code points of what was stored
    ResultSet rs = utf8Stmt.executeQuery("SELECT * FROM testFoo");
    rs.next();
    String utf8String = rs.getString(1);
    System.out.println("Retrieved from database as (int)chars: ");
    for (int i = 0; i < utf8String.length(); i++) {
        System.out.println(Integer.toHexString((int) utf8String.charAt(i)));
    }
}
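If you aren't passing a Properties object, the same setting can go in
the JDBC URL instead. A minimal sketch, assuming a hypothetical host,
database, and class name (none of these come from the test case above);
it only builds the URL and doesn't open a connection. As far as I know,
older Connector/J versions also want useUnicode=true alongside
characterEncoding, so the sketch sets both:

```java
public class Utf8Url {
    // Hypothetical helper: append Connector/J's Unicode settings to a base URL
    public static String withUtf8(String baseUrl) {
        String sep = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + sep + "useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // "localhost" and "test" are placeholders, not values from this thread
        String url = withUtf8("jdbc:mysql://localhost/test");
        System.out.println(url);
        // The resulting URL would then be passed to DriverManager.getConnection(...)
    }
}
```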
--
Mr. Mark Matthews
MySQL AB, Software Development Manager, J2EE and Windows Platforms
Office: +1 708 332 0507
www.mysql.com
MySQL Guide to Lower TCO
http://www.mysql.com/it-resources/white-papers/tco.php