Thanks, Mark. This instills great confidence in me.
I used this URL: "jdbc:mysql://localhost/mydb?useUnicode=true&characterEncoding=utf8"
(should I use "utf-8" perhaps?) Would this work, too?
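One way to check the spelling question on the Java side is sketched below. It assumes the driver treats the characterEncoding value as a Java encoding name, which this thread does not confirm. The class and method names (EncodingNameCheck, canonicalName) are made up for illustration. If both spellings resolve to the same Java charset, either form should name the same encoding:

```java
import java.nio.charset.Charset;

class EncodingNameCheck {
    // Returns the canonical Java charset name for an encoding label.
    // Charset.forName accepts aliases and ignores case.
    static String canonicalName(String label) {
        return Charset.forName(label).name();
    }

    public static void main(String[] args) {
        // Both spellings from this thread are looked up here.
        System.out.println("utf8  -> " + canonicalName("utf8"));
        System.out.println("utf-8 -> " + canonicalName("utf-8"));
    }
}
```

On a standard JDK both lookups report UTF-8. This says nothing about how a particular driver version parses the URL parameter.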
What is
Connection utf8Conn = getConnectionWithProps(props);
in your test code? That doesn't look like a standard JDBC method.
-James
From: Mark Matthews <[EMAIL PROTECTED]>
To: James Huang <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Unicode characters become question marks
Date: Wed, 02 Jun 2004 13:04:38 -0500
James Huang wrote:
> Victor,
>
> I'm positive the database is storing ?'s. You may test with these steps:
>
> 1) insert "\u7247\u4EEE\u540D" into a UTF8 table;
James,
Have you set your JDBC driver's character set to be UTF-8 using the characterEncoding property?
> 2) Query and get it back into string s;
> 3) for each char c in s: System.out.println((int)c);
Here's what I get (converting the chars to int to avoid any display problems). At least on my end, with Connector/J 3.0.14 and MySQL 4.1.x, what I put in is what I get back out, so my guess is that something between the database and your display is munging the characters. Is whatever you're using for output set to the correct encoding?
As Java Unicode (int)chars: 7247 4eee 540d
Retrieved from database as (int)chars: 7247 4eee 540d
(full disclosure, here's my testcase):
public void testFoo() throws Exception {
    Properties props = new Properties();
    props.setProperty("characterEncoding", "utf-8");
    Connection utf8Conn = getConnectionWithProps(props);
    Statement utf8Stmt = utf8Conn.createStatement();

    utf8Stmt.executeUpdate("DROP TABLE IF EXISTS testFoo");
    utf8Stmt.executeUpdate("CREATE TABLE testFoo (field1 VARCHAR(32) CHARACTER SET UTF8) CHARACTER SET UTF8");
    utf8Stmt.executeUpdate("INSERT INTO testFoo VALUES ('\u7247\u4EEE\u540D')");

    System.out.println("As Java Unicode (int)chars: ");
    String asUnicode = "\u7247\u4EEE\u540D";

    for (int i = 0; i < asUnicode.length(); i++) {
        System.out.println(Integer.toHexString((int)asUnicode.charAt(i)));
    }

    System.out.println();

    ResultSet rs = utf8Stmt.executeQuery("SELECT * FROM testFoo");
    rs.next();
    String utf8String = rs.getString(1);

    System.out.println("Retrieved from database as (int)chars: ");

    for (int i = 0; i < utf8String.length(); i++) {
        System.out.println(Integer.toHexString((int)utf8String.charAt(i)));
    }
}
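For comparison, the question-mark symptom itself can be reproduced without a database. The sketch below is not part of the test case above, and its names (QuestionMarkDemo, roundTrip, hex) are invented for illustration. It pushes the same three characters through an encoding that cannot represent them, which is one plausible way a layer between the application and the display could turn them into ?'s:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {
    // Encodes s to bytes in the given charset, then decodes it back.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // Formats each char as hex, separated by spaces.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u7247\u4EEE\u540D";
        System.out.println("Original:          " + hex(original));
        System.out.println("Through UTF-8:     " + hex(roundTrip(original, StandardCharsets.UTF_8)));
        System.out.println("Through ISO-8859-1: " + hex(roundTrip(original, StandardCharsets.ISO_8859_1)));
    }
}
```

The UTF-8 round trip preserves 7247 4eee 540d, while the ISO-8859-1 round trip yields 3f 3f 3f, the code for '?'. Seeing 3f in the (int)char output of the test steps would therefore point at a lossy conversion before the data reached the string, rather than at a display problem.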
-- 
Mr. Mark Matthews
MySQL AB, Software Development Manager, J2EE and Windows Platforms
Office: +1 708 332 0507
www.mysql.com