Barley <[EMAIL PROTECTED]> wrote on 17/09/2004 15:17:11:

> Say, for example, I want to run an insert like the following:
>
>   java.sql.Statement select = conn.createStatement();
>   select.executeUpdate("update test set observerNote='\u201C ... \u00BC'");
>
> FWIW, u201C is an opening curly quote and u00BC is a fraction representing
> one quarter.
>
> If I create my JDBC url like this:
>
>   jdbc:mysql://localhost/test?user=test&password=test&useUnicode=true&characterEncoding=cp1250
>
> then the curly quote is successfully inserted, but not the 'one quarter'
> symbol. However, if I create the url in this way:
>
>   jdbc:mysql://localhost/test?user=test&password=test&useUnicode=true&characterEncoding=latin1
>
> then the 'one quarter' is inserted but not the curly quotes. I understand
> that the latin1 character set includes the 'one quarter' symbol, but not the
> curly quote and that the cp1250 character set includes the curly quote but
> not the 'one quarter' symbol, but I want a way where I don't have to choose
> a single limited pool of characters.
>
> How can I insert a String that contains both characters? Isn't there a way
> to enable JDBC/MySql ConnectorJ to be able to insert Strings containing any
> combination of Unicode characters?
>
> Many thanks to anyone who can clarify this issue.
This answer is stretching my knowledge of character sets, but it may help you, and if someone corrects me, it will help me too.

Latin1 and cp1250 (which is close to, but not identical to, latin2) are both 8-bit character sets. By selecting them, you are telling MySQL to map down from the 16-bit Unicode characters in your Java strings to one of two different, and incompatible, 8-bit character sets, then to map back up again on retrieval. When it maps down from Unicode to latinX, characters which have no mapping in that character set are, I think, converted to the standard "unknown character" symbol (usually '?'), and thus lost.

What you actually want is true Unicode storage, and for this you need to specify a Unicode character set. As I understand it, MySQL has two such character sets: utf8 and ucs2. Strictly speaking, only ucs2 is a fixed 16-bit encoding (two bytes per character); utf8 is variable-width, using one byte for ASCII characters and two or three bytes for everything else. Either of those will store both your extended characters.

Which you use depends on your exact needs. If you are largely storing Latin text with a few funny characters, you probably want utf8, because most of your characters will then take a single byte. If you are largely storing non-Latin characters, you probably want ucs2.

If you have not already done so, I suggest you study the manual page on the difference between character sets and collations. It is not simple, but it is very logical, and when you understand it, it makes this sort of problem much easier.

If you are only using Java, it is much the easiest to stick to one of the two Unicode character sets and just change collation if you need to. If you need to mix Java with 8-bit languages such as C/C++, it gets more complicated.

Alec
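
To make the "mapping down" problem concrete, here is a minimal Java sketch that needs no database at all. It only asks the JVM whether each character set has a mapping for the two characters from the question. The class and method names (CharsetCheck, canStore, roundTrip) are made up for illustration, and I am assuming your JVM ships the windows-1250 charset, which full JDK installs normally do.

```java
import java.nio.charset.Charset;

public class CharsetCheck {
    // The two characters from the original question:
    // U+201C (opening curly quote) and U+00BC (one quarter).
    static final String SAMPLE = "\u201C ... \u00BC";

    // True only if every character of text has a mapping in the named charset.
    static boolean canStore(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    // Encode to bytes and decode again, which is roughly what happens when a
    // string is mapped down to an 8-bit charset and later mapped back up.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for any character it cannot map.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // ISO-8859-1 is Java's name for latin1; windows-1250 is cp1250.
        for (String name : new String[] {"ISO-8859-1", "windows-1250", "UTF-8"}) {
            System.out.println(name
                + ": canStore=" + canStore(SAMPLE, name)
                + ", survivesRoundTrip=" + roundTrip(SAMPLE, name).equals(SAMPLE));
        }
    }
}
```

Only UTF-8 should report true for both checks; each 8-bit charset loses one of the two characters. On the Connector/J side, I believe the equivalent step is to pass characterEncoding=UTF-8 (together with useUnicode=true) in the JDBC URL, and the column itself also has to use a Unicode character set such as utf8, otherwise the server will map down again when it stores the value.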