Hello Sven,

we generally do not encourage storing anything other than ASCII/ISO-8859-1 in ASCII columns, because the charset itself is not stored in the database. Our experience is that, for example, a wrongly or differently set locale confuses many people with unwanted byte-string conversions, and you may end up with data no one can read or decipher, because the encoding/charset originally used is lost.
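To make that failure mode concrete, here is a minimal, self-contained Java sketch (not MaxDB-specific; the string and the two charsets are purely illustrative) of what happens when the writing client and the reading client assume different charsets for the same stored bytes:

    import java.nio.charset.StandardCharsets;

    public class LostCharsetDemo {
        public static void main(String[] args) {
            // A client whose locale is UTF-8 writes this string into an ASCII column.
            String original = "Köhler";
            byte[] storedBytes = original.getBytes(StandardCharsets.UTF_8);

            // The database keeps only the raw bytes; the charset the writer used
            // is not recorded anywhere.

            // A second client, whose locale is ISO-8859-1, reads the same column.
            String misread = new String(storedBytes, StandardCharsets.ISO_8859_1);

            System.out.println(original); // Köhler
            System.out.println(misread);  // KÃ¶hler -- the original encoding is lost
        }
    }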
If you had tried to connect with unicode=yes to a pure ASCII database, you would have noticed that it returns an error, i.e. that case cannot occur. So for JDBC we suggest using a Unicode database whenever you store anything other than ASCII. (You can still create ASCII columns in a Unicode database if you really need them.)

Regards
Alexander Schröder
SAP Labs Berlin

-----Original Message-----
From: news [mailto:[EMAIL PROTECTED] On Behalf Of Sven Köhler
Sent: Sunday, 4 September 2005 18:50
To: [email protected]
Subject: Client-APIs, unicode and charsets

Hi,

I have always wondered what the JDBC driver does when it is not accessing a Unicode database. What it does is quite inconvenient: as far as I know, it assumes ISO-8859-1. In my opinion, the charset used for binary data _should_ be a parameter of the connection URL, and it should default to Java's default charset, or to ISO-8859-1 if you prefer that.

There is another problem as well: what happens if I connect with "unicode=yes" to a non-Unicode database? I guess the MaxDB kernel converts the Unicode strings back to byte strings, but which charset does it use for that? I guess the same question applies to writing strings to the database.

In my opinion, the JDBC driver should default to "unicode=yes", but with an adjustable charset for all conversions from Unicode strings to byte strings, including the conversions that take place inside the MaxDB kernel. As things stand, non-Unicode databases are unusable for JDBC clients if the applications that write into the database (non-JDBC clients) do not use ISO-8859-1. In most cases, the charset those applications use will depend on their environment (i.e. the locale).

The main point is: I suspect that byte strings are copied into the database unchecked; you cannot assume that they are ISO-8859-1 or anything else. They are just byte strings. On the other hand, Unicode databases are currently unusable for clients like DBD::MaxDB, ODBC (using the byte-string API), ...

Everything I have said also applies to ODBC (if the ODBC Unicode API is used). The ODBC driver does not accept a charset parameter either, so any conversion that takes place is again based on a charset that is either forced by the current locale settings or hard-coded.

Are you aware of all these problems? When will this change?

Thanks
Sven
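As an aside for anyone stuck in the situation Sven describes: because ISO-8859-1 maps every byte value one-to-one to a code point, a JDBC client can recover the raw column bytes from the string the driver delivers and re-decode them itself with whatever charset the writing application actually used. Below is a sketch of that client-side workaround, assuming the driver decodes non-Unicode columns as ISO-8859-1 as discussed above; the driver class and URL are as commonly documented for the MaxDB JDBC driver, and the host, database, table, credentials and writer charset are purely illustrative.

    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class RedecodeExample {
        public static void main(String[] args) throws Exception {
            // Driver class name as commonly documented for the MaxDB JDBC driver;
            // adjust if your installation differs.
            Class.forName("com.sap.dbtech.jdbc.DriverSapDB");

            // The charset the (non-JDBC) writing application actually used,
            // typically determined by its locale. UTF-8 is only an example.
            Charset writerCharset = StandardCharsets.UTF_8;

            try (Connection con = DriverManager.getConnection(
                         "jdbc:sapdb://localhost/MYDB", "user", "password");
                 Statement stmt = con.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT name FROM customers")) {

                while (rs.next()) {
                    // The driver has decoded the raw column bytes as ISO-8859-1.
                    String asDelivered = rs.getString(1);

                    // ISO-8859-1 maps bytes 0..255 one-to-one to code points 0..255,
                    // so this round trip recovers the original bytes and re-decodes
                    // them with the charset the writing application really used.
                    byte[] rawBytes = asDelivered.getBytes(StandardCharsets.ISO_8859_1);
                    String decoded = new String(rawBytes, writerCharset);

                    System.out.println(decoded);
                }
            }
        }
    }

This only helps on the read side; it does not change what the kernel does when converting Unicode strings back to byte strings, which is the part Sven asks about.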
