RE: Multiple languages in the same column

2003-12-12 Thread Yayati Kasralikar

We are using MySQL 4.1.1 with the mysql-connector-java-3.1 JDBC driver
(nightly snapshot).

The only way we can store and display Unicode content is by specifying
it in the JDBC connection string (URL), like:
jdbc:mysql://localhost/database_name?useUnicode=true&characterEncoding=UTF-8

AND specifying the table/column type as CHARACTER SET utf8, e.g.:

create table test_table (col1 VARCHAR(10) CHARACTER SET utf8)

None of the following worked unless we also specified it in the JDBC
connection string (URL), like:
jdbc:mysql://localhost/database_name?useUnicode=true&characterEncoding=UTF-8

1. We tried to set the database's default character set to UTF-8, like:
mysql> alter database database_name default character set utf8;

2. We tried to specify the table/column type as CHARACTER SET utf8.

3. We tried to set the default character set to UTF-8 in my.ini by adding
the following line:
default-character-set=utf8

Is my understanding correct?

Thanks

-Yayati
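
For illustration only, a minimal Java sketch of the connection setup described above. The two URL parameters are separated by `&`. The class name, the method names, and the user/password arguments are hypothetical; `open` assumes the MySQL driver is on the classpath and a server is reachable, so only `buildUrl` is exercised in `main`.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class JdbcUrlSketch {
    // Builds the JDBC URL; note the '&' between the two parameters.
    static String buildUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Opens a connection with that URL. Requires the MySQL JDBC driver and a
    // running server; user and password are placeholders (assumptions).
    static Connection open(String host, String database, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(buildUrl(host, database), user, password);
    }

    public static void main(String[] args) {
        // Only prints the URL; no database access happens here.
        System.out.println(buildUrl("localhost", "database_name"));
    }
}
```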


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Multiple languages in the same column

2003-12-12 Thread Mark Matthews

Yayati Kasralikar wrote:

> We are using MySQL 4.1.1 with the mysql-connector-java-3.1 JDBC driver
> (nightly snapshot).
>
> The only way we can store and display Unicode content is by specifying
> it in the JDBC connection string (URL), like:
>
> jdbc:mysql://localhost/database_name?useUnicode=true&characterEncoding=UTF-8
>
> AND specifying the table/column type as CHARACTER SET utf8, e.g.:
>
> create table test_table (col1 VARCHAR(10) CHARACTER SET utf8)
>
> None of the following worked unless we also specified it in the JDBC
> connection string (URL), like:
>
> jdbc:mysql://localhost/database_name?useUnicode=true&characterEncoding=UTF-8
>
> 1. We tried to set the database's default character set to UTF-8, like:
> mysql> alter database database_name default character set utf8;
>
> 2. We tried to specify the table/column type as CHARACTER SET utf8.
>
> 3. We tried to set the default character set to UTF-8 in my.ini by
> adding the following line:
> default-character-set=utf8
>
> Is my understanding correct?

Yes. If you are going to mix character sets in the queries you are
_sending_ to the server from the JDBC driver, then you need to tell the
client to use UTF-8, like you have in your URL. I'll work on adding the
ability for the driver to autodetect when your server's 'charset_client'
is 'UTF-8'.


-Mark
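
To illustrate why the client-side encoding matters when languages are mixed, here is a small Java sketch that encodes a string and decodes it again with a given charset. It does not touch the driver or the server; `ISO_8859_1` is only an assumed stand-in for a single-byte, non-Unicode connection encoding, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSketch {
    // Encodes the text to bytes with the given charset, then decodes it back.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // German and Japanese text in one string, written with Unicode escapes
        // so the source file itself stays ASCII.
        String mixed = "Gr\u00fc\u00dfe \u65e5\u672c\u8a9e";

        // UTF-8 can represent every character, so the text survives intact.
        System.out.println(roundTrip(mixed, StandardCharsets.UTF_8).equals(mixed));

        // A single-byte Latin charset keeps the German letters but loses the
        // Japanese characters.
        System.out.println(roundTrip(mixed, StandardCharsets.ISO_8859_1).equals(mixed));
    }
}
```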
--
Mr. Mark Matthews
MySQL AB, Software Development Manager, J2EE and Windows Platforms
Office: +1 708 557 2388
www.mysql.com




RE: Multiple languages in the same column

2003-12-12 Thread Yayati Kasralikar
Hello Mark,

Thanks for your help. I have one more question.

I am using some tables with the utf8 character set and some tables with
the latin1 character set. I take the JDBC connection string from the
properties file, with characterEncoding=UTF-8. I am not changing the JDBC
connection string, but I need to access these tables with different
character sets. I am getting the following error:
Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and
(utf8_general_ci,COERCIBLE) for operation '='
I can make all my tables utf8 to make my application work.

Is this the only choice I have?

Thank you,

-Yayati
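
One possible alternative to converting every table, sketched here under assumptions and not confirmed anywhere in this thread, is to convert the latin1 column to utf8 inside the comparison with MySQL's `CONVERT(expr USING charset)`, so that both sides of `=` use the same character set. The Java helper below only builds such a query string; the table and column names are hypothetical, and wrapping a column this way may prevent the server from using an index on it.

```java
class CollationSketch {
    // Wraps a column expression so the server compares it as utf8.
    static String asUtf8(String column) {
        return "CONVERT(" + column + " USING utf8)";
    }

    // Builds a parameterized lookup against a latin1 column. The '?' is a
    // JDBC placeholder to be bound through a PreparedStatement.
    static String lookupQuery(String table, String column) {
        return "SELECT * FROM " + table + " WHERE " + asUtf8(column) + " = ?";
    }

    public static void main(String[] args) {
        // Hypothetical table and column names, for illustration only.
        System.out.println(lookupQuery("latin1_table", "name"));
    }
}
```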






Re: Multiple languages in the same column

2003-12-11 Thread Mark Matthews

Puny Sen wrote:
> Hi All,
>
> I'd like to use the same column to store content from multiple languages
> (English, German, French, Japanese).
>
> Here is my understanding of the options available.
>
> In MySQL 4.0:
>
> - UTF-8 is not currently available as a charset

True.

> - we can connect to the database using
> useUnicode=true&characterEncoding=UTF-8 in the connection string.

True.

> - this enables us to store, search and retrieve Unicode content from the
> column, as long as we always use JDBC with the above connection string, to
> interact with the db.

True.

> - sorting will not work on the column

True.


> In MySQL 4.1:
>
> - UTF-8 is available as a charset

Yes, but remember, UTF-8 is an _encoding_ that can store many different
character sets; there is a difference.

> - We still need to connect to the database using the above connection
> string (doesn't seem to work otherwise)

Unless you set your database's default character set to UTF-8, then yes,
you do still need to have 'useUnicode=true&characterEncoding=UTF-8' in
your URL, which tells the driver that you will be mixing character sets
in your queries (so encode them as UTF-8), and also tells the server to
expect your queries to be encoded in UTF-8 (the driver does a 'SET NAMES
utf8' on connect in this case).

> - sorting will work, but only using the general utf8 collation (may
> not work for Japanese?). More collations will be available soon.

True. If you know the column charset and collation that you want to use,
you should be able to use CAST on it to get it to a different charset,
and then sort using a compatible collation.

> - [can we cast/convert to a different charset (sjis) and use its collation
> for sorting? (performance is not really an issue)]

I guess I just answered that above :)


> Please let me know if any of these assumptions are incorrect.

They seem to be correct. Please let me know if you run into any issues
or inconsistencies with these assumptions, because the combination of
Unicode and UTF-8 support in the JDBC driver and the server is new (and
can sometimes be complex, due to the flexibility it offers), and we'd
like to get any kinks worked out ASAP!

-Mark
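
As a rough sketch of the sorting idea above, the Java helper below builds a query that converts a utf8 column to sjis and sorts with an sjis collation. It uses MySQL's `CONVERT(expr USING charset)` form in place of the CAST mentioned in the reply, and it assumes the server provides the `sjis_japanese_ci` collation. The table and column names are hypothetical, and text that sjis cannot represent would not convert cleanly.

```java
class SortSketch {
    // Builds a SELECT that orders rows by the column converted to sjis,
    // using an sjis collation for the comparison.
    static String orderBySjis(String table, String column) {
        return "SELECT " + column + " FROM " + table
                + " ORDER BY CONVERT(" + column + " USING sjis)"
                + " COLLATE sjis_japanese_ci";
    }

    public static void main(String[] args) {
        // Hypothetical names; prints the query text without running it.
        System.out.println(orderBySjis("test_table", "col1"));
    }
}
```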




Multiple languages in the same column

2003-12-10 Thread Puny Sen
Hi All,

I'd like to use the same column to store content from multiple languages
(English, German, French, Japanese).

Here is my understanding of the options available.

In MySQL 4.0:

- UTF-8 is not currently available as a charset
- we can connect to the database using
useUnicode=true&characterEncoding=UTF-8 in the connection string.
- this enables us to store, search and retrieve Unicode content from the
column, as long as we always use JDBC with the above connection string, to
interact with the db.
- sorting will not work on the column

In MySQL 4.1:

- UTF-8 is available as a charset
- We still need to connect to the database using the above connection string
(doesn't seem to work otherwise)
- sorting will work, but only using the general utf8 collation (may not work
for Japanese?). More collations will be available soon.
- [can we cast/convert to a different charset (sjis) and use its collation
for sorting? (performance is not really an issue)]

Please let me know if any of these assumptions are incorrect.

Thanks,
Puny Sen

