I just upgraded from Debian 3 (Perl 5.8.4 and MySQL 4.1) to Debian 4 (Perl 5.8.8 and MySQL 5.0 and DBD::mysql 3.0008) and have the following problem:

In the following lines I create a table with a varchar field of length 4 and then try, using a Perl script, to populate it with a string of 4 Unicode characters. Only the first 2 characters end up stored, in a "double-encoded" form (thus taking the space of 4 characters). Needless to say, this is a huge problem.


First the table:

mysql> create table bbb (a int primary key auto_increment, b varchar(4));
Query OK, 0 rows affected (0.00 sec)


Then the Perl script to populate the field:

#!/usr/bin/perl -w
use DBI;
my $dbh = DBI->connect("DBI:mysql:aaa", 'username', 'password', { RaiseError => 1 });
$dbh->do("insert into bbb set b = 'Αθήν'");


And then checking the result:

mysql> select * from bbb;
+---+------------+
| a | b          |
+---+------------+
| 1 | ΡÆÎ¸      |
+---+------------+
1 row in set (0.00 sec)


That was with default_character_set=utf8 under the [mysql] section of my.cnf.

Commenting out that line and viewing the table again, we get:

mysql> select *, char_length(b) from bbb;
+---+------+----------------+
| a | b    | char_length(b) |
+---+------+----------------+
| 1 | Αθ |              4 |
+---+------+----------------+
1 row in set (0.00 sec)


That is, only the first two letters made it into the table, doubly encoded so that they take up the space of 4 characters.
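
To illustrate what I think is happening, here is a minimal sketch in pure Perl (no database involved) that reproduces the same numbers. It assumes the server reads the script's UTF-8 bytes as single-byte ISO-8859-1 characters before truncating to the column width; that is my guess at the mechanism, not something I have confirmed, and MySQL's own latin1 may map some bytes differently. The name simulate_double_encoding is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: mimic what a varchar(N) column would keep if the
# client's UTF-8 bytes were misread as one character per byte.
# Assumption: the misreading is plain ISO-8859-1.
sub simulate_double_encoding {
    my ($text, $column_width) = @_;
    my $utf8_bytes = encode('UTF-8', $text);             # 2 bytes per Greek letter
    my $misread    = decode('ISO-8859-1', $utf8_bytes);  # every byte becomes a "character"
    my $kept       = substr($misread, 0, $column_width); # the column keeps N characters
    # Undo the misreading to see which real letters survived the truncation.
    my $recovered  = decode('UTF-8', encode('ISO-8859-1', $kept));
    return (length($misread), length($kept), $recovered);
}

my $word = "\x{391}\x{3B8}\x{3AE}\x{3BD}";   # the 4 Greek letters from the insert
my ($seen, $kept, $recovered) = simulate_double_encoding($word, 4);

binmode(STDOUT, ':encoding(UTF-8)');
print "characters seen after misreading: $seen\n";              # 8
print "characters kept by varchar(4):    $kept\n";              # 4
print "real letters that survive:        ", length($recovered), " ($recovered)\n";  # 2
```

Under that assumption the 4 letters turn into 8 one-byte "characters", the column keeps 4 of them, and those 4 decode back to only the first 2 letters, which matches the char_length(b) of 4 shown above.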

I'm desperate for a solution or a hint. If you run Debian, please try these short scripts on your machine and tell me whether you get the same results (or better ones).

Thanks.

P.S. I'm 99.9% positive the problem is not in my terminal's encoding: I uploaded the Perl script from another machine (one that's known to have no problem) and also tried inserting a 'use encoding "utf8";' pragma.
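
For anyone reproducing this, one way to rule out the script file's own encoding is to dump the bytes of the string literal. This is only a sketch: hex_bytes is a made-up helper name, and the check assumes the script is saved as UTF-8 and run without 'use utf8' or 'use encoding', so the literal is a plain byte string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: render a byte string as space-separated hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# The 4 Greek letters written as explicit UTF-8 byte escapes, so this
# reference value does not depend on how this file itself is saved.
my $expected = "\xCE\x91\xCE\xB8\xCE\xAE\xCE\xBD";
print hex_bytes($expected), "\n";   # CE 91 CE B8 CE AE CE BD
```

Passing the literal from the insert statement to hex_bytes and comparing the output with the line above would show whether the file really contains 8 UTF-8 bytes for the 4 letters.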

And thanks again.


