Re: Data truncated during character set conversion

2006-07-04 Thread Timur Izhbulatov
On Mon, Jul 03, 2006 at 01:55:26PM +0400, Timur Izhbulatov wrote:
 I'm following the instructions [1] to convert character set. Unfortunately I 
 get
 warnings about truncated data for some rows in several columns. All the
 truncated columns are text type.
 
 Assuming the `col' column is text type and actually contains correct utf8 data
 but has wrong character set I use the following queries to fix it:
 
 ALTER TABLE table MODIFY col BLOB;
 ALTER TABLE table MODIFY col TEXT CHARACTER SET utf8;
 
 After the second query I get warnings about truncated data for some rows and I
 can actually see the rows truncated. I played a bit with different data types
 (LONGBLOB/LONGTEXT) but wasn't successful.
 
 So my question is what can be the cause of the data loss and how to avoid it?

After some additional investigations I found the cause. It was some exotic
non-ASCII characters like '–' (long dash). At the same time Russian letters
don't cause any problems.

Seems the problem arises because the table itself also has wrong default
character set (latin1). Setting default character set to utf8 with ALTER TABLE
doesn't solve the problem. Bug if I create a new table with utf8 as default
character set convertion works fine.

What's happening? Please see the testcase attached.

My character set settings are:

SHOW VARIABLES LIKE '%char%';
+--+--+
| Variable_name| Value|
+--+--+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database   | utf8 |
| character_set_results| utf8 |
| character_set_server | latin1   |
| character_set_system | utf8 |
| character_sets_dir   | /usr/local/share/mysql/charsets/ |
+--+--+

Cheers,
-- 
Timur Izhbulatov
OILspace, 26 Leninskaya sloboda, bld. 2, 2nd floor, 115280 Moscow, Russia
P:+7 495 105 7245 + ext.205 F:+7 495 105 7246 E:[EMAIL PROTECTED]
Building Successful Supply Chains - One Solution At A Time.
www.oilspace.com
-- MySQL dump 10.9
--
-- Host: localhostDatabase: tizhbulatov_aquarium
-- --
-- Server version   4.1.19

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 
*/;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `stories`
--

DROP TABLE IF EXISTS `test_table`;
CREATE TABLE `test_table` (
  `overview` text
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

ALTER TABLE `test_table` DEFAULT CHARACTER SET utf8;

INSERT INTO `test_table` SET `overview` = '–';

SELECT * FROM `test_table`;

ALTER TABLE `test_table` MODIFY `overview` BLOB;

ALTER TABLE `test_table` MODIFY `overview` text CHARACTER SET utf8;
SHOW WARNINGS;

/*!40101 SET [EMAIL PROTECTED] */;
/*!40014 SET [EMAIL PROTECTED] */;
/*!40014 SET [EMAIL PROTECTED] */;
/*!40101 SET [EMAIL PROTECTED] */;
/*!40101 SET [EMAIL PROTECTED] */;
/*!40101 SET [EMAIL PROTECTED] */;
/*!40111 SET [EMAIL PROTECTED] */;


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Data truncated during character set conversion

2006-07-04 Thread Timur Izhbulatov
On Tue, Jul 04, 2006 at 01:55:30PM +0400, Timur Izhbulatov wrote:
 After some additional investigations I found the cause. It was some exotic
 non-ASCII characters like '–' (long dash). At the same time Russian letters
 don't cause any problems.

Sorry, I was wrong concerning Russian letters. Acutally *any* non-ASCII
character causes the problem. The problem seems to be limited only to TEXT
fields, non-ASCII characters (including Russian) in CHAR/VARCHAR fields are OK. 

-- 
Timur Izhbulatov
OILspace, 26 Leninskaya sloboda, bld. 2, 2nd floor, 115280 Moscow, Russia
P:+7 495 105 7245 + ext.205 F:+7 495 105 7246 E:[EMAIL PROTECTED]
Building Successful Supply Chains - One Solution At A Time.
www.oilspace.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Data truncated during character set conversion

2006-07-03 Thread Timur Izhbulatov
Hi all,

I'm following the instructions [1] to convert character set. Unfortunately I get
warnings about truncated data for some rows in several columns. All the
truncated columns are text type.

Assuming the `col' column is text type and actually contains correct utf8 data
but has wrong character set I use the following queries to fix it:

ALTER TABLE table MODIFY col BLOB;
ALTER TABLE table MODIFY col TEXT CHARACTER SET utf8;

After the second query I get warnings about truncated data for some rows and I
can actually see the rows truncated. I played a bit with different data types
(LONGBLOB/LONGTEXT) but wasn't successful.

So my question is what can be the cause of the data loss and how to avoid it?

Thanks in advance.

[1] http://dev.mysql.com/doc/refman/4.1/en/charset-conversion.html

Cheers,

-- 
Timur Izhbulatov
OILspace, 26 Leninskaya sloboda, bld. 2, 2nd floor, 115280 Moscow, Russia
P:+7 495 105 7245 + ext.205 F:+7 495 105 7246 E:[EMAIL PROTECTED]
Building Successful Supply Chains - One Solution At A Time.
www.oilspace.com

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]