Re: loading 0x00A0 into mysql
Hi,

Can you try passing the dump file through hexdump or some binary editor to see whether the data is actually there? Most text editors will treat 0x00 as end of string, and most likely this is causing the problem.

Additionally, you can try running the import with --default-character-set=utf8, in case the default charset is something else, like this:

  mysql --default-character-set=utf8 --user=me test_database dump_file

Dobromir

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql
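To make the hexdump suggestion concrete, here is a small shell sketch (the sample file name is made up; with the real file you would run `hexdump -C dump_file`). In UTF-8, a non-breaking space is the byte pair C2 A0, so a lone A0 byte in the dump would mean the data is not valid UTF-8 (e.g. latin1):

```shell
# Write a sample containing a UTF-8 non-breaking space (octal
# \302\240 = hex C2 A0), then look at the raw bytes.
printf 'some text \302\240 more text\n' > /tmp/sample.txt
hexdump -C /tmp/sample.txt
# A well-formed UTF-8 nbsp shows up as the byte pair "c2 a0";
# a bare "a0" byte indicates a single-byte encoding such as latin1.
```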
loading 0x00A0 into mysql
Hello,

I'm trying to make a copy of a mysql database and having an issue loading fields containing a particular character. I've taken a dump using mysqldump and then try to load it into a new database using

  mysql --user=me test_database dump_file

Everything appears to work except that text fields containing a Unicode non-breaking space (0x00A0) are truncated just before that character. I can see the field in the dump file and it looks OK, but it doesn't all make it into the new database.

My Unicode foo is not strong, and it's my first day back at work :( so I'm hoping somebody recognises this problem immediately. What I do know:

Server version: 5.5.34-0ubuntu0.12.04.1 (Ubuntu)

Extract from dump_file including offending character:

  --
  -- Table structure for table `text_for_pages`
  --

  DROP TABLE IF EXISTS `text_for_pages`;
  /*!40101 SET @saved_cs_client = @@character_set_client */;
  /*!40101 SET character_set_client = utf8 */;
  CREATE TABLE `text_for_pages` (
    `page` text NOT NULL,
    `keytext` text NOT NULL,
    `value` text,
    PRIMARY KEY (`page`(30),`keytext`(30))
  ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
  /*!40101 SET character_set_client = @saved_cs_client */;

  --
  -- Dumping data for table `text_for_pages`
  --

  LOCK TABLES `text_for_pages` WRITE;
  /*!40000 ALTER TABLE `text_for_pages` DISABLE KEYS */;
  INSERT IGNORE INTO `text_for_pages` (`page`, `keytext`, `value`) VALUES ('About','left-column','text value'),('About','main-column','some text ... A0 ... more text ...');
  /*!40000 ALTER TABLE `text_for_pages` ENABLE KEYS */;

The value created in the database is just 'some text ... '

Cheers, Dave
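For what it's worth, the truncation symptom can be reproduced outside mysql. This is a sketch, assuming the dump actually contains a bare A0 byte (a latin1 nbsp) while the reader expects UTF-8: a strict UTF-8 decoder stops at the first invalid byte, which matches "truncated just before that character" (octal \240 = hex A0, \302\240 = hex C2 A0):

```shell
# A bare 0xA0 is not valid UTF-8; iconv converts the valid prefix and
# then stops with an error, mirroring the truncation seen in mysql.
if printf 'some text \240 more text' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
then
    echo "bare A0 accepted as UTF-8"
else
    echo "bare A0 rejected as UTF-8"
fi

# The two-byte sequence C2 A0 (proper UTF-8 for U+00A0) goes through:
printf 'some text \302\240 more text' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 \
    && echo "C2 A0 accepted as UTF-8"
```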
Re: loading 0x00A0 into mysql
2014/01/06 12:18 +, Dave Howorth:
> Everything appears to work except that text fields containing a Unicode
> non-breaking space (0x00A0) are truncated just before that character. I
> can see the field in the dump file and it looks OK, but it doesn't all
> make it into the new database.

Well, there are too many aspects to this, but the first is the character set that mysql expects for input. If, say, it is US-ASCII (note that the character set mysql takes for input need not be related to the character set of the table), the nbsp is out of range.

(It is, of course, not nice if mysqldump yields an output that mysql cannot read.)

Try entering it with some escape sequence (this one is based on the original SQL, with features from PL1 rather than C, which MySQL supports if 'ANSI' is in sql_mode):

  'some text ... ' || X'A0' || ' ... more text ...'

or (slightly less PL1)

  CONCAT('some text ... ', X'A0', ' ... more text ...')
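A hedged aside on the X'A0' idea: if the connection character set is utf8, a single A0 byte is not a complete character there either; U+00A0 encodes as the two bytes C2 A0 (octal \302\240). A sketch that writes the INSERT with that byte pair instead (table and column names follow the dump in the original post; the mysql invocation is illustrative):

```shell
# Write an INSERT whose nbsp is the proper UTF-8 byte pair C2 A0.
printf "INSERT INTO \`text_for_pages\` (\`page\`, \`keytext\`, \`value\`) VALUES ('About','main-column','some text \302\240 more text');\n" > /tmp/nbsp.sql

# Confirm the byte pair is really in the file:
grep -q "$(printf '\302\240')" /tmp/nbsp.sql && echo "UTF-8 nbsp present"

# Then load it with an explicit client character set:
#   mysql --default-character-set=utf8 --user=me test_database < /tmp/nbsp.sql
```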
Re: loading 0x00A0 into mysql
Thanks for the reply, and I apologize because I expect I've broken threading. The list isn't mailing the posts to me, so I've nothing to reply to; I've had to cut and paste from the web archive...

2014/01/06 12:18 +, Dave Howorth:
>> Everything appears to work except that text fields containing a Unicode
>> non-breaking space (0x00A0) are truncated just before that character. I
>> can see the field in the dump file and it looks OK, but it doesn't all
>> make it into the new database.
>
> Well, there are too many aspects to this, but the first is the character
> set that mysql expects for input. If, say, it is US-ASCII (note that the
> character set mysql takes for input need not be related to the character
> set of the table), the nbsp is out of range.

Hmm, is there any way to tell what character set mysql expects, or better yet to tell it what to read? Or can I tell mysqldump to encode its output differently? (I promise to RTFM, but want to get this question out there whilst I'm reading!)

> (It is, of course, not nice if mysqldump yields an output that mysql
> cannot read.)

Indeed; I'd go so far as to call that a bug. But that does seem to be what's happening.

> Try entering it with some escape sequence (this one is based on the
> original SQL, with features from PL1 rather than C, which MySQL supports
> if 'ANSI' is in sql_mode):

I don't understand the 'sql_mode', though I expect I can look that up too. But I did try these:

>   'some text ... ' || X'A0' || ' ... more text ...'

causes the contents of the field to be '1'.

> or (slightly less PL1)
>
>   CONCAT('some text ... ', X'A0', ' ... more text ...')

produces the same effect as embedding the character directly, i.e. the value of the field is truncated just before the problem character. However, substituting the string 'nbsp;' for the character does allow mysql to read past it.

I've now discovered that it also blows up on some other characters with the top bit set, such as 0x91. What's strange about that is that they used to work. So my first thought now is that something has changed recently. Perhaps an update to one of the servers or clients involved? I don't remember changing anything in my code, but I can't be absolutely sure.

Cheers, Dave
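If the recent change is that the dump now carries single-byte data (bare A0, 91, ...) while the client reads UTF-8, one hedged way out is to recode the dump before loading. A caveat: MySQL's 'latin1' is actually cp1252-based (0x91 there is a left single quote), so CP1252 is used as the iconv source name below; names and file paths are illustrative. What character set the server and client expect can be checked with SHOW VARIABLES:

```shell
# Sample single-byte data: \240 (0xA0, nbsp) and \221 (0x91, a cp1252
# left single quote -- MySQL's latin1 is cp1252-based).
printf 'some text \240 and \221 more\n' > /tmp/latin1_sample

# Recode to UTF-8 so every byte becomes a valid multi-byte sequence;
# with the real file: iconv -f CP1252 -t UTF-8 dump_file > dump_utf8
iconv -f CP1252 -t UTF-8 /tmp/latin1_sample > /tmp/utf8_sample
grep -q "$(printf '\302\240')" /tmp/utf8_sample && echo "0xA0 became C2 A0"

# To see which character sets the server and client are using:
#   mysql> SHOW VARIABLES LIKE 'character_set%';
```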