Re: loading 0x00A0 into mysql

2014-01-07 Thread Dobromir Velev
Hi,
Can you try passing the dump file through hexdump  or some binary editor to 
see if the data is there. Most text editors will treat 0x00 as end of string 
and this most likely this is causing the problem.

Additionally you can try running the import with  --default-character-set=utf8 
in case the default charset is something else
 
like this 

mysql --default-character-set=utf8 --user=me test_database   dump_file


Dobromir

Thanks for the reply, and I apologize because I expect I've broken
threading. The list isn't mailing the posts to me, so I've nothing to
reply to. I've had to cut and paste from the web archive...

 2014/01/06 12:18 +, Dave Howorth 
 Everything appears to work except that text fields containing a
 Unicode non-breaking space (0x00A0) are truncated just before that
 character. I can see the field in the dump file and it looks OK, but
 it doesn't all make it into the new database.

 Well, there are too many aspects to this, but the first is the
 character set that mysql expects for input. If, say, it is USASCII
 (note that between the character set that mysql takes for input and
 the character set in the table no association is needful), the nbsp
 is out of range.

Hmm, is there any way to tell what character set mysql expects, or
better yet to tell it what to read? Or can I tell mysqldump to encode
its output differently?

(I promise to RTFM, but want to get this question out there whilst I'm
reading!)

 (It is, of course, not nice if mysqldump yields an output that
 mysql cannot read.)

Indeed; I'd go so far as to call that a bug. But that does seem to be
what's happening.

 Try entering it with some escape-sequence (this one is based on the
 original SQL with features from PL1, not from C, which MySQL supports
 if 'ANSI' is in sql_mode):

I don't understand the 'sql_mode', though I expect I can look that up
too. But I did try these:

 'some text ... ' || X'A0' || ' ... more text ...'

causes the contents of the field to be '1'.

 or (slightly less PL1)

 CONCAT('some text ... ', X'A0', ' ... more text ...')

Produces the same effect as embedding the character directly. i.e. the
value of the field is truncated just before the problem character.

However, substituting for the character with the string 'nbsp;' does
allow mysql to read past it. I've now discovered that it also blows up
on some other characters with the top bit set such as 0x91. What's
strange about that is that they used to work. So my first thought now is
that something has changed recently. Perhaps an update to one of the
servers or clients involved? I don't remember changing anything in my
code, but I can't be absolutely sure.

Cheers, Dave

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



loading 0x00A0 into mysql

2014-01-06 Thread Dave Howorth
Hello,

I'm trying to make a copy of a mysql database and having an issue
loading fields containing a particular character. I've taken a dump
using mysqldump and then try to load it into a new database using

  mysql --user=me test_database  dump_file

Everything appears to work except that text fields containing a Unicode
non-breaking space (0x00A0) are truncated just before that character. I
can see the field in the dump file and it looks OK, but it doesn't all
make it into the new database.

My Unicode foo is not strong, and it's my first day back at work :( so
I'm hoping somebody recognises this problem immediately.

What I do know:

Server version: 5.5.34-0ubuntu0.12.04.1 (Ubuntu)

Extract from dump_file including offending character:

-- Table structure for table `text_for_pages`
--

DROP TABLE IF EXISTS `text_for_pages`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `text_for_pages` (
  `page` text NOT NULL,
  `keytext` text NOT NULL,
  `value` text,
  PRIMARY KEY (`page`(30),`keytext`(30))
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `text_for_pages`
--

LOCK TABLES `text_for_pages` WRITE;
/*!4 ALTER TABLE `text_for_pages` DISABLE KEYS */;
INSERT  IGNORE INTO `text_for_pages` (`page`, `keytext`, `value`) VALUES
('About','left-column','text value'),('About','main-column','some text
... A0 ... more text ...');
/*!4 ALTER TABLE `text_for_pages` ENABLE KEYS */;


The value created in the database is just 'some text ... '

Cheers, Dave

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: loading 0x00A0 into mysql

2014-01-06 Thread hsv
 2014/01/06 12:18 +, Dave Howorth 
Everything appears to work except that text fields containing a Unicode
non-breaking space (0x00A0) are truncated just before that character. I
can see the field in the dump file and it looks OK, but it doesn't all
make it into the new database. 

Well, there are too many aspects to this, but the first is the character set 
that mysql expects for input. If, say, it is USASCII (note that between the 
character set that mysql takes for input and the character set in the table 
no association is needful), the nbsp is out of range. (It is, of course, not 
nice if mysqldump yields an output that mysql cannot read.) Try entering it 
with some escape-sequence (this one is based on the original SQL with features 
from PL1, not from C, which MySQL supports if 'ANSI' is in sql_mode):

'some text ... ' || X'A0' || ' ... more text ...'

or (slightly less PL1)

CONCAT('some text ... ', X'A0', ' ... more text ...')


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: loading 0x00A0 into mysql

2014-01-06 Thread Dave Howorth
Thanks for the reply, and I apologize because I expect I've broken
threading. The list isn't mailing the posts to me, so I've nothing to
reply to. I've had to cut and paste from the web archive...

 2014/01/06 12:18 +, Dave Howorth 
 Everything appears to work except that text fields containing a
 Unicode non-breaking space (0x00A0) are truncated just before that
 character. I can see the field in the dump file and it looks OK, but
 it doesn't all make it into the new database.

 Well, there are too many aspects to this, but the first is the
 character set that mysql expects for input. If, say, it is USASCII
 (note that between the character set that mysql takes for input and
 the character set in the table no association is needful), the nbsp
 is out of range.

Hmm, is there any way to tell what character set mysql expects, or
better yet to tell it what to read? Or can I tell mysqldump to encode
its output differently?

(I promise to RTFM, but want to get this question out there whilst I'm
reading!)

 (It is, of course, not nice if mysqldump yields an output that
 mysql cannot read.)

Indeed; I'd go so far as to call that a bug. But that does seem to be
what's happening.

 Try entering it with some escape-sequence (this one is based on the
 original SQL with features from PL1, not from C, which MySQL supports
 if 'ANSI' is in sql_mode):

I don't understand the 'sql_mode', though I expect I can look that up
too. But I did try these:

 'some text ... ' || X'A0' || ' ... more text ...'

causes the contents of the field to be '1'.

 or (slightly less PL1)

 CONCAT('some text ... ', X'A0', ' ... more text ...')

Produces the same effect as embedding the character directly. i.e. the
value of the field is truncated just before the problem character.

However, substituting for the character with the string 'nbsp;' does
allow mysql to read past it. I've now discovered that it also blows up
on some other characters with the top bit set such as 0x91. What's
strange about that is that they used to work. So my first thought now is
that something has changed recently. Perhaps an update to one of the
servers or clients involved? I don't remember changing anything in my
code, but I can't be absolutely sure.

Cheers, Dave

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql