To go along with what Rick is saying, this link might help you: 
http://dba.stackexchange.com/questions/10467/how-to-convert-control-characters-in-mysql-from-latin1-to-utf-8

I remember having to find and convert a lot of control characters by their 
HEX() values (such as an apostrophe copied from a Word document) before 
attempting the SET NAMES.
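
For what it's worth, here is a minimal Python sketch of that kind of 
byte-level check. It is not taken from the linked page. It assumes the stray 
apostrophe is a Windows-1252 "smart quote" (byte 0x92), which is only a guess 
until the real HEX() output is known. The sample bytes and the helper names 
describe() and cp1252_to_utf8() are made up for illustration.

```python
def describe(raw: bytes) -> str:
    """Classify a byte string as pure ASCII, valid UTF-8, or neither."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8 (possibly cp1252/latin1)"


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes that are ASSUMED to be Windows-1252 as UTF-8."""
    return raw.decode("cp1252").encode("utf-8")


# Hypothetical sample: "observatory's" with the apostrophe stored as 0x92.
sample = b"observatory\x92s"

# Roughly what SELECT HEX(body) would show for these bytes.
print(sample.hex().upper())

# 0x92 on its own is not valid UTF-8, so the check flags it.
print(describe(sample))

# Decoding as cp1252 recovers the curly apostrophe (U+2019).
text = sample.decode("cp1252")

# Forcing the text into ASCII loses the character and leaves '?',
# which matches the "observatory?s" result reported in the thread.
print(text.encode("ascii", errors="replace"))

# Re-encoding as UTF-8 keeps the character as the bytes E2 80 99.
print(cp1252_to_utf8(sample).hex().upper())
```

If HEX(body) shows 92 where the apostrophe should be, that points to 
cp1252-style input; E28099 in that position would mean the text is already 
UTF-8.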

Derek Downey


On Sep 24, 2012, at 7:53 PM, Rick James wrote:

> If you have a mixture of encodings, you are in deep doodoo.
> 
> This page describes some debugging techniques and some issues:
>   http://mysql.rjweb.org/doc.php/charcoll
> 
> That apostrophe might be MicroSquish's "smart quote".
> 
> Can you provide SELECT HEX(the_field) FROM... ?  We (or the above page) might 
> be able to interpret the character.
> 
> To prevent future char set issues, you must know what encoding the source is. 
> Then, with SET NAMES (etc), you tell mysqld that the bytes you have in hand 
> are encoded that way.  mysqld will then convert those bytes to the character 
> set declared for the column they go in.  (Presumably, all the text columns 
> will be declared utf8 or utf8mb4.)
> 
>> -----Original Message-----
>> From: Mark Phillips [mailto:m...@phillipsmarketing.biz]
>> Sent: Monday, September 24, 2012 4:28 PM
>> To: Mysql List
>> Subject: Need Help Converting Character Sets
>> 
>> I have a table, Articles, of news articles (in English) with three text
>> columns for the intro, body, and caption. The data came from a web
>> page, and the content was cut and pasted from other sources. I am
>> finding that there are some non utf-8 characters in these three text
>> columns. I would like to (1) convert these text fields to be strict
>> utf-8 and then (2) fix the input page to keep all new submissions
>> utf-8.
>> 
>> (1) For the first step, fixing the current database, I tried:
>> 
>> update Articles set body = CONVERT(body USING ASCII);
>> 
>> However, when I checked one of the articles I found an apostrophe had
>> been converted into a question mark. (FWIW, the apostrophe was one of
>> those offending non utf-8 characters):
>> 
>> Before conversion: "I stepped into the observatory's control room ..."
>> 
>> After conversion: "I stepped into the observatory?s control room..."
>> 
>> Is there a better way to accomplish my first goal, without reading each
>> article and manually making the changes?
>> 
>> (2) For the second goal, ensuring that all future articles are utf-8,
>> do I need to change the table structure or the insert query to ensure I
>> get the correct utf-8 characters into the database?
>> 
>> Thanks,
>> 
>> Mark