Dawn Friedland wrote:

>Prior to my client requesting that I add Japanese content to the content
>tool & database, I had zero experience with character sets other than
>Latin. I always used Notepad to filter out any weird MS Word formatting
>and left the default as ANSI. 
>
I had that problem a year ago too, prior to doing Japanese database work.

>Many people have recommended I use UTF-8. I interpreted that to mean
>that when I have the Japanese text in notepad, I choose file, save as,
>and then choose the encoding as UTF-8. When I do that, and then
>copy/paste to insert using the DOS prompt, I get the same problematic
>results. Is there something I am missing or not understanding when
>people tell me to "use UTF-8" .... Am I supposed to configure the table
>or database somehow to use it or should I be running the text through a
>UTF-8 converter other than notepad? 
>  
>
I wouldn't rely on your command prompt to be UTF-8 compliant; I'd 
recommend inserting data using a web interface if nothing else (or your 
own Unicode-compatible client) into a BINARY field (not TEXT) unless you 
have MySQL with Unicode support.  Treat the data as binary _everywhere_; 
pretend you can't translate it, etc., except with safe tools (like the 
iconv library on *nix).  UTF-8 is just one encoding of Unicode; you may 
get more mileage in Windows using 16-bit Unicode (UTF-16).
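
To make the "UTF-8 is just an encoding" point concrete, here is a rough 
Python sketch (not something from the original poster's setup).  The 
sample string and the convert() helper are my own illustrative 
assumptions; convert() only mimics what "iconv -f SRC -t DST" does, 
using Python's built-in codecs.

```python
# Hypothetical sample text; any Japanese string would do.
text = "日本語"

# The same three characters produce different byte sequences
# depending on which Unicode encoding is chosen.
utf8_bytes = text.encode("utf-8")        # 3 bytes per character here
utf16_bytes = text.encode("utf-16-le")   # 2 bytes per character here


def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from one charset to another (illustrative helper)."""
    return data.decode(src).encode(dst)


# Example: legacy Shift_JIS bytes converted to UTF-8 before storage.
sjis_bytes = text.encode("shift_jis")
converted = convert(sjis_bytes, "shift_jis", "utf-8")

print(len(utf8_bytes), len(utf16_bytes), converted == utf8_bytes)
```

The byte counts differ between the two encodings even though the text 
is identical, which is why the bytes should only ever be changed by a 
tool that knows both the source and the target encoding.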

See: http://www.unicode.org/ for reference, especially 
http://www.unicode.org/unicode/faq/basic_q.html.

To best deal with UTF-8 in a program, use dynamically-allocated strings 
and never assume things like the 4th char in a string being string[3]; 
a single character can take more than one byte in UTF-8.  "Pass-through" 
is the best way to deal with UTF-8 until you actually have to process it 
(do something to a Unicode/UTF-8 string): read it from a 
Unicode-compliant program / field / widget and write it straight to the 
DB without translations, then read it back when you need it, compare it 
against something if necessary, and display it.  Just because it looks 
like garbage when it's raw doesn't mean it _is_ garbage.

Others may have other tips ...

-- 
Michael T. Babcock
C.T.O., FibreSpeed Ltd.
http://www.fibrespeed.net/~mbabcock


