> This is for anyone out there storing Japanese characters along with
> English characters.

Hi.

> SUMMARY:
> The client recently requested that Japanese be stored in an otherwise
> standard English (Latin) MySQL database. Whereas all rows in the table
> used to be Latin only, now some rows store Latin and some store
> Japanese.

This should create no problems for MySQL. Drivers and other software
in between may do funny things.

Hmm. I wonder if the text is surviving your paste buffer, if you aren't
running the system in Japanese. In other words, I'm wondering if your
text survives the paste from Word to your publishing tool.

> I do not mix English with Japanese in the same row.

Actually, with one or two exceptions, shift-JIS and euc-JIS should
allow you to mix Japanese and English with no problems. Straight JIS
would have problems, however, because it is two-byte-only.

The problem characters are the ASCII backslash, which is the
(half-width) yen symbol in shift-JIS, and the ASCII tilde, which is
sometimes the (half-width) overbar in shift-JIS. I think euc-JIS does
the same substitutions, but it's been several months since I messed
with that.

> Upon
> writing Japanese data to the database (web form -> ASP -> MyODBC), and
> then viewing the record on a web page (Shift-Jis), I discover that
> random Japanese characters are being 'morphed' into other, seemingly
> random, Japanese characters, and very occasionally, 'morphed' into a
> Latin character (so far just the letter "t").

Can you catch what's going into IIS and what's coming out? Also, what's
going into MyODBC and what's coming out of that? I vaguely recall that
MyODBC sometimes coughs if not set up right.

Say, how are you declaring your charset? You know, the Content-Type
header or meta-tag, or the XML declaration.

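One way to catch what goes in and what comes out at each hop is to dump
the raw bytes and compare them. The sketch below is Python rather than
ASP, purely for illustration; `hex_bytes` and `has_backslash_trail_byte`
are made-up helper names, not part of MySQL or MyODBC. The second helper
tests one guess only: in shift-JIS some two-byte characters (表 and ソ,
for example) have 0x5C, the ASCII backslash code, as their second byte,
so anything between the form and the table that treats backslash as an
escape could damage them. Whether that is what is happening here is an
assumption until the bytes have been checked.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Return the bytes of `text` in `encoding` as space-separated hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


def has_backslash_trail_byte(text: str) -> bool:
    """True if any two-byte shift-JIS character in `text` ends in 0x5C.

    Hypothetical diagnostic: it only flags characters that an
    escape-processing layer might mangle; it proves nothing by itself.
    """
    for ch in text:
        try:
            raw = ch.encode("shift_jis")
        except UnicodeEncodeError:
            continue  # character has no shift-JIS form; skip it
        if len(raw) == 2 and raw[1] == 0x5C:
            return True
    return False


if __name__ == "__main__":
    # Dump a few sample strings in both encodings for side-by-side comparison.
    for sample in ("表", "ソ", "日本語"):
        print(sample,
              "sjis:", hex_bytes(sample, "shift_jis"),
              "euc:", hex_bytes(sample, "euc_jp"),
              "0x5C trail byte:", has_backslash_trail_byte(sample))
```

If the characters that get morphed turn out to be the ones the second
helper flags, the escaping layer is the first place to look. If not,
comparing dumps taken before and after each hop should still show where
the bytes change.
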
See:

HTML: http://www.w3.org/TR/html4/charset.html#h-5.2.2
  Header:
    Content-Type: text/html; charset=EUC-JP
  or Meta-tag:
    <META http-equiv="Content-Type" content="text/html; charset=EUC-JP">

XML: http://www.w3c.org/TR/2000/REC-xml-20001006#sec-prolog-dtd
    <?xml version="1.0" encoding="shift-jis" ?>

I think the driver may throw fits if you don't have the charset
declared right. (See above about mixing character sets.)

Hmm. You may find it easiest, if you are trying to display Japanese,
Chinese, Korean, and English on the same page, to use Unicode UTF-8
throughout.

> With the exception of
> these few, random characters, all the Japanese data looks fine *when
> displayed on a web page*.

Worst comes to worst, post the problem characters and what they're
supposed to be. (Or mail them to me direct.) I can take a look and see
what bit patterns are causing problems, and that should yield some
clues.

> This is a standard install of MySQL version
> 3.23.38-nt (on Windows 2000 SP2) - support for Japanese characters is
> installed by default, I assume.

Installed, but not selected.

> I also store Chinese and Korean
> characters in the same table, and those character sets are displayed
> without error.

I would expect errors there too, unless you're using Unicode (UTF-8)
for those.

> Question 1. If I were to pull the Japanese rows out and put them in a
> separate table - what do I do to the table to 'configure' it as storing
> sjis characters without setting the default character set to the entire
> database?

Your version of MySQL does not support that. The character set is set
in the my.ini or my.cnf configuration file, for the whole server, when
you start MySQL up. So changing the settings for Japanese won't solve
your problems unless you want to set up another instance (MSWindows, so
that probably means another machine) of MySQL just for the Japanese.

You shouldn't need to do that, however. The settings in my.cnf/my.ini
are primarily for sort and collation order. (And error messages. Ick.
There's a back-burner project I'd forgotten. Has the pan melted yet?)

> Question 2. How do I view Japanese records in the command line *in
> Japanese* to eliminate the possibility that the culprit is somewhere
> outside of MySQL, for example: Microsoft IIS or ASP or MyODBC?

Sorry. You'll need to set up a machine running in Japanese to do that,
as far as I know.

Well, if you know how to redirect to a file, and if you have a text
editor capable of displaying Japanese, that might get you a look at the
text. But it might introduce some other unknowns, as well.

I think you mentioned a colleague who can read Japanese? It might be
worth your while to, oh, wait, the MSW2k box is your server, so you
don't want to mess with that. It would be handy if the box your
publishing tool runs on could be set up to boot the OS into either
English or Japanese. (Mac OS X can set the language for the OS at
log-in time; seems like MSW ought to be able to at least switch on
boot.)

> Question 3. How do I tell which charset MySQL is using, euc-jis or
> s-jis?

It's Latin, unless you've set the character set in my.cnf/my.ini. It's
in the manual, section 4.6.

http://www.mysql.com/doc/en/index.html (index page for the manual)

HTH

--
Joel Rees <[EMAIL PROTECTED]>