Re: Storing foreign characters in DB

2004-09-06 Thread Monty
 From: [EMAIL PROTECTED]
 Date: Sun, 5 Sep 2004 22:39:39 -0500 (CDT)
 Subject: Re: Storing foreign characters in DB
 
 I am not using 3.xx versions anymore, but if I remember correctly they
 only allow a limited control for setting character sets.
 
 In order to be able to give you any advice you need to be more specific.
 
 Which character sets are set now for the server and the client as default ?
 Are the foreign characters in the same character set ?
 Are you attempting to store the foreign character together with your
 default characters in the same table / same DB ?
 
 Nils Valentin - Tokyo/Japan

Nils, thanks for your response.

I've done some reading recommended by you and Rhino (this is all new to me),
and have some more specific questions now. First off, I am using MySQL
3.23.58 with the following setup:

character set: latin1
character sets: latin1 big5 cp1251 cp1257 croat czech danish dec8 dos
estonia euc_kr gb2312 gbk german1 greek hebrew hp8 hungarian koi8_ru
koi8_ukr latin2 latin5 swe7 usa7 win1250 win1251 win1251ukr ujis sjis tis620

Here's what I believe is happening: foreign characters entered into a form
input field on a website are transmitted to my PHP script with UTF-8
encoding. When this form data is stored in the database as-is, I'm seeing
the strange double-characters when I pull it back out of the database
because MySQL is set for Latin1 encoding, which is a subset of ISO-8859-1
(right?). 
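
A minimal sketch of the symptom described above, written in Python rather than PHP purely for illustration. It assumes the form data arrives as UTF-8 bytes and is later read back as if each byte were one ISO-8859-1 character; the helper name misread_utf8_as_latin1 is hypothetical.

```python
def misread_utf8_as_latin1(text):
    """Encode text as UTF-8, then decode those bytes as ISO-8859-1.

    This imitates storing UTF-8 bytes unchanged and displaying them
    with a single-byte Latin-1 interpretation (an assumption about
    what is happening in this thread, not a confirmed diagnosis).
    """
    return text.encode("utf-8").decode("iso-8859-1")


for ch in ("\u00fc", "\u00f6", "\u00e9"):  # u-umlaut, o-umlaut, e-acute
    # Each single character comes back as two characters.
    print(repr(ch), "->", repr(misread_utf8_as_latin1(ch)))
```

If the output shows two characters per accented letter, with the first one always the same, that would be consistent with the UTF-8 versus Latin-1 mismatch suspected here.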

So, I think what I need to do is convert the character encoding of the data
sent by the HTML form to ISO-8859-1 first *before* I store it in the
database. When I do that and then retrieve it, foreign characters display
properly. Fortunately, PHP has a character-encoding translation function
called iconv() which does exactly this.
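
The conversion step described above might look roughly like the following. This is a Python sketch of the same idea, not the PHP code actually used: it assumes the incoming bytes really are valid UTF-8, and the function name utf8_to_latin1 is made up for the example. Characters with no ISO-8859-1 equivalent (for instance the curly quotes that MS Word produces) are replaced with "?" here, which is one possible policy among several.

```python
def utf8_to_latin1(raw):
    """Convert UTF-8 encoded bytes to ISO-8859-1 encoded bytes.

    Characters that do not exist in ISO-8859-1 are replaced by b"?"
    instead of raising an error (a choice made for this sketch).
    """
    text = raw.decode("utf-8")  # bytes -> Unicode text
    return text.encode("iso-8859-1", errors="replace")  # text -> Latin-1 bytes


form_data = "f\u00fcr".encode("utf-8")  # 4 bytes as sent by the browser
stored = utf8_to_latin1(form_data)      # 3 bytes, one per character
print(len(form_data), len(stored))
```

Whatever replacement policy is chosen, it is worth deciding on it deliberately, because text pasted from a word processor often contains characters outside ISO-8859-1.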

Do I have this figured out correctly?

I suppose if I had a newer version of MySQL with Unicode support, I would be
able to store data from HTML forms directly, without the translation?

I mostly need to support the foreign characters used in French, German and
Spanish, so I presume that Latin1 is all I need for now. Also, I'm storing
all of this data in the same database.
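
One way to test the presumption that Latin1 is enough is to scan sample text for characters that ISO-8859-1 cannot represent. The sketch below is in Python and the helper name not_in_latin1 is hypothetical; the sample strings are invented for the example and are not taken from the actual data.

```python
def not_in_latin1(text):
    """Return the distinct characters in text that ISO-8859-1 cannot encode,
    in order of first appearance."""
    missing = []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# Common French, German and Spanish letters all fit.
print(not_in_latin1("Se\u00f1or, \u00e7\u00e0 et l\u00e0, Gr\u00f6\u00dfe"))

# The oe ligature, the euro sign and curly quotes do not.
print(not_in_latin1("\u0153uvre \u20ac5 \u201cquoted\u201d"))
```

Most everyday French, German and Spanish text passes such a check, but a few characters (the French oe ligature, the euro sign, typographic quotes) fall outside ISO-8859-1.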

Monty




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Storing foreign characters in DB

2004-09-06 Thread Chris Blackwell
Not sure whether this is applicable to your version of mysql, or to PHP.
I had the same problem using Macromedia's Coldfusion, and adding this:

useUnicode=true&characterEncoding=UTF-8

to the db connection string solved the problem

chris

-----Original Message-----
From: MySQL [mailto:[EMAIL PROTECTED] 
Sent: 05 September 2004 07:06
To: MySQL
Subject: Storing foreign characters in DB

I'm having a problem figuring out how to deal with foreign characters in
text that was copied from an MS Word document and pasted into a form field,
then stored in a MySQL DB. (I have MySQL 3.23.58 running).

I'm not sure how these characters are being stored in the MySQL
database, but, when I retrieve the text and run it through PHP's
htmlentities() function, each foreign character is converted into 2 other
foreign characters that don't at all represent the original.

For example, a lowercase u with an umlaut over it (ü) is somehow displayed as
an uppercase A with an umlaut over it followed by the 1/4 symbol after being
parsed by htmlentities(). A lowercase o with an umlaut displays as an
uppercase A with an umlaut over it followed by the paragraph symbol. It seems
that the uppercase A w/umlaut is a constant, and the next character changes.
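
One plausible explanation for the constant first character, sketched in Python rather than PHP: if the stored bytes are UTF-8 and the entity conversion treats every byte as a separate ISO-8859-1 character, then each character from U+00C0 to U+00FF starts with the byte 0xC3, and only the second byte varies. The helper name bytewise_entities is hypothetical; it only imitates that assumed byte-by-byte behaviour and is not PHP's htmlentities() itself.

```python
from html.entities import codepoint2name


def bytewise_entities(text):
    """Encode text as UTF-8, then turn each byte into an HTML entity
    as if the byte were an ISO-8859-1 character."""
    out = []
    for byte in text.encode("utf-8"):
        name = codepoint2name.get(byte)  # e.g. 0xC3 -> "Atilde"
        out.append("&%s;" % name if name else chr(byte))
    return "".join(out)


print(bytewise_entities("\u00fc"))  # lowercase u with umlaut
print(bytewise_entities("\u00f6"))  # lowercase o with umlaut
```

Under this assumption the first entity is the same for both letters, while the second is the 1/4 symbol for the u and the paragraph symbol for the o, which matches the description above. Strictly, byte 0xC3 in ISO-8859-1 is an uppercase A with a tilde, which is easy to mistake for an umlaut at small font sizes.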

How are these foreign characters being stored in the DB? Do I need to do
something in order to store these characters properly, or is this something
I need to somehow do on the PHP side of things??

Thanks!

Monty.





Re: Storing foreign characters in DB

2004-09-05 Thread Rhino

Have you looked at the manual, particularly the Localization and
International Usage section at
http://dev.mysql.com/doc/mysql/en/Localisation.html?

Have you tried the MySQL archives (http://lists.mysql.com/) where this
matter has been discussed?

Rhino





Re: Storing foreign characters in DB

2004-09-05 Thread valentin_nils
Hello Rhino,

I am not sure how familiar you are with the character set settings.

I am not using 3.xx versions anymore, but if I remember correctly they
only allow a limited control for setting character sets.

Your best bet might be the documentation that comes with your installation
(man pages, Info, Chapter 11 of the MySQL docs, etc.). You can also view it
online:

http://dev.mysql.com/doc/mysql/en/Charset.html

but as this is the newest version, you will have to filter out what doesn't
apply to your version yet.

In order to be able to give you any advice you need to be more specific.

Which character sets are set now for the server and the client as default ?
Are the "foreign characters" in the same character set ?
Are you attempting to store the "foreign character" together with your
default characters in the same table / same DB ?

You may also read through my presentation "Using MySQL in a Japanese
environment" which can be found here

http://www.be-known-online.com/mysql

I am not sure if this really helps you, as I don't yet fully understand
where to start troubleshooting.

If you could be a bit more specific that would help a lot.

best regards

Nils Valentin
Tokyo/Japan