ID: 24218
Updated by: [EMAIL PROTECTED]
Reported By: mark at lange dot demon dot co dot uk
Status: Feedback
Bug Type: mbstring related
Operating System: Win32
PHP Version: 4.3.2
New Comment:
That looks more like a browser issue.
You probably have to explicitly declare the charset (encoding) of the
page content sent to the browser, either with a <META> tag or with a
"Content-Type" header:
-- example 1 --
<?php
$output_charset = "UTF-8";
header("Content-Type: text/html; charset=$output_charset");
?>
<html>
<body>
<?php
$iso_8859_1 = "Français";
print mb_convert_encoding($iso_8859_1, $output_charset, "iso-8859-1");
?>
</body>
</html>
-- example 2 --
<?php
$output_charset = "UTF-8";
?>
<html>
<head>
<?php print "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=$output_charset\">"; ?>
</head>
<body>
<?php
$iso_8859_1 = "Français";
print mb_convert_encoding($iso_8859_1, $output_charset, "iso-8859-1");
?>
</body>
</html>
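The two examples can also be combined so that the header and the <META> tag are built from the same variable and cannot disagree. This is only a minimal sketch: contentTypeHeader() and contentTypeMeta() are hypothetical helper names, not part of PHP or of the reporter's script.

```php
<?php
// Hypothetical helpers: both declarations are derived from one charset value.
function contentTypeHeader($charset)
{
    return "Content-Type: text/html; charset=" . $charset;
}

function contentTypeMeta($charset)
{
    return '<meta http-equiv="Content-Type" content="text/html; charset='
        . htmlspecialchars($charset) . '">';
}

$output_charset = "UTF-8";

// header() only works before any output has been sent.
if (!headers_sent()) {
    header(contentTypeHeader($output_charset));
}
echo contentTypeMeta($output_charset), "\n";
```

If the server or an earlier script already sends a different charset in the header, browsers generally prefer the header over the <META> tag, so the header is the one to check first.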
Previous Comments:
------------------------------------------------------------------------
[2003-06-17 05:42:21] [EMAIL PROTECTED]
Please provide a complete but short example script.
------------------------------------------------------------------------
[2003-06-17 02:55:00] mark at lange dot demon dot co dot uk
Description:
------------
A piece of code that worked correctly with PHP version 4.2.3 now
displays erroneous characters with PHP 4.3.2.
The script in question is a translation form, displaying two languages
which may use different charsets.
The code determines the appropriate charset to use for the html header
from the two languages. If the charsets are the same, no problem. If
they are different, it tests whether the mbstring module is enabled, in
which case it uses 'UTF-8' for the html header; otherwise it uses the
charset for the second language.
If mbstring was enabled, then the code uses the mb_convert_encoding
function to convert the text strings for display to UTF-8...
Reproduce code:
---------------
$basecharset = 'ISO-8859-1';
$charset = 'ISO-8859-2';
// Convert only when the two languages use different charsets.
$convertcharsets = ($basecharset != $charset);
if ($convertcharsets) {
    if (function_exists('mb_convert_encoding')) {
        formheader('UTF-8');
    } else {
        // mbstring is unavailable: fall back to the second language's charset.
        $convertcharsets = false;
        formheader($charset);
    }
} else {
    formheader($basecharset);
}
echo charsetText('Français', $convertcharsets, $basecharset);
echo '<br />';
echo charsetText('Polska', $convertcharsets, $charset);
echo '<br />';

// Convert $text from $fromcharset to UTF-8 when $convertcharset is true.
function charsetText($text, $convertcharset, $fromcharset)
{
    $returntext = $text;
    if ($convertcharset) {
        $returntext = mb_convert_encoding($returntext, "UTF-8", $fromcharset);
    }
    return $returntext;
} // function charsetText()
Expected result:
----------------
Français
Polska
Actual result:
--------------
FranÃ§ais
Polska
The first odd character in FranXXais is A with a tilde (Ã); the second
is the section sign (§, HTML &sect;).
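
The two odd characters are what the UTF-8 encoding of "ç" looks like when the page is decoded as ISO-8859-1. A minimal sketch, assuming the mbstring extension is loaded; the variable names are illustrative and not taken from the script above:

```php
<?php
// "Francais" with c-cedilla as ISO-8859-1 bytes; \xE7 is the single byte for it.
$latin1 = "Fran\xE7ais";

// Correct conversion: c-cedilla becomes the two bytes C3 A7 in UTF-8.
$utf8 = mb_convert_encoding($latin1, "UTF-8", "ISO-8859-1");
echo bin2hex($utf8), "\n"; // 4672616ec3a7616973

// Simulate a reader that decodes those UTF-8 bytes as ISO-8859-1:
// C3 is read as A-tilde and A7 as the section sign.
$misread = mb_convert_encoding($utf8, "UTF-8", "ISO-8859-1");
echo $misread, "\n";
```

On a UTF-8 terminal the last line should show the same two characters as the actual result, which suggests the conversion itself is producing valid UTF-8 and the page is being displayed with the wrong charset.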
------------------------------------------------------------------------