From: colourmusic at gmail dot com
Operating system: Win XP SP2
PHP version: 6CVS-2008-04-30 (snap)
PHP Bug Type: *Unicode Issues
Bug description: Replaces UTF-8 symbol with incorrect symbol
Description:
------------
I parsed a URL with UTF-8 encoding and noticed that the fullwidth digit
８ (U+FF18, bytes EF BC 98) is replaced by the bytes EF BC 5F, which are
not a valid UTF-8 sequence.
The script did not generate any errors or warnings.
I also noticed that the other fullwidth digits, ０ (U+FF10) to ７ (U+FF17)
and ９ (U+FF19), are parsed by parse_url() without any problems.
This bug also appears on PHP 5.2.3 and PHP 5.2.5.
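To make the byte values above easier to check, here is a minimal sketch that prints the UTF-8 bytes of the ten fullwidth digits. It assumes a current PHP release (it uses scalar type hints), and fullwidth_digit_bytes() is a hypothetical helper introduced only for this illustration.

```php
<?php
// fullwidth_digit_bytes() is a hypothetical helper, not part of PHP.
// It returns the UTF-8 bytes of the fullwidth digit U+FF10 + $digit
// as upper-case hex pairs separated by spaces.
function fullwidth_digit_bytes(int $digit): string
{
    // U+FF10..U+FF19 encode as EF BC 90 .. EF BC 99 in UTF-8,
    // so only the last byte depends on the digit.
    $utf8 = "\xEF\xBC" . chr(0x90 + $digit);
    return strtoupper(implode(' ', str_split(bin2hex($utf8), 2)));
}

for ($d = 0; $d <= 9; $d++) {
    echo $d, ' => ', fullwidth_digit_bytes($d), "\n";
}
```

Only the digit 8 ends in the byte 98, which is the byte that comes back as 5F (the ASCII underscore) in the report.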
Reproduce code:
---------------
<?php
// mb_convert_encoding() gives the same result as html_entity_decode() in this example:
// $url = mb_convert_encoding("https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,", "utf-8", "html-entities");
$url = html_entity_decode("https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,", ENT_COMPAT, "utf-8");
echo "Original URL = $url <br />\n";
$result = parse_url($url);
// print_r() prints the array itself; echoing its return value would append a stray "1".
print_r($result);
?>
Expected result:
----------------
Original URL = https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
Array
(
[scheme] => https
[host] => example.com
[path] => /
[query] => SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
)
Actual result:
--------------
Original URL = https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
Array
(
[scheme] => https
[host] => example.com
[path] => /
[query] => SHAMEI=ランドクルーザー�_０バン&SHAMEI_CD=01465,
)
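Because the damaged character is hard to see in printed output, the following sketch compares the query string byte by byte before and after parse_url(). It assumes a current PHP release; first_byte_mismatch() is a hypothetical helper, and the shortened query string is made up for illustration. Whether a difference is reported depends on the PHP version and platform.

```php
<?php
// first_byte_mismatch() is a hypothetical helper, not part of PHP.
// It returns the offset of the first differing byte, or null if the
// two strings are byte-for-byte identical.
function first_byte_mismatch(string $expected, string $actual): ?int
{
    $len = min(strlen($expected), strlen($actual));
    for ($i = 0; $i < $len; $i++) {
        if ($expected[$i] !== $actual[$i]) {
            return $i;
        }
    }
    // Same prefix: the strings differ only if their lengths differ.
    return strlen($expected) === strlen($actual) ? null : $len;
}

// "SHAMEI=" followed by the fullwidth digits 8 and 0, written as raw bytes
// so the example does not depend on the encoding of this source file.
$query  = "SHAMEI=\xEF\xBC\x98\xEF\xBC\x90";
$parsed = (string) parse_url("https://example.com/?" . $query, PHP_URL_QUERY);

$offset = first_byte_mismatch($query, $parsed);
if ($offset === null) {
    echo "query bytes unchanged\n";
} else {
    printf(
        "byte %d changed: %s -> %s\n",
        $offset,
        bin2hex(substr($query, $offset, 1)),
        bin2hex(substr($parsed, $offset, 1))
    );
}
```

On an affected build this should point at the third byte of the fullwidth 8, where 98 has become 5F.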
--
Edit bug report at http://bugs.php.net/?id=44868&edit=1
--