From: ywliu at hotmail dot com
Operating system: linux
PHP version: 4.3.4
PHP Bug Type: *Languages/Translation
Bug description: htmlentities() fails to escape BIG5 characters correctly
Description:
------------
In ext/standard/html.c, htmlentities() fails to identify BIG5 Chinese
characters correctly because the byte ranges it checks are too narrow.
I have checked CVS version 1.87 and the bug is still there.
Reproduce code:
---------------
In html.c, look for this piece of code:
    case cs_big5:
    case cs_gb2312:
    case cs_big5hkscs:
        {
            /* check if this is the first of a 2-byte sequence */
            if (this_char >= 0xa1 && this_char <= 0xf9) {
                /* peek at the next char */
                unsigned char next_char = str[pos];
                if ((next_char >= 0x40 && next_char <= 0x73) ||
                    (next_char >= 0xa1 && next_char <= 0xfe)) {
Expected result:
----------------
For BIG5, the first byte should be in the range 0xa1-0xfe, and the second
byte should be in the range 0x40-0x7e or 0xa1-0xfe.
(From page 88, "Understanding Japanese Information Processing" by Ken
Lunde, O'Reilly.)
Actual result:
--------------
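The current checks stop at 0xf9 for the first byte and at 0x73 for the low
range of the second byte, so sequences with a first byte of 0xfa-0xfe or a
second byte of 0x74-0x7e are not recognized as 2-byte characters.
The checks should be:
    if (this_char >= 0xa1 && this_char <= 0xfe) {
and
    if ((next_char >= 0x40 && next_char <= 0x7e) ||
        (next_char >= 0xa1 && next_char <= 0xfe)) {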
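
To make the difference between the two sets of ranges easy to test, here is a
minimal standalone sketch. The helper names big5_pair_current() and
big5_pair_proposed() are hypothetical and do not exist in html.c; they only
restate the range checks quoted in this report, outside of the surrounding
switch statement.

```c
#include <stdbool.h>

/* Hypothetical helper (not in html.c): the ranges as currently checked. */
static bool big5_pair_current(unsigned char lead, unsigned char trail)
{
    if (lead < 0xa1 || lead > 0xf9) {
        return false;
    }
    return (trail >= 0x40 && trail <= 0x73) ||
           (trail >= 0xa1 && trail <= 0xfe);
}

/* Hypothetical helper (not in html.c): the ranges proposed in this report. */
static bool big5_pair_proposed(unsigned char lead, unsigned char trail)
{
    if (lead < 0xa1 || lead > 0xfe) {
        return false;
    }
    return (trail >= 0x40 && trail <= 0x7e) ||
           (trail >= 0xa1 && trail <= 0xfe);
}
```

With these helpers, a pair such as 0xa4 0x74 or 0xfa 0x40 is rejected by the
current ranges but accepted by the proposed ones, while a pair that was
already accepted (for example 0xa4 0x40) is accepted by both.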
--
Edit bug report at http://bugs.php.net/?id=27505&edit=1