Edit report at http://bugs.php.net/bug.php?id=52592&edit=1

 ID:                 52592
 Updated by:         ahar...@php.net
 Reported by:        pj at ezgr dot net
 Summary:            mb_ereg_replace and the Greek capital Pi
-Status:             Open
+Status:             Bogus
 Type:               Bug
 Package:            mbstring related
 Operating System:   Centos 5.5 x64
 PHP Version:        5.2.14
 Block user comment: N

 New Comment:

You also need to call mb_regex_encoding('UTF-8') before using a UTF-8
regular expression. mb_internal_encoding() alone does not change the
encoding that the mb_ereg functions use here.
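
A minimal sketch of the test script with that one call added, assuming
the mbstring extension is loaded and the source file is saved as UTF-8.
Everything except the mb_regex_encoding() line is taken from the
report below:

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
mb_internal_encoding('UTF-8');
// The regex functions have their own encoding setting; set it explicitly.
mb_regex_encoding('UTF-8');

$testStr = 'Π  Π  Π!';

// \s+ now matches only the runs of spaces, not the 0xA0 byte inside Π.
$newStr = mb_ereg_replace('\s+', '_', $testStr);

echo $newStr, "\n";
echo urlencode($newStr), "\n";
```

Because the pattern is \s+, each run of two spaces is replaced by a
single underscore, so both bytes of every Π (0xCE 0xA0) should survive
the replacement.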


Previous Comments:
------------------------------------------------------------------------
[2010-08-12 14:36:15] pj at ezgr dot net

Description:
------------
PHP: 5.2.14, Apache 2.2.15, mod_php

While \s is supposed to match only whitespace, the Greek capital letter
Pi (Π, U+03A0), whose UTF-8 encoding is 0xCE 0xA0, is matched too. When
the match is replaced with something, the letter loses its second byte
(0xA0).

Test script:
---------------
<?php
mb_internal_encoding('UTF-8');

$testStr = 'Π  Π  Π!';
$newStr = mb_ereg_replace('\s+','_',$testStr);

echo $testStr;
echo $newStr;
echo urlencode($testStr);
echo urlencode($newStr);
?>

Expected result:
----------------
Π  Π  Π!

Π_Π_Π!

%CE%A0++%CE%A0++%CE%A0%21

%CE%A0_%CE%A0_%CE%A0%21

Actual result:
--------------
Π  Π  Π!

[non printable character]_[non printable character]_[non printable
character]!

%CE%A0++%CE%A0++%CE%A0%21

%CE_%CE_%CE_%21


------------------------------------------------------------------------


