ID: 19690
User updated by: [EMAIL PROTECTED]
Reported By: [EMAIL PROTECTED]
Status: Analyzed
Bug Type: mbstring related
Operating System: Red Hat Linux 7.2
PHP Version: 4.2.3
New Comment:
I agree. Can this be added to the documentation then?
Previous Comments:
------------------------------------------------------------------------
[2002-10-02 06:44:14] [EMAIL PROTECTED]
Oops, I forgot to change the status.
------------------------------------------------------------------------
[2002-10-02 06:42:38] [EMAIL PROTECTED]
In the current implementation, mb_split() and mb_ereg() treat the regex
pattern as an extended-mode pattern, in which white space, carriage
returns, and line feeds are ignored and any sequence beginning with "#"
and terminated by "\n" is treated as a comment.
So if you would like to use these characters in the pattern, you should
escape them with a backslash '\'.
IMO this implicit behaviour is quite confusing, as we are more familiar
with split() and preg_split(), neither of which ignores white space in
the pattern by default.
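
For illustration, here is a minimal sketch of the workaround described
above, applied to the reporter's test string. It assumes the mbstring
extension with multibyte regex support is loaded; the variable names are
only for this example. The space in the pattern is escaped with a
backslash, so it should be matched literally even when the pattern is
read in extended mode:

```php
<?php
// Sketch only: assumes the mbstring extension (with regex support) is available.
$v = "One two";

// '\ ' in single quotes is a backslash followed by a space. Escaping the
// space keeps it from being discarded as ignorable white space when the
// pattern is interpreted in extended mode.
$aWords = mb_split('\ ', $v);

echo "COUNT: " . count($aWords) . "<BR>";
foreach ($aWords as $w) {
    echo "a word: *$w*<BR>";
}
```

An escaped space is also a literal space when extended mode is not in
effect, so the same pattern should be safe either way.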
------------------------------------------------------------------------
[2002-10-01 13:09:33] [EMAIL PROTECTED]
confirmed with HEAD.
------------------------------------------------------------------------
[2002-10-01 08:52:38] [EMAIL PROTECTED]
Here are my PHP settings in case you need to see them
Multibyte (Japanese) Support enabled
multibyte regex support enabled
Directive                      Local Value  Master Value
mbstring.detect_order          auto         auto
mbstring.func_overload         0            0
mbstring.http_input            auto         auto
mbstring.http_output           no value     no value
mbstring.internal_encoding     EUC-JP       EUC-JP
mbstring.substitute_character  no value     no value
------------------------------------------------------------------------
[2002-10-01 08:44:54] [EMAIL PROTECTED]
The following output and code show that mb_split does not work:
OUTPUT:
REGEX encoding is EUC-JP
encoding is ASCII
v is One two
COUNT: 9
a word: **
a word: **
a word: **
a word: **
a word: **
a word: **
a word: **
a word: **
a word: **
CODE:
$aWords = array();
echo " REGEX encoding is ". mb_regex_encoding()."<BR>";
$v ="One two";
echo "encoding is ".mb_detect_encoding($v)."<BR>";
echo "v is $v <BR>";
$aWords = mb_split(" ",$v);
echo "COUNT: ".count($aWords)."<BR>";
foreach ($aWords as $w) {
    echo "a word: *$w*<BR>";
}
exit;
------------------------------------------------------------------------