ID:               19690
 User updated by:  [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
 Status:           Analyzed
 Bug Type:         mbstring related
 Operating System: Red Hat Linux 7.2
 PHP Version:      4.2.3
 New Comment:

I agree. Can this be added to the documentation then?


Previous Comments:
------------------------------------------------------------------------

[2002-10-02 06:44:14] [EMAIL PROTECTED]

Oops, I forgot to change the status.

------------------------------------------------------------------------

[2002-10-02 06:42:38] [EMAIL PROTECTED]

In the current implementation, mb_split() and mb_ereg() treat the regex
pattern as an extended-mode pattern: whitespace, carriage returns, and
line feeds are ignored, and any sequence beginning with "#" and
terminated by "\n" is treated as a comment.
So if you would like to use these characters in the pattern, you should
escape them with a backslash ("\").
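
A minimal sketch of that workaround, using the string from the original
report. It assumes the mbstring extension with multibyte regex support is
enabled; the variable names are made up for illustration. Both patterns
avoid a bare space, so they should give the same result whether or not
the pattern is read in extended mode.

```php
<?php
// Sketch only: the input string is taken from the original report;
// $escaped and $byClass are hypothetical names for this example.
$v = "One two";

// Escape the space so it stays a literal character even when the
// pattern is interpreted in extended mode.
$escaped = mb_split('\ ', $v);

// Alternatively, match whitespace with the \s escape, which does not
// rely on a literal space appearing in the pattern.
$byClass = mb_split('\s+', $v);

foreach ($escaped as $w) {
    echo "a word: *$w*\n";
}
```

An unescaped " " pattern, by contrast, would presumably be reduced to an
empty pattern in extended mode, which would be consistent with the run of
empty matches shown in the report.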

IMO this implicit behaviour is quite confusing, as we are more familiar
with split() and preg_split(), which treat whitespace in the pattern
literally by default.


------------------------------------------------------------------------

[2002-10-01 13:09:33] [EMAIL PROTECTED]

confirmed with HEAD.

------------------------------------------------------------------------

[2002-10-01 08:52:38] [EMAIL PROTECTED]

Here are my PHP settings, in case you need to see them:

Multibyte (Japanese) Support enabled
multibyte regex support enabled

Directive                     Local Value Master Value
mbstring.detect_order         auto        auto
mbstring.func_overload        0           0
mbstring.http_input           auto        auto
mbstring.http_output          no value    no value
mbstring.internal_encoding    EUC-JP      EUC-JP
mbstring.substitute_character no value    no value

------------------------------------------------------------------------

[2002-10-01 08:44:54] [EMAIL PROTECTED]

The following output and code show that mb_split does not work:

OUTPUT:

REGEX encoding is EUC-JP
encoding is ASCII
v is One two
COUNT: 9
a word: **
a word: **
a word: **
a word: **
a word: **
a word: **
a word: **
a word: **
a word: **

CODE:

$aWords = array();
echo " REGEX encoding is ". mb_regex_encoding()."<BR>";
$v ="One two";
echo "encoding is ".mb_detect_encoding($v)."<BR>";
echo "v is $v <BR>";
$aWords = mb_split(" ",$v);
echo "COUNT: ".count($aWords)."<BR>";
foreach($aWords as $w) {
  echo "a word: *$w*<BR>";
}
exit;


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=19690&edit=1
