Edit report at https://bugs.php.net/bug.php?id=62562&edit=1

 ID:                 62562
 User updated by:    magog dot the dot ogre at gmail dot com
 Reported by:        magog dot the dot ogre at gmail dot com
 Summary:            preg_replace mangles UTF8 string - Windows only
 Status:             Open
 Type:               Bug
 Package:            *Regular Expressions
 Operating System:   Windows x86
 PHP Version:        5.3.14
 Block user comment: N
 Private report:     N

 New Comment:

Yeah, it works on SunOS and Ubuntu for me too.

Well, if/when you get access to a Windows machine, or another developer who has 
one comes along, then I guess you can work on this bug. :)


Previous Comments:
------------------------------------------------------------------------
[2012-07-15 22:43:01] ras...@php.net

Well, I have looked at the code. We take the raw binary string and pass it 
straight to PCRE both on Windows and UNIX. So something along the way isn't the 
same. But I am not a Windows guy, so I can't help you on the Windows side of 
things. It works fine on my Linux box here.

------------------------------------------------------------------------
[2012-07-15 22:32:03] magog dot the dot ogre at gmail dot com

OK then, after doing some more poking around, it appears that it still might 
be a PHP issue. Correct me if I'm wrong, but here are my findings:

Create a php file with only the following content:
  <?php
  echo preg_match("/\s+/", "ინფორმაცია")?"1":"0";

Running this on Windows prints "1"; running it on Unix prints "0".
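
To narrow down which bytes the pattern is matching, a minimal diagnostic sketch 
follows. It dumps the subject as hex and lists the byte offsets matched by \s+ 
in byte mode and in UTF-8 mode (the /u modifier). The helper names hex_bytes() 
and match_offsets() are hypothetical and not part of the report, and the file 
is assumed to be saved as UTF-8. One thing worth checking in the dump: the 
letter "რ" encodes as E1 83 A0, and byte A0 is the no-break space in Latin-1. 
Whether the Windows build treats that byte as whitespace is an assumption to 
test, not something this report confirms.

```php
<?php
// Diagnostic sketch only. hex_bytes() and match_offsets() are hypothetical
// helper names; this file is assumed to be saved as UTF-8.

// Return the string as space-separated hex bytes, so the raw byte
// sequence handed to PCRE can be inspected.
function hex_bytes($s)
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// Return an array of byte offset => hex of each run matched by $pattern.
function match_offsets($pattern, $subject)
{
    $found = array();
    if (preg_match_all($pattern, $subject, $m, PREG_OFFSET_CAPTURE)) {
        foreach ($m[0] as $hit) {
            // $hit[0] is the matched text, $hit[1] is its byte offset.
            $found[$hit[1]] = bin2hex($hit[0]);
        }
    }
    return $found;
}

$subject = "ინფორმაცია";

echo hex_bytes($subject), "\n";

// Byte mode: each byte of the subject is classified on its own.
print_r(match_offsets('/\s+/', $subject));

// UTF-8 mode: the subject is treated as a sequence of code points.
print_r(match_offsets('/\s+/u', $subject));
```

If the byte-mode list is non-empty on Windows and empty on Unix, the offsets 
show exactly which bytes are being classified as whitespace on each platform.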

Now I've run the same pattern through PCRE's own pcretest tool, and it reported 
no match. Thus, it may be a PHP issue. Here is the output:
***Contents of test.txt
/\s+/
ინფორმაცია
ინფორ მაცია

***Output via Cygwin, running the Windows native pcretest.exe
(redacted)@(redacted)-PC /cygdrive/c/Program Files (x86)/pcre-7.0-bin/bin
$ ./pcretest.exe test.txt
PCRE version 7.0 18-Dec-2006

/\s+/
ინფორმაცია
No match
ინფორ მაცია
 0:

(I included the second example above with a space purposefully added, just to 
show that the tool is functioning properly and will catch the space when it's 
properly there).
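
When comparing PHP against a standalone pcretest binary, it may help to record 
which PCRE version and locale PHP itself is running with, since the pcretest 
output above reports version 7.0. A small sketch, assuming nothing beyond 
built-in constants and setlocale(); pcre_environment() is a hypothetical helper 
name:

```php
<?php
// Sketch only: pcre_environment() is a hypothetical helper name.
function pcre_environment()
{
    return array(
        'php_version'  => PHP_VERSION,
        'os'           => PHP_OS,
        // Version of the PCRE library this PHP build is using.
        'pcre_version' => PCRE_VERSION,
        // Passing "0" queries the current LC_CTYPE locale without changing it.
        'lc_ctype'     => setlocale(LC_CTYPE, '0'),
    );
}

foreach (pcre_environment() as $key => $value) {
    echo $key, ': ', $value, "\n";
}
```

Running this on both the Windows and the Unix machine gives a side-by-side 
view of the environment each preg_match() call ran in.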

------------------------------------------------------------------------
[2012-07-15 21:48:18] ras...@php.net

No, PCRE is a Perl-Compatible-Regex library but it is not the code used by Perl 
itself. Many (most?) open source things that have regex support will use PCRE.

------------------------------------------------------------------------
[2012-07-15 19:19:03] magog dot the dot ogre at gmail dot com

I have Perl itself installed; does it use PCRE? Sorry for my n00b questions. If 
so, I will run a test on there shortly.

------------------------------------------------------------------------
[2012-07-14 03:12:27] ras...@php.net

hrm.. how about finding something else that links against pcre and runs on 
Windows that might be able to do a replace? Like Python perhaps?
I still doubt this has anything to do with PHP. We don't mangle anything going 
into or coming out of pcre.

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=62562

