[PHP] Re: PCRE regex result is different between Linux & Windows.
ClapClap <2000ans free.fr> writes:
> > ClapClap <2000ans free.fr> writes:
> > My versions of PCRE:
> > - Linux 7.4 2007-09-21 (PHP 5.2.4-2ubuntu5.3)
> > - Windows XP 7.2 2007-06-19 (PHP 5.2.4)
>
> And:
> - Windows 2000 7.6 2008-01-28 (PHP 5.2.6)
>
> It works fine under Windows with PCRE 7.2 & 7.6.
> I do not know why

Again and again... I've found the mistake.

It's the string encoding conversion to UTF-8 that causes the regex differences. On Linux, iconv uses glibc 2.7, while on Windows it uses libiconv 1.11.

According to the PHP manual (http://docs.php.net/manual/en/intro.iconv.php), we should use libiconv for encoding conversion; it's better than the glibc implementation.

Damn!

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
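A minimal sketch of how one might check for this situation, assuming the iconv extension is loaded. `ICONV_IMPL` and `ICONV_VERSION` are built-in constants that report which iconv implementation PHP was built against (for example glibc or libiconv). The helper name `to_utf8_checked()` and the sample string are hypothetical and not taken from the thread; the helper only verifies that the converted text is valid UTF-8 before it reaches a regex.

```php
<?php
// Hypothetical helper (not from the thread): convert a string to UTF-8 and
// confirm the result is well-formed before using it as a regex subject.
function to_utf8_checked(string $input, string $fromCharset): ?string
{
    // iconv() returns false when the conversion fails; @ hides the notice.
    $converted = @iconv($fromCharset, 'UTF-8', $input);
    if ($converted === false) {
        return null;
    }
    // An empty pattern with the /u modifier matches any valid UTF-8 string
    // and makes preg_match() return false on malformed UTF-8.
    return preg_match('//u', $converted) === 1 ? $converted : null;
}

// Report which iconv implementation this PHP build uses.
echo 'iconv implementation: ', ICONV_IMPL, ' ', ICONV_VERSION, PHP_EOL;

// Assumed sample input: "café" encoded as ISO-8859-1.
$latin1 = "caf\xE9";
$utf8 = to_utf8_checked($latin1, 'ISO-8859-1');
var_dump($utf8 !== null);
```

Running the same script on both machines shows whether they use different iconv implementations; comparing the converted bytes (for instance with `bin2hex()`) would then show whether the regex actually receives different input on each platform.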
[PHP] Re: PCRE regex result is different between Linux & Windows.
ClapClap <2000ans free.fr> writes:
> My versions of PCRE:
> - Linux 7.4 2007-09-21 (PHP 5.2.4-2ubuntu5.3)
> - Windows XP 7.2 2007-06-19 (PHP 5.2.4)

And:
- Windows 2000 7.6 2008-01-28 (PHP 5.2.6)

It works fine under Windows with PCRE 7.2 & 7.6.
I do not know why
[PHP] Re: PCRE regex result is different between Linux & Windows.
Lupus Michaelis wrote:
> ClapClap wrote:
>> For the PCRE version, I really cannot tell you which one I use...
>> Where can I see that?
>
> In the output of the phpinfo function.

Thanks. My versions of PCRE:
- Linux 7.4 2007-09-21 (PHP 5.2.4-2ubuntu5.3)
- Windows XP 7.2 2007-06-19 (PHP 5.2.4)

>> So, it may be a bug? Too bad...
>
> I remember some recent change of behavior in PCRE. But I am not sure;
> I just want to rule this possibility out first.

Yeah! It's probably that. I hope... Too bad.

--
Julien
[PHP] Re: PCRE regex result is different between Linux & Windows.
ClapClap wrote:
> For the PCRE version, I really cannot tell you which one I use...
> Where can I see that?

In the output of the phpinfo function.

> So, it may be a bug? Too bad...

I remember some recent change of behavior in PCRE. But I am not sure; I just want to rule this possibility out first.

--
Mickaël Wolff aka Lupus Michaelis
http://lupusmic.org
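Besides reading the phpinfo() page, the PCRE version can be printed directly. This is a small sketch using the built-in `PCRE_VERSION` constant, which holds the library version and release date in one string (for example "7.4 2007-09-21"). The helper name `pcre_version_number()` is hypothetical and only illustrates extracting the numeric part for comparison between machines.

```php
<?php
// Hypothetical helper (not a PHP built-in): return only the "major.minor"
// part of PCRE_VERSION, dropping the release date that follows it.
function pcre_version_number(): string
{
    if (preg_match('/^\d+\.\d+/', PCRE_VERSION, $m) === 1) {
        return $m[0];
    }
    // Unexpected format: fall back to the raw constant.
    return PCRE_VERSION;
}

echo 'PCRE library: ', PCRE_VERSION, PHP_EOL;
echo 'PCRE number:  ', pcre_version_number(), PHP_EOL;
```

Running this on the Linux and Windows machines gives the two values to compare without opening the full phpinfo() output.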
[PHP] Re: PCRE regex result is different between Linux & Windows.
Jochem Maas wrote:
>> [I precede you, sorry for language mistakes...]
>
> php or english? :-)

Ohhh... sh..! I think I speak PHP better than English (silly, isn't it?).

> okay, are you using the same PHP version on both machines?
> anything in the php.ini's that differs?

Exactly the same version is not possible (Windows/Linux). The php.ini files are nearly the same (some directories are different).
Under Windows: PHP 5.1.6 on 2k SP4 / 5.2.4 on XP SP2 (the official builds).
Under Linux (Ubuntu 8.04): 5.2.4-2ubuntu5.3.

> are you possibly looking at an input/file character-set encoding related
> issue? (i.e. encoding is different between the two servers)?

All PHP source is written in UTF-8. I take the HTML code and convert it to UTF-8 using iconv() / mbstring...

> can you post a short complete script to see if others can reproduce the
> error?

See the following link for the test that shows the problem (number of matches: Windows = 90, Linux = 54):
http://pastebin.com/m1c43cc10

The same results are given when:
- comments are removed
- the 'm' or 's' PCRE options are used
- recursion is removed (multiple passes in a while statement; matches for each pass: 55, 26, 5, 2)

This snippet is part of code whose goal is to convert HTML from Word 2003 to valid XHTML. But that is not the subject...

For the PCRE version, I really cannot tell you which one I use... Where can I see that?

So, it may be a bug? Too bad...

> have you tried to use the Tidy extension to clean up the input string?,
> it has all sorts of wonderful settings for making (x)HTML nice and shiny.

As you might guess, I have already tried it. ;-)
Tidy is too aggressive for parsing HTML from MS Office...

Hope it will work :-/

--
Julien