ID:               30435
User updated by:  wberg at doce dot ufl dot edu
Reported By:      wberg at doce dot ufl dot edu
Status:           Open
Bug Type:         PCRE related
Operating System: Linux 2.4.21-alpha-r4
PHP Version:      4.3.9
New Comment:

Actually, I was wrong. Using POSIX does *not* work. It won't find either...


Previous Comments:
------------------------------------------------------------------------

[2004-10-15 15:27:05] wberg at doce dot ufl dot edu

Removing the case insensitivity (i) from the pattern match makes no difference in the outcome. See: http://www.kuruvinda.com/test2.php

Using POSIX ereg instead does work. See: http://www.kuruvinda.com/test3.php

<?php
$Pattern = "[[:<:]][a�����]rm[e����][[:>:]]";

$Blazon = "D'argent, au lion d'azur, armé et lampassé de gueules, (le lion est quelquefois chargé d'une fleur-de-lis d'or, ou d'un écusson d'or à l'aigle éployée de sable).";
print "Pattern: $Pattern<p>Blazon: $Blazon<p>";

# Should print Found
if (eregi($Pattern,$Blazon,$regs)) {
    print "Found";
} else {
    print "Not Found";
}

$Blazon = "D'argent, au lion d'azur, armée et lampassé de gueules, (le lion est quelquefois chargé d'une fleur-de-lis d'or, ou d'un écusson d'or à l'aigle éployée de sable).";
print "<p>Pattern: $Pattern<p>Blazon 2: $Blazon<p>";

# Should print Not Found
if (eregi($Pattern,$Blazon,$regs)) {
    print "Found";
} else {
    print "Not Found";
}
?>

------------------------------------------------------------------------

[2004-10-15 15:07:10] wberg at doce dot ufl dot edu

"The match does not find "armé" as a whole word in $Blazon, but does find it in $Blazon2, although it isn't a whole word there."

Should be:

The match does not find "armé" as a whole word in Blazon, but does find it in Blazon 2, although it isn't a whole word there.

------------------------------------------------------------------------

[2004-10-14 16:17:32] wberg at doce dot ufl dot edu

Description:
------------
I run PHP 4.3.4 locally on Win XP Pro / IIS 5. The code supplied works flawlessly. On the Linux server, however, which currently runs PHP 4.3.9, it seems to ignore or mess up the \b word boundaries. The same thing happened when the server ran PHP 4.3.3RC1. Upgrading to 4.3.9 didn't help.
You can see the code in action, where it produces the wrong results, on the Linux server at: http://www.kuruvinda.com/test.php

You can get full PHP info() on this server at: http://www.kuruvinda.com/c.php

Reproduce code:
---------------
<?php
$Pattern = "\b[a�����]rm[e����]\b";

$Blazon = "D'argent, au lion d'azur, armé et lampassé de gueules, (le lion est quelquefois chargé d'une fleur-de-lis d'or, ou d'un écusson d'or à l'aigle éployée de sable).";
print "Pattern: $Pattern<p>Blazon: $Blazon<p>";

# Should return "Found"
if (preg_match("/".$Pattern."/i",$Blazon)) {
    print "Found";
} else {
    print "Not Found";
}

$Blazon = "D'argent, au lion d'azur, armée et lampassé de gueules, (le lion est quelquefois chargé d'une fleur-de-lis d'or, ou d'un écusson d'or à l'aigle éployée de sable).";
print "<p>Pattern: $Pattern<p>Blazon 2: $Blazon<p>";

# Should return "Not Found"
if (preg_match("/".$Pattern."/i",$Blazon)) {
    print "Found";
} else {
    print "Not Found";
}
?>

Expected result:
----------------
The match should find "armé" as a whole word only in $Blazon and not in $Blazon2.

Actual result:
--------------
The match does not find "armé" as a whole word in $Blazon, but does find it in $Blazon2, although it isn't a whole word there.

------------------------------------------------------------------------
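
Possible workaround (a sketch, not a confirmed fix): the actual result above is what one would see if \b treated the accented letters as non-word characters on the Linux server, since "armé" followed by a space would then have no boundary after it, while "armée" would have one between "é" and "e". Assuming that is what happens, and assuming the strings are ISO-8859-1, the boundaries can be spelled out with lookarounds over an explicit character class instead of relying on \b. The helper name find_whole_word and the exact accented letters in $core (à-ä, è-ë) are assumptions made for illustration; they are not taken from the report.

```php
<?php
// Hypothetical helper, not part of the original report.
// Assumes $core and $subject are ISO-8859-1 (Latin-1) byte strings.
function find_whole_word($core, $subject)
{
    // ASCII word characters plus the Latin-1 accented letters
    // (0xC0-0xFF without the multiplication and division signs 0xD7, 0xF7).
    $w = 'A-Za-z0-9_\xC0-\xD6\xD8-\xF6\xF8-\xFF';

    // Lookarounds stand in for \b: no word character may touch the match.
    $regex = '/(?<![' . $w . '])(?:' . $core . ')(?![' . $w . '])/i';

    return preg_match($regex, $subject) === 1;
}

// Assumed reading of the pattern's character classes, written as byte escapes
// so that the encoding of this file does not matter.
$core = '[a\xE0-\xE4]rm[e\xE8-\xEB]';

// "armé et ..." (whole word) and "armée et ..." (not a whole word) in Latin-1.
$blazon1 = "D'argent, au lion d'azur, arm\xE9 et lampass\xE9 de gueules";
$blazon2 = "D'argent, au lion d'azur, arm\xE9e et lampass\xE9 de gueules";

print find_whole_word($core, $blazon1) ? "Found\n" : "Not Found\n";     // Found
print find_whole_word($core, $blazon2) ? "Found\n" : "Not Found\n";     // Not Found
```

This does not depend on the server locale, because the set of "word" bytes is listed explicitly. The trade-off is that the class only covers Latin-1 and would need changing for other encodings.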
