#25669 [Opn-Fbk]: eregi() vs. 8-bit chars in regex
 ID:               25669
 Updated by:       [EMAIL PROTECTED]
 Reported By:      svs at ropnet dot ru
-Status:           Open
+Status:           Feedback
 Bug Type:         Regexps related
 Operating System: FreeBSD 4.8
 PHP Version:      4.3.3
 New Comment:

Can you try this one again:
http://www.voltex.jp/patches/regpatch.diff

Note: This problem is known not to reproduce with glibc, and unfortunately I don't have a FreeBSD box at the moment.

Previous Comments:

[2003-09-29 08:08:38] [EMAIL PROTECTED]

Ilia: your patch doesn't seem to deal with it correctly, as isalpha() does indeed expect a signed integer. A char value can be any number from -128 to 127, so if you cast it straight to unsigned integer, a negative value never ends up in the range 0 to 255. So you should first cast it to unsigned char, and then convert that to a signed integer.

[2003-09-29 06:23:14] svs at ropnet dot ru

No, it does not.

[2003-09-28 20:14:05] [EMAIL PROTECTED]

Try this patch and see if it fixes the problem.
http://bb.prohost.org/reg.txt

[2003-09-26 09:34:44] svs at ropnet dot ru

oops, mozilla mangled those characters.

begin 644 l.php
M/#]P:'`*V5T;]C86QE*$Q#7T%,3P@(G)U7U)5+DM/[EMAIL PROTECTED](I.R`*96-H
M;R!S971L;V-A;4H3$-?04Q,+`B(BDL();B([FEF(AEF5G:[EMAIL PROTECTED](L
M(+Q\2(I*2![(5C:\@(F]K7XB.R!](5LV4@R!E8VAO()B861;B([
M?0II9B`H')E9U]M871C:@B+]$O:2(L(+Q\2(I*2![(5C:\@(F]K7XB
=.R!](5LV4@R!E8VAO()B861;B([?0H_/@H`
`
end

'./configure' '--without-x' '--disable-debug' '--with-apxs=/usr/local/apache/bin/apxs' '--with-mod_charset' '--enable-dba' '--with-gdbm=/usr/local' '--with-db4=/usr/local' '--enable-dbase' '--enable-ftp' '--enable-sockets' '--enable-inline-optimization' '--enable-memory-limit' '--with-mysql' '--with-gd' '--enable-gd-native-ttf' '--with-zlib=/usr' '--with-jpeg-dir=/usr/local' '--with-png-dir=/usr/local' '--with-freetype-dir=/usr/local' '--enable-exif' '--enable-calendar' '--enable-wddx' '--with-gmp' '--with-openssl=/usr' '--with-iconv=/usr/local' '--with-imap=shared,/usr/local' '--with-curl=/usr/local' '--with-dom=shared,/usr/local' '--with-dom-xslt=shared,/usr/local' '--with-dom-exslt=shared,/usr/local' '--enable-xslt=shared' '--with-xslt-sablot=shared,/usr/local' '--with-iconv-dir=/usr/local' '--with-expat-dir=/usr/local' '--with-zip=/usr/local' '--with-pdflib' '--with-tiff-dir=/usr/local'

[2003-09-26 09:16:38] [EMAIL PROTECTED]

And what was the configure line used to configure PHP?

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at
http://bugs.php.net/25669

-- 
Edit this bug report at http://bugs.php.net/?id=25669&edit=1
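The cast order described in the [2003-09-29 08:08:38] comment can be illustrated with a small C sketch. This is not the code from regex/regcomp.c or from either patch; the function names below are hypothetical and exist only for illustration. It assumes 8-bit bytes and uses `signed char` explicitly, so the behaviour does not depend on whether plain `char` is signed on the platform. The sample byte 0xD1 is the KOI8-R encoding of the lowercase letter used in the report, which is -47 when stored in a signed char.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Illustrative helpers only; none of these names come from PHP's regex/ code. */

/* Problematic widening: a negative char converted straight to unsigned int
 * wraps around to a very large value instead of landing in 0..UCHAR_MAX. */
unsigned int widen_direct(signed char c)
{
    return (unsigned int)c;
}

/* Order suggested in the comment: unsigned char first, then int. */
int widen_via_uchar(signed char c)
{
    return (int)(unsigned char)c;
}

/* isalpha() wrapper that never passes a negative argument. */
int is_alpha_byte(signed char c)
{
    int v = widen_via_uchar(c);
    /* isalpha() is only defined for EOF and values representable as
     * unsigned char, so check that we stay inside that range. */
    assert(v >= 0 && v <= UCHAR_MAX);
    return isalpha(v) != 0;
}
```

With these helpers, `widen_via_uchar((signed char)-47)` gives 209 (0xD1), while `widen_direct((signed char)-47)` gives a value far above 255. Whether `is_alpha_byte()` then reports such a byte as alphabetic still depends on the locale in effect, which is why the report also calls setlocale().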
#25669 [Opn-Fbk]: eregi() vs. 8-bit chars in regex
 ID:               25669
 Updated by:       [EMAIL PROTECTED]
 Reported By:      svs at ropnet dot ru
-Status:           Open
+Status:           Feedback
 Bug Type:         Regexps related
 Operating System: FreeBSD 4.8
 PHP Version:      4.3.3
 New Comment:

Try this patch and see if it fixes the problem.
http://bb.prohost.org/reg.txt

Previous Comments:

[2003-09-26 09:34:44] svs at ropnet dot ru

oops, mozilla mangled those characters.

begin 644 l.php
M/#]P:'`*V5T;]C86QE*$Q#7T%,3P@(G)U7U)5+DM/[EMAIL PROTECTED](I.R`*96-H
M;R!S971L;V-A;4H3$-?04Q,+`B(BDL();B([FEF(AEF5G:[EMAIL PROTECTED](L
M(+Q\2(I*2![(5C:\@(F]K7XB.R!](5LV4@R!E8VAO()B861;B([
M?0II9B`H')E9U]M871C:@B+]$O:2(L(+Q\2(I*2![(5C:\@(F]K7XB
=.R!](5LV4@R!E8VAO()B861;B([?0H_/@H`
`
end

'./configure' '--without-x' '--disable-debug' '--with-apxs=/usr/local/apache/bin/apxs' '--with-mod_charset' '--enable-dba' '--with-gdbm=/usr/local' '--with-db4=/usr/local' '--enable-dbase' '--enable-ftp' '--enable-sockets' '--enable-inline-optimization' '--enable-memory-limit' '--with-mysql' '--with-gd' '--enable-gd-native-ttf' '--with-zlib=/usr' '--with-jpeg-dir=/usr/local' '--with-png-dir=/usr/local' '--with-freetype-dir=/usr/local' '--enable-exif' '--enable-calendar' '--enable-wddx' '--with-gmp' '--with-openssl=/usr' '--with-iconv=/usr/local' '--with-imap=shared,/usr/local' '--with-curl=/usr/local' '--with-dom=shared,/usr/local' '--with-dom-xslt=shared,/usr/local' '--with-dom-exslt=shared,/usr/local' '--enable-xslt=shared' '--with-xslt-sablot=shared,/usr/local' '--with-iconv-dir=/usr/local' '--with-expat-dir=/usr/local' '--with-zip=/usr/local' '--with-pdflib' '--with-tiff-dir=/usr/local'

[2003-09-26 09:16:38] [EMAIL PROTECTED]

And what was the configure line used to configure PHP?

[2003-09-26 09:13:01] [EMAIL PROTECTED]

I don't think you meant to use those chars in your example script..? Can you please add the actual ones here?

[2003-09-26 08:20:57] svs at ropnet dot ru

Description:
------------
Even though locale is set up correctly, eregi() fails to match international characters case-insensitively.

The reason, as far as I understand, is that code in regex/ passes a negative value to isalpha(). This can be worked around by recompiling regex/regcomp.c manually with -funsigned-char (assuming GCC is the compiler).

Reproduce code:
---------------
<?php
setlocale(LC_ALL, "ru_RU.KOI8-R");
echo setlocale(LC_ALL, ""), "\n";
if (eregi("&#1103;", "&#1071;&#1071;")) { echo "ok\n"; } else { echo "bad\n";}
if (preg_match("/&#1103;/i", "&#1071;&#1071;")) { echo "ok\n"; } else { echo "bad\n";}
?>

Expected result:
----------------
ru_RU.KOI8-R
ok
ok

Actual result:
--------------
ru_RU.KOI8-R
bad
ok
#25669 [Opn-Fbk]: eregi() vs. 8-bit chars in regex
 ID:               25669
 Updated by:       [EMAIL PROTECTED]
 Reported By:      svs at ropnet dot ru
-Status:           Open
+Status:           Feedback
 Bug Type:         Regexps related
 Operating System: FreeBSD 4.8
 PHP Version:      4.3.3
 New Comment:

I don't think you meant to use those chars in your example script..? Can you please add the actual ones here?

Previous Comments:

[2003-09-26 08:20:57] svs at ropnet dot ru

Description:
------------
Even though locale is set up correctly, eregi() fails to match international characters case-insensitively.

The reason, as far as I understand, is that code in regex/ passes a negative value to isalpha(). This can be worked around by recompiling regex/regcomp.c manually with -funsigned-char (assuming GCC is the compiler).

Reproduce code:
---------------
<?php
setlocale(LC_ALL, "ru_RU.KOI8-R");
echo setlocale(LC_ALL, ""), "\n";
if (eregi("&#1103;", "&#1071;&#1071;")) { echo "ok\n"; } else { echo "bad\n";}
if (preg_match("/&#1103;/i", "&#1071;&#1071;")) { echo "ok\n"; } else { echo "bad\n";}
?>

Expected result:
----------------
ru_RU.KOI8-R
ok
ok

Actual result:
--------------
ru_RU.KOI8-R
bad
ok