#25669 [Opn->Fbk]: eregi() vs. 8-bit chars in regex

2003-09-29 Thread moriyoshi
 ID:   25669
 Updated by:   [EMAIL PROTECTED]
 Reported By:  svs at ropnet dot ru
-Status:   Open
+Status:   Feedback
 Bug Type: Regexps related
 Operating System: FreeBSD 4.8
 PHP Version:  4.3.3
 New Comment:

Can you try this one again:
http://www.voltex.jp/patches/regpatch.diff

Note: this problem is known not to reproduce with glibc, and
unfortunately I don't have a FreeBSD box at the moment.



Previous Comments:


[2003-09-29 08:08:38] [EMAIL PROTECTED]

Ilia: your patch doesn't seem to deal with it correctly, because
isalpha() does take a (signed) int. Where char is signed, a char value
can be anything from -128 to 127, so if you cast a negative one straight
to unsigned int, you never get a value in the range 0 to 255. You should
first cast it to unsigned char, and then convert that to int.



[2003-09-29 06:23:14] svs at ropnet dot ru

No, it does not.



[2003-09-28 20:14:05] [EMAIL PROTECTED]

Try this patch and see if it fixes the problem.
http://bb.prohost.org/reg.txt



[2003-09-26 09:34:44] svs at ropnet dot ru

oops, mozilla mangled those characters.

begin 644 l.php
M/#]P:'`*V5T;]C86QE*$Q#7T%,3P@(G)U7U)5+DM/[EMAIL PROTECTED](I.R`*96-H
M;R!S971L;V-A;4H3$-?04Q,+`B(BDL();B([FEF(AEF5G:[EMAIL PROTECTED](L
M(+Q\2(I*2![(5C:\@(F]K7XB.R!](5LV4@R!E8VAO()B861;B([
M?0II9B`H')E9U]M871C:@B+]$O:2(L(+Q\2(I*2![(5C:\@(F]K7XB
=.R!](5LV4@R!E8VAO()B861;B([?0H_/@H`
`
end

'./configure' '--without-x' '--disable-debug'
'--with-apxs=/usr/local/apache/bin/apxs' '--with-mod_charset'
'--enable-dba' '--with-gdbm=/usr/local' '--with-db4=/usr/local'
'--enable-dbase' '--enable-ftp' '--enable-sockets'
'--enable-inline-optimization' '--enable-memory-limit' '--with-mysql'
'--with-gd' '--enable-gd-native-ttf' '--with-zlib=/usr'
'--with-jpeg-dir=/usr/local' '--with-png-dir=/usr/local'
'--with-freetype-dir=/usr/local' '--enable-exif' '--enable-calendar'
'--enable-wddx' '--with-gmp' '--with-openssl=/usr'
'--with-iconv=/usr/local' '--with-imap=shared,/usr/local'
'--with-curl=/usr/local' '--with-dom=shared,/usr/local'
'--with-dom-xslt=shared,/usr/local'
'--with-dom-exslt=shared,/usr/local' '--enable-xslt=shared'
'--with-xslt-sablot=shared,/usr/local' '--with-iconv-dir=/usr/local'
'--with-expat-dir=/usr/local' '--with-zip=/usr/local' '--with-pdflib'
'--with-tiff-dir=/usr/local'



[2003-09-26 09:16:38] [EMAIL PROTECTED]

And what was the configure line used to configure PHP?




The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
http://bugs.php.net/25669

-- 
Edit this bug report at http://bugs.php.net/?id=25669&edit=1


#25669 [Opn->Fbk]: eregi() vs. 8-bit chars in regex

2003-09-28 Thread iliaa
 ID:   25669
 Updated by:   [EMAIL PROTECTED]
 Reported By:  svs at ropnet dot ru
-Status:   Open
+Status:   Feedback
 Bug Type: Regexps related
 Operating System: FreeBSD 4.8
 PHP Version:  4.3.3
 New Comment:

Try this patch and see if it fixes the problem.
http://bb.prohost.org/reg.txt


Previous Comments:



[2003-09-26 09:13:01] [EMAIL PROTECTED]

I don't think you meant to use those chars in your example
script..? Can you please add the actual ones here?




[2003-09-26 08:20:57] svs at ropnet dot ru

Description:
------------
Even though locale is set up correctly, eregi() fails to match
international characters case-insensitively.  The reason, as far
as I understand, is that code in regex/ passes a negative value to
isalpha(). This can be worked around by recompiling regex/regcomp.c
manually with -funsigned-char (assuming GCC is the compiler).


Reproduce code:
---------------
<?php
setlocale(LC_ALL, "ru_RU.KOI8-R");
echo setlocale(LC_ALL, ""), "\n";
if (eregi("&#1103;", "&#1071;&#1071;")) { echo "ok\n"; } else { echo "bad\n"; }
if (preg_match("/&#1103;/i", "&#1071;&#1071;")) { echo "ok\n"; } else { echo "bad\n"; }
?>


Expected result:
----------------
ru_RU.KOI8-R
ok
ok


Actual result:
--------------
ru_RU.KOI8-R
bad
ok








#25669 [Opn->Fbk]: eregi() vs. 8-bit chars in regex

2003-09-26 Thread sniper
 ID:   25669
 Updated by:   [EMAIL PROTECTED]
 Reported By:  svs at ropnet dot ru
-Status:   Open
+Status:   Feedback
 Bug Type: Regexps related
 Operating System: FreeBSD 4.8
 PHP Version:  4.3.3
 New Comment:

I don't think you meant to use those chars in your example
script..? Can you please add the actual ones here?


