ID: 28494
Updated by: [EMAIL PROTECTED]
Reported By: david at collectfair dot co dot uk
-Status: Open
+Status: Bogus
-Bug Type: *Regular Expressions
+Bug Type: *Languages/Translation
Operating System: Linux
PHP Version: 4.3.6
New Comment:
1. You cannot use utf8_decode() for generic UTF-8
decoding. It was designed specifically for use with
ISO-8859-1 characters. Consider using mb_convert_encoding()
or iconv() instead.
2. ereg() does not support UTF-8 either. Try using
preg_match().
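
A minimal sketch of the preg_match() approach, for illustration only. The
helper name is_short_integer() is hypothetical and not part of the report.
It assumes the goal is simply "1 to 8 ASCII digits"; since ASCII digits are
encoded identically in UTF-8 and ISO-8859-1, the check runs on the raw
bytes and needs no decoding step at all.

```php
<?php
// Hypothetical helper (not from the report): returns true only when
// $value consists of 1 to 8 ASCII digits and nothing else.
function is_short_integer($value)
{
    // A request like ?tt[]=1 arrives as an array, so reject non-strings.
    if (!is_string($value)) {
        return false;
    }
    // [0-9] matches ASCII digits only. \z anchors at the very end of the
    // string, so a trailing newline is rejected (plain $ would allow one).
    return preg_match('/^[0-9]{1,8}\z/', $value) === 1;
}

var_dump(is_short_integer('119'));       // bool(true)
var_dump(is_short_integer("119\xf0"));   // bool(false)
var_dump(is_short_integer('123456789')); // bool(false), nine digits
```

In the reproduce script this would replace the eregi() call, for example
is_short_integer($_GET['tt']) inside the existing isset() check.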
Previous Comments:
------------------------------------------------------------------------
[2004-05-23 14:40:08] david at collectfair dot co dot uk
Further testing shows that the failure only occurs when the %f0 is at the
end of the GET string.
Changing to this Perl-compatible regular expression removes the problem
completely:
preg_match('/^\d{1,8}$/', utf8_decode($_GET['tt']));
------------------------------------------------------------------------
[2004-05-23 01:59:20] david at collectfair dot co dot uk
Description:
------------
ereg wrongly indicates that a GET string contains only integers if the
string is first processed using utf8_decode.
Example: The regular expression should match only if the variable
'tt' in the GET string contains between 1 and 8 digits.
This behaves correctly:
ereg('^[0-9]{1,8}$', $_GET['tt'])
This behaves wrongly:
ereg('^[0-9]{1,8}$', utf8_decode($_GET['tt']))
---------------------------------------------------
PHP Configure line - './configure'
'--with-apxs2=/usr/local/apache/bin/apxs'
'--with-mysql=/usr/local/mysql/' '--with-mysql-sock=/tmp/mysql.sock'
'--enable-exif' '--with-gd' '--with-jpeg-dir=/usr/lib'
'--with-png-dir=/usr/lib' '--with-zlib-dir=/usr/lib'
'--with-xpm-dir=/usr/lib' '--with-freetype-dir=/usr/lib'
'--with-t1lib=/usr/lib' '--with-freetype-dir=/usr/lib'
'--disable-debug' '--with-config-file-path=/etc/httpd'
'--with-openssl=/usr/local/ssl' '--enable-memory-limit'
----------------------------------------------------
Apache 2.0.49
core mod_access mod_auth mod_include mod_log_config mod_env mod_headers
mod_setenvif mod_ssl prefork http_core mod_mime mod_status
mod_autoindex mod_asis mod_cgi mod_negotiation mod_dir mod_imap
mod_actions mod_alias mod_so mod_expires mod_rewrite mod_deflate
mod_logio sapi_apache2 mod_security
Reproduce code:
---------------
Call this script with the following line in the browser:
http://localhost/test.php?tt=119%f0
<?php
if (isset($_GET['tt']) &&
    eregi('^[0-9]{1,8}$', utf8_decode($_GET['tt']))) {
    // Should only get here if 'tt' is an integer
    echo 'Integer';
} else {
    echo 'Not an integer';
}
?>
Expected result:
----------------
The script should return "Not an integer" in the browser.
Actual result:
--------------
The script returns "Integer" in the browser, even though the GET string
contains other characters.
------------------------------------------------------------------------