From: php-bug-NOSPAM-2004 at ryandesign dot com
Operating system: Mac OS X; FreeBSD; Red Hat Linux
PHP version: 4.3.4
PHP Bug Type: Unknown/Other Function
Bug description: get_browser matches browscap.ini patterns incorrectly
Description:
------------
PHP's get_browser() function does not correctly use the
patterns in the browscap.ini file, resulting in
occasional incorrect matches. This occurred, for
example, when Apple released Safari 1.2, and when
OmniGroup released OmniWeb 5.0b1. These two browsers
were then incorrectly identified as crawlers / robots,
instead of being recognized as normal browsers.
Instead of matching the last rule in the file (which has
the browscap pattern "*" which PHP translates into the
regular expression ".*"), it matches the rule for
Website Strippers (which has the browscap pattern
"Mozilla/5.0" which PHP translates to the regular
expression "Mozilla/5\.0"). Yes, Safari and OmniWeb have
"Mozilla/5.0" as part of their user agent string, but
only part. "Mozilla/5.0" is not the ENTIRE UA string,
which is what the browscap pattern is intending to
define. Had the rule been intended to match
"Mozilla/5.0" at the start of the string, regardless of
what followed, the rule would have been written
"Mozilla/5.0*". But it wasn't. PHP needs to anchor the regular
expression it generates to the beginning and end of the
string to ensure it is matching the portion of the
string the browscap.ini author intended it to match. The
regular expressions PHP should have generated are
"^Mozilla/5\.0$" and "^.*$".
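To illustrate the matching behavior described above, here is a minimal
standalone sketch in C. It is not the code from ext/standard/browscap.c:
the function names browscap_to_regex() and browscap_match() are
hypothetical, and the use of POSIX extended regular expressions
(REG_EXTENDED) is an assumption made for the example. It only shows how
a browscap pattern could be translated into a regular expression that
is anchored at both ends.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: translate a browscap wildcard pattern into an
 * anchored POSIX extended regular expression.
 *   '*' becomes ".*", '?' becomes ".", and regex special characters
 *   are escaped with a backslash.
 * The caller must free() the result. Returns NULL if malloc fails. */
char *browscap_to_regex(const char *pattern)
{
    size_t len = strlen(pattern);
    size_t i, j = 0;
    /* Worst case: every character expands to two, plus '^', '$', NUL. */
    char *t = malloc(len * 2 + 3);

    if (t == NULL) {
        return NULL;
    }

    t[j++] = '^';                       /* anchor to start of string */
    for (i = 0; i < len; i++) {
        char c = pattern[i];
        if (c == '*') {
            t[j++] = '.';
            t[j++] = '*';
        } else if (c == '?') {
            t[j++] = '.';
        } else if (strchr(".[\\()+{|^$", c) != NULL) {
            t[j++] = '\\';              /* escape regex special chars */
            t[j++] = c;
        } else {
            t[j++] = c;
        }
    }
    t[j++] = '$';                       /* anchor to end of string */
    t[j] = '\0';
    return t;
}

/* Hypothetical helper: returns 1 if the whole user agent string matches
 * the browscap pattern, 0 if it does not, and -1 on error. */
int browscap_match(const char *pattern, const char *ua)
{
    regex_t re;
    int matched;
    char *rx = browscap_to_regex(pattern);

    if (rx == NULL) {
        return -1;
    }
    if (regcomp(&re, rx, REG_EXTENDED | REG_NOSUB) != 0) {
        free(rx);
        return -1;
    }
    matched = (regexec(&re, ua, 0, NULL, 0) == 0);
    regfree(&re);
    free(rx);
    return matched;
}
```

With these helpers, the pattern "Mozilla/5.0" becomes "^Mozilla/5\.0$"
and no longer matches the Safari user agent string, while "*" becomes
"^.*$" and still matches it. A pattern written as "Mozilla/5.0*" would
continue to match any string that starts with "Mozilla/5.0".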
Here is a diff of the PHP source code file
ext/standard/browscap.c (from the version in the 4.3.4
release) which seems to correct the problem. The
commenting out of lines 71 to 73 in the original file
(73 to 75 in my version) is not essential and is not
part of the fix for this issue, but was done because
those lines seem to me to be another inaccuracy in PHP's
browscap.ini parsing, and their removal does not seem to
adversely affect the functioning of get_browser(),
although I did not extensively test against many user
agent strings, and I do not know the reason that code
was originally inserted.
50c50
< t = (char *) malloc(Z_STRLEN_P(pattern)*2 + 1);
---
> t = (char *) malloc(Z_STRLEN_P(pattern)*2 + 3);
52c52,54
< for (i=0, j=0; i<Z_STRLEN_P(pattern); i++, j++) {
---
> t[0] = '^';
>
> for (i=0, j=1; i<Z_STRLEN_P(pattern); i++, j++) {
71,73c73,75
< if (j && (t[j-1] == '.')) {
< t[j++] = '*';
< }
---
> // if (j && (t[j-1] == '.')) {
> // t[j++] = '*';
> // }
74a77,78
> t[j++] = '$';
>
Reproduce code:
---------------
Install the browscap.ini file available from www.garykeith.com and modify
the php.ini to use this file. Then run this:
$ua = 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/1999 '
    . '(KHTML, like Gecko) Safari/1999';
$ua_info = (array) get_browser($ua);
print $ua;
print '<pre>';
print_r($ua_info);
print '</pre>';
Expected result:
----------------
The browscap.ini does not know about Safari version
1999. There is no such version; version 1.2 (125) is
the most recent as of February 2004. And, at least in
the version from a week or so ago, the browscap.ini does
not define a generic "Safari" directive that would allow
the browscap.ini to recognize it. So this user agent
string should match the last rule in the file, "Default
Browser", which has the pattern "*".
Actual result:
--------------
It actually matches the pattern "Mozilla/5.0", in the
Website Strippers category.
--
Edit bug report at http://bugs.php.net/?id=27291&edit=1