From:             php-bug-NOSPAM-2004 at ryandesign dot com
Operating system: Mac OS X; FreeBSD; RedHat Linux
PHP version:      4.3.4
PHP Bug Type:     Unknown/Other Function
Bug description:  get_browser matches browscap.ini patterns incorrectly

Description:
------------
PHP's get_browser() function does not correctly use the patterns in
the browscap.ini file, resulting in occasional incorrect matches. This
occurred, for example, when Apple released Safari 1.2, and when
OmniGroup released OmniWeb 5.0b1. These two browsers were then
incorrectly identified as crawlers / robots, instead of being
recognized as normal browsers.



Instead of matching the last rule in the file (which has the browscap
pattern "*", which PHP translates into the regular expression ".*"),
it matches the rule for Website Strippers (which has the browscap
pattern "Mozilla/5.0", which PHP translates to the regular expression
"Mozilla/5\.0"). Yes, Safari and OmniWeb have "Mozilla/5.0" as part of
their user agent string, but only part. "Mozilla/5.0" is not the
ENTIRE UA string, which is what the browscap pattern is intended to
define. Had the rule been intended to match "Mozilla/5.0" at the start
of the string, regardless of what followed, the rule would have been
written "Mozilla/5.0*". But it wasn't. Because the generated regular
expression is unanchored, it matches anywhere inside the UA string.
PHP needs to anchor the regular expression it generates to the
beginning and end of the string to ensure it is matching the portion
of the string the browscap.ini author intended it to match. The
regular expressions PHP should have generated are "^Mozilla/5\.0$" and
"^.*$".

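The difference between the unanchored and anchored expressions can be
checked outside PHP. The following is a minimal sketch, assuming the
generated expressions are evaluated with POSIX regcomp()/regexec() in
basic-regex mode; ua_matches() is a hypothetical helper written for
this illustration and is not part of PHP.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of PHP): returns 1 if `subject` matches
 * the POSIX basic regular expression `pattern`, 0 if it does not, and
 * -1 if the pattern fails to compile.
 */
int ua_matches(const char *pattern, const char *subject)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && subject != NULL);

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, REG_NOSUB) != 0) {
        return -1;
    }

    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

Called with the Safari/1999 user agent string from the reproduce code,
ua_matches("Mozilla/5\\.0", ua) reports a match because the expression
is found at the start of the string, while
ua_matches("^Mozilla/5\\.0$", ua) does not, and ua_matches("^.*$", ua)
does.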


Here is a diff of the PHP source code file ext/standard/browscap.c
(from the version in the 4.3.4 release) which seems to correct the
problem. The allocation grows by two bytes to make room for the added
"^" and "$". The commenting out of lines 71 to 73 in the original file
(73 to 75 in my version) is not essential and is not part of the fix
for this issue. I did it because those lines seem to me to be another
inaccuracy in PHP's browscap.ini parsing, and their removal does not
seem to adversely affect the functioning of get_browser(). However, I
did not test extensively against many user agent strings, and I do not
know why that code was originally inserted.



50c50
<       t = (char *) malloc(Z_STRLEN_P(pattern)*2 + 1);
---
>       t = (char *) malloc(Z_STRLEN_P(pattern)*2 + 3);
52c52,54
<       for (i=0, j=0; i<Z_STRLEN_P(pattern); i++, j++) {
---
>       t[0] = '^';
> 
>       for (i=0, j=1; i<Z_STRLEN_P(pattern); i++, j++) {
71,73c73,75
<       if (j && (t[j-1] == '.')) {
<               t[j++] = '*';
<       }
---
> //    if (j && (t[j-1] == '.')) {
> //            t[j++] = '*';
> //    }
74a77,78
>       t[j++] = '$';
> 

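For readers without the PHP source at hand, the effect of the patch
can be sketched as a standalone function. This is only an illustration
under stated assumptions, not the code in ext/standard/browscap.c: the
name browscap_pattern_to_regex() is hypothetical, the "*" and "."
translations are the ones described in this report, and the mapping of
"?" to "." is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not PHP's actual function): converts a
 * browscap-style pattern into an anchored regular expression.
 * The caller must free() the returned string. Returns NULL if the
 * allocation fails.
 */
char *browscap_pattern_to_regex(const char *pattern)
{
    size_t len, i, j = 0;
    char *t;

    assert(pattern != NULL);
    len = strlen(pattern);

    /* Worst case: every character expands to two characters,
     * plus '^', '$' and the terminating NUL (hence the "+ 3"). */
    t = malloc(len * 2 + 3);
    if (t == NULL) {
        return NULL;
    }

    t[j++] = '^';                 /* anchor at the start of the string */

    for (i = 0; i < len; i++) {
        switch (pattern[i]) {
        case '*':                 /* any run of characters */
            t[j++] = '.';
            t[j++] = '*';
            break;
        case '?':                 /* any single character (assumed) */
            t[j++] = '.';
            break;
        case '.':                 /* literal dot must be escaped */
            t[j++] = '\\';
            t[j++] = '.';
            break;
        default:
            t[j++] = pattern[i];
            break;
        }
    }

    t[j++] = '$';                 /* anchor at the end of the string */
    t[j] = '\0';

    return t;
}
```

With this sketch, the pattern "Mozilla/5.0" becomes "^Mozilla/5\.0$"
and the pattern "*" becomes "^.*$", which are the two expressions
named in the description.
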
Reproduce code:
---------------
Install the browscap.ini file available from www.garykeith.com and set
the browscap directive in php.ini to point to this file. Then run this:

<?php
// A made-up Safari version that browscap.ini cannot know about.
// The string must be on a single line.
$ua = 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/1999 (KHTML, like Gecko) Safari/1999';

$ua_info = (array) get_browser($ua);

print $ua;

print '<pre>';
print_r($ua_info);
print '</pre>';
?>

Expected result:
----------------
The browscap.ini does not know about Safari version 1999. There is no
such version; version 1.2 (125) is the most recent as of February
2004. And, at least in the version from a week or so ago, the
browscap.ini does not define a generic "Safari" directive that would
allow the browscap.ini to recognize it. So this user agent string
should match the last rule in the file, "Default Browser", which has
the pattern "*".

Actual result:
--------------
It actually matches the pattern "Mozilla/5.0", in the Website
Strippers category.

-- 
Edit bug report at http://bugs.php.net/?id=27291&edit=1
