ID:               27291
 Comment by:       php_bug_27291 at garykeith dot com
 Reported By:      php-bug-NOSPAM-2004 at ryandesign dot com
 Status:           Closed
 Bug Type:         *General Issues
 Operating System: Mac OS X; FreeBSD; RedHat Linux
 PHP Version:      4.3.4
 New Comment:

Respectfully, my latest browscap.ini does not detect all arbitrary
versions of Safari. I'm not sure how you arrived at that conclusion.



I do know that I receive e-mails nearly every day about this issue so
there is obviously a problem somewhere.



I don't know who is working on the code for get_browser() these days
but I wish they would contact me so we could come to some sort of
understanding about how to properly parse my file the way browscap.dll
does. I am growing very weary of my files and efforts taking the blame
for the non-stop stream of bugs that emanate from get_browser().



Thanks,

~gary.


Previous Comments:
------------------------------------------------------------------------

[2004-02-17 18:08:59] php-bug-NOSPAM-2004 at ryandesign dot com

I downloaded and compiled 4.3.5RC3 and the issue remains. I am not
versed in CVS, and I was unable to compile the version I got from the
CVS server. It complained about requiring libxml, which I did not
request. I did not see any changes in the browscap.c file on the CVS
server (when viewed through its web interface) which would account for
any change in its behavior. I will test again with 4.3.5 final when it
is released.

Perhaps a better UA string to test would be this...

Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) UnknownWebKit/555 (KHTML, like Gecko) UnknownBrowser/444

...since the 2/15/2004 browscap.ini from garykeith.com does now detect
arbitrary Safari versions. The above UA, however, still is identified
as a Website Stripper, though it should be identified as a Default
Browser.

------------------------------------------------------------------------

[2004-02-17 16:12:58] [EMAIL PROTECTED]

Using the latest stable CVS snapshot, it does match "Default Browser".



------------------------------------------------------------------------

[2004-02-17 14:01:22] php-bug-NOSPAM-2004 at ryandesign dot com

Description:
------------
PHP's get_browser() function does not correctly use the patterns in the
browscap.ini file, resulting in occasional incorrect matches. This
occurred, for example, when Apple released Safari 1.2, and when
OmniGroup released OmniWeb 5.0b1. These two browsers were then
incorrectly identified as crawlers / robots, instead of being
recognized as normal browsers.

Instead of matching the last rule in the file (which has the browscap
pattern "*" which PHP translates into the regular expression ".*"), it
matches the rule for Website Strippers (which has the browscap pattern
"Mozilla/5.0" which PHP translates to the regular expression
"Mozilla/5\.0"). Yes, Safari and OmniWeb have "Mozilla/5.0" as part of
their user agent string, but only part. "Mozilla/5.0" is not the ENTIRE
UA string, which is what the browscap pattern is intending to define.
Had the rule been intended to match "Mozilla/5.0" at the start of the
string, regardless of what followed, the rule would have been written
"Mozilla/5.0*". But it wasn't. PHP needs to anchor the regular
expression it generates to the beginning and end of the string to
ensure it is matching the portion of the string the browscap.ini author
intended it to match. The regular expressions PHP should have generated
are "^Mozilla/5\.0$" and "^.*$".

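The anchoring described above could be sketched in C roughly as
follows. This is only an illustration, not the code in
ext/standard/browscap.c: the helper names browscap_to_regex() and
browscap_match() are hypothetical, and POSIX regcomp()/regexec() are
used as a stand-in for whatever regex engine PHP links against.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: translate a browscap wildcard pattern into a
 * POSIX extended regex anchored at both ends. Caller frees the result. */
static char *browscap_to_regex(const char *pattern)
{
    size_t len, i, j = 0;
    char *t;

    assert(pattern != NULL);
    len = strlen(pattern);
    /* Worst case: every char becomes two, plus '^', '$' and the NUL. */
    t = malloc(len * 2 + 3);
    if (t == NULL) {
        return NULL;
    }

    t[j++] = '^';
    for (i = 0; i < len; i++) {
        switch (pattern[i]) {
        case '*':               /* browscap "*" means any run of chars */
            t[j++] = '.';
            t[j++] = '*';
            break;
        case '?':               /* browscap "?" means any single char */
            t[j++] = '.';
            break;
        case '.': case '\\': case '+': case '(': case ')':
        case '[': case '{': case '|': case '^': case '$':
            t[j++] = '\\';      /* escape regex metacharacters */
            t[j++] = pattern[i];
            break;
        default:
            t[j++] = pattern[i];
        }
    }
    t[j++] = '$';
    t[j] = '\0';
    return t;
}

/* Hypothetical helper: returns 1 if the whole UA string matches the
 * browscap pattern, 0 otherwise (including on allocation failure). */
static int browscap_match(const char *pattern, const char *ua)
{
    regex_t re;
    int matched = 0;
    char *rx = browscap_to_regex(pattern);

    if (rx == NULL) {
        return 0;
    }
    if (regcomp(&re, rx, REG_EXTENDED | REG_NOSUB) == 0) {
        matched = (regexec(&re, ua, 0, NULL, 0) == 0);
        regfree(&re);
    }
    free(rx);
    return matched;
}
```

With this sketch, the pattern "Mozilla/5.0" no longer matches a full
Safari UA string, while "Mozilla/5.0*" and "*" still do.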


Here is a diff of the PHP source code file ext/standard/browscap.c
(from the version in the 4.3.4 release) which seems to correct the
problem. The commenting out of lines 71 to 73 in the original file (73
to 75 in my version) is not essential and is not part of the fix for
this issue, but was done because those lines seem to me to be another
inaccuracy in PHP's browscap.ini parsing, and their removal does not
seem to adversely affect the functioning of get_browser(), although I
did not extensively test against many user agent strings, and I do not
know the reason that code was originally inserted.



50c50
<       t = (char *) malloc(Z_STRLEN_P(pattern)*2 + 1);
---
>       t = (char *) malloc(Z_STRLEN_P(pattern)*2 + 3);
52c52,54
<       for (i=0, j=0; i<Z_STRLEN_P(pattern); i++, j++) {
---
>       t[0] = '^';
> 
>       for (i=0, j=1; i<Z_STRLEN_P(pattern); i++, j++) {
71,73c73,75
<       if (j && (t[j-1] == '.')) {
<               t[j++] = '*';
<       }
---
> //    if (j && (t[j-1] == '.')) {
> //            t[j++] = '*';
> //    }
74a77,78
>       t[j++] = '$';
> 

Reproduce code:
---------------
Install the browscap.ini file available from www.garykeith.com and
modify the php.ini to use this file. Then run this:



$ua = 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/1999 (KHTML, like Gecko) Safari/1999';

$ua_info = (array) get_browser($ua);

print $ua;

print '<pre>';
print_r($ua_info);
print '</pre>';

Expected result:
----------------
The browscap.ini does not know about Safari version 1999. There is no
such version; version 1.2 (125) is the most recent as of February 2004.
And, at least in the version from a week or so ago, the browscap.ini
does not define a generic "Safari" directive that would allow the
browscap.ini to recognize it. So this user agent string should match
the last rule in the file, "Default Browser", which has the pattern
"*".

Actual result:
--------------
It actually matches the pattern "Mozilla/5.0", in the Website Strippers
category.


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=27291&edit=1
