#21226 [Bgs-Opn]: function parse_url() fails
 ID:               21226
 Updated by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
-Status:           Bogus
+Status:           Open
 Bug Type:         *URL Functions
 Operating System: w2000
 PHP Version:      4.3.0
 Assigned To:      iliaa
 New Comment:

Reopening this bug. A closer look at RFC 2396 indicates that:

...
   This "generic URI" syntax consists of a sequence of four main
   components:

      <scheme>://<authority><path>?<query>
...
      absoluteURI   = scheme ":" ( hier_part | opaque_part )

   URI that are hierarchical in nature use the slash "/" character for
   separating hierarchical components.
...
      hier_part     = ( net_path | abs_path ) [ "?" query ]
      net_path      = "//" authority [ abs_path ]
      abs_path      = "/"  path_segments

   URI that do not make use of the slash "/" character for separating
   hierarchical components are considered opaque by the generic URI
   parser.

      opaque_part   = uric_no_slash *uric
      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                      "&" | "=" | "+" | "$" | ","
...

Later, in section 3.3 of that RFC, the syntax of the path component is clarified. A similar clarification is made in section 3.2 on what is considered a correct authority component.

Bottom line: the $url given by the bug reporter mostly conforms to a hierarchical URI, although it is not the usual case. Section 3.2, which deals w/ the authority component, states that:

...
   The authority component is preceded by a double slash "//" and is
   terminated by the next slash "/", question-mark "?", or by the end
   of the URI. Within the authority component, the characters ";", ":",
   "@", "?", and "/" are reserved.
...

And that is reinforced in the BNF syntax later in the RFC.

Not sure if all web servers will correctly interpret a URL w/o a path but w/ a query part immediately after the authority part, given that the "/" in the path is usually mapped internally by the server to wherever the physical files are in the filesystem.
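To illustrate the authority rule from RFC 2396 section 3.2: Appendix B of the same RFC gives a regular expression for splitting a URI reference into its generic components. The sketch below applies that expression in PHP. The function name split_uri() and the sample URLs are assumptions made for illustration only, and this is not a description of how parse_url() is implemented.

```php
<?php
// Regular expression from RFC 2396, Appendix B. Group 2 is the scheme,
// group 4 the authority, group 5 the path, group 7 the query and
// group 9 the fragment.
const RFC2396_SPLIT = '!^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?!';

// Hypothetical helper: split a URI reference into its generic components.
function split_uri(string $uri): array
{
    preg_match(RFC2396_SPLIT, $uri, $m);

    // Trailing groups that did not take part in the match are absent
    // from $m, so fall back to an empty string.
    return [
        'scheme'    => $m[2] ?? '',
        'authority' => $m[4] ?? '',
        'path'      => $m[5] ?? '',
        'query'     => $m[7] ?? '',
        'fragment'  => $m[9] ?? '',
    ];
}

// Assumed sample: a query directly after the port, with no path.
// The authority ends at the "?", so the path is empty.
print_r(split_uri('http://www.example.com:8080?foo=1'));

// Assumed sample: the usual form, where "/" ends the authority.
print_r(split_uri('http://www.example.com:8080/foo.php?bar=1&boom=0'));
```

In the first call the authority comes out as www.example.com:8080, the path is empty and the query is foo=1, which matches the reading that "?" may terminate the authority. Splitting the authority further into user, host and port is a separate step that this sketch leaves out.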
The following code works as expected:

    $url = "http://user:[EMAIL PROTECTED]:8080/foo.php?bar=1&boom=0";
    print_r(parse_url($url));

Giving as output:

    Array
    (
        [scheme] => http
        [host] => www.example.com
        [port] => 8080
        [user] => user
        [pass] => passwd
        [path] => /foo.php
        [query] => bar=1&boom=0
    )

Tested w/ current CVS head on a RH Linux 6.1 machine:

    $ php_cvs -v
    PHP 4.4.0-dev (cli) (built: Dec 27 2002 14:00:56)
    Copyright (c) 1997-2002 The PHP Group
    Zend Engine v1.4.0, Copyright (c) 1998-2002 Zend Technologies

as well as 4.3.0 (on the same OS):

    $ php -v
    PHP 4.3.0 (cli) (built: Dec 29 2002 23:59:53)
    Copyright (c) 1997-2002 The PHP Group
    Zend Engine v1.3.0, Copyright (c) 1998-2002 Zend Technologies

Previous Comments:

[2002-12-28 09:10:25] [EMAIL PROTECTED]

Thank you for your work on my report, but I'm surprised you marked this report as bogus, since:

- 'port' was not a number in my example, but that was only to make it easier to understand. The warning is the same with a numeric port.
- The same example worked without any warning in 4.2 and earlier.
- The parse_url function manual says nothing about a slash being required before the path or query part of the URL (I triple-checked ;-)
- RFC 2396 (Uniform Resource Identifiers (URI): Generic Syntax) doesn't specify that you *MUST* have a / after the port part.

So if you consider these points, you should modify either the parse_url function or the parse_url documentation. But please do not just mark the report as bogus!!! At worst, mark it as 'closed'.

Thanks for your help.

[2002-12-28 00:39:43] [EMAIL PROTECTED]

Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://www.php.net/manual/ and the instructions on how to report a bug at http://bugs.php.net/how-to-report.php

A port can only be numeric (digits 0-9), although in reality the port range is 1-65535. Clearly a non-numeric port number is not valid, hence invalidating the passed URL.
The 2nd example is also wrong: without the '/' between the port and the rest of the request, the code MUST assume that the following data is part of the port, hence the URL is not valid once again. This is NOT a bug.

[2002-12-27 19:21:59] [EMAIL PROTECTED]

Adding a / after the end of the port part is a good solution. Thanks. Do you consider this a bug, or is parse_url compliant with the URL RFC?

[2002-12-27 19:04:18] [EMAIL PROTECTED]

It seems to come from the 'port' part of the URL. If we consider this :
#21226 [Bgs-Opn]: function parse_url() fails
 ID:               21226
 Updated by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
-Status:           Bogus
+Status:           Open
 Bug Type:         *URL Functions
 Operating System: w2000
 PHP Version:      4.3.0
 Assigned To:      iliaa

Previous Comments:

[2002-12-30 02:31:30] [EMAIL PROTECTED]

Sorry, but your problem does not imply a bug in PHP itself. For a list of more appropriate places to ask for help using PHP, please visit http://www.php.net/support.php as this bug system is not the appropriate forum for asking support questions. Thank you for your interest in PHP.

There are two things wrong with your $url value:

1) "portnumber" is not a valid port number; this must be a number.
2) There must be a / between the port number and ANYTHING following. This also includes the ?foo query string specified in your case, where the document requested is the directory index.

[2002-12-30 02:03:43] [EMAIL PROTECTED]

Reopening this bug. A closer look at RFC 2396 indicates that:

...
   This "generic URI" syntax consists of a sequence of four main
   components:

      <scheme>://<authority><path>?<query>
...
      absoluteURI   = scheme ":" ( hier_part | opaque_part )

   URI that are hierarchical in nature use the slash "/" character for
   separating hierarchical components.
...
      hier_part     = ( net_path | abs_path ) [ "?" query ]
      net_path      = "//" authority [ abs_path ]
      abs_path      = "/"  path_segments

   URI that do not make use of the slash "/" character for separating
   hierarchical components are considered opaque by the generic URI
   parser.

      opaque_part   = uric_no_slash *uric
      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                      "&" | "=" | "+" | "$" | ","
...

Later, in section 3.3 of that RFC, the syntax of the path component is clarified. A similar clarification is made in section 3.2 on what is considered a correct authority component.

Bottom line: the $url given by the bug reporter mostly conforms to a hierarchical URI, although it is not the usual case. Section 3.2, which deals w/ the authority component, states that:

...
   The authority component is preceded by a double slash "//" and is
   terminated by the next slash "/", question-mark "?", or by the end
   of the URI. Within the authority component, the characters ";", ":",
   "@", "?", and "/" are reserved.
...

And that is reinforced in the BNF syntax later in the RFC.

Not sure if all web servers will correctly interpret a URL w/o a path but w/ a query part immediately after the authority part, given that the "/" in the path is usually mapped internally by the server to wherever the physical files are in the filesystem.

The following code works as expected:

    $url = "http://user:[EMAIL PROTECTED]:8080/foo.php?bar=1&boom=0";
    print_r(parse_url($url));

Giving as output:

    Array
    (
        [scheme] => http
        [host] => www.example.com
        [port] => 8080
        [user] => user
        [pass] => passwd
        [path] => /foo.php
        [query] => bar=1&boom=0
    )

Tested w/ current CVS head on a RH Linux 6.1 machine:

    $ php_cvs -v
    PHP 4.4.0-dev (cli) (built: Dec 27 2002 14:00:56)
    Copyright (c) 1997-2002 The PHP Group
    Zend Engine v1.4.0, Copyright (c) 1998-2002 Zend Technologies

as well as 4.3.0 (on the same OS):

    $ php -v
    PHP 4.3.0 (cli) (built: Dec 29 2002 23:59:53)
    Copyright (c) 1997-2002 The PHP Group
    Zend Engine v1.3.0, Copyright (c) 1998-2002 Zend Technologies

[2002-12-28 09:10:25] [EMAIL PROTECTED]

Thank you for your work on my report, but I'm surprised you marked this report as bogus, since:

- 'port' was not a number in my example, but that was only to make it easier to understand. The warning is the same with a numeric port.
- The same example worked without any warning in 4.2 and earlier.
- The parse_url function manual says nothing about a slash being required before the path or query part of the URL (I triple-checked ;-)
- RFC 2396 (Uniform Resource Identifiers (URI): Generic Syntax) doesn't specify that you *MUST* have a / after the port part.

So if you consider these points, you should modify either the parse_url function or the parse_url documentation. But please do not just mark the report as bogus!!! At worst, mark it as 'closed'.
Thanks for your help.

[2002-12-28 00:39:43] [EMAIL PROTECTED]

Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://www.php.net/manual/ and the instructions on how to report a bug at http://bugs.php.net/how-to-report.php

Port can only be a numeric