Edit report at https://bugs.php.net/bug.php?id=54369&edit=1

 ID:                 54369
 Comment by:         lenzai2004-dev at yahoo dot com
 Reported by:        tomas dot brastavicius at quantum dot lt
 Summary:            [PATCH] parse_url() incorrectly determines the start
                     of query and fragment parts
 Status:             Open
 Type:               Bug
 Package:            URL related
 PHP Version:        Irrelevant
 Block user comment: N
 Private report:     N

 New Comment:

The point is not whether the patch is relevant.

For this bug and other cases, parse_url() returns corrupt results.
It could be fixed in two ways:
- patch it, or
- detect the invalid URL and return an error.

I've been trying to use this function, and once I run it over a significant 
volume of URLs I always find cases where it returns incorrect data.
I had to rewrite everything in PHP, and that is quite slow.
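
The second option (detect and return an error) could be sketched roughly as
below. This is only an illustration, not the proposed patch: the names
host_is_plausible() and parse_url_checked() are hypothetical, and the check
relies on just one assumption, namely that under RFC 3986 the authority ends
at the first "/", "?" or "#", so a host containing any of them is suspect.

```php
<?php
// Hypothetical check: a host must be non-empty and must not contain
// "/", "?" or "#", since those characters terminate the authority.
function host_is_plausible(string $host): bool
{
    return $host !== '' && strpbrk($host, '/?#') === false;
}

// Hypothetical wrapper around parse_url(): returns false instead of a
// result whose host component looks wrong.
function parse_url_checked(string $url)
{
    $parts = parse_url($url);
    if ($parts === false) {
        return false;
    }
    if (isset($parts['host']) && !host_is_plausible($parts['host'])) {
        return false;
    }
    return $parts;
}

// Made-up, well-formed URL; it passes through unchanged.
var_dump(parse_url_checked('http://example.com/dir?x=1#frag'));
```

A wrapper like this only flags one class of bad result; it does not make
parse_url() a validator.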


Previous Comments:
------------------------------------------------------------------------
[2011-05-17 20:12:50] tomas dot brastavicius at quantum dot lt

Changed report name as described in the bug report spec.

------------------------------------------------------------------------
[2011-04-03 19:36:33] tokul at users dot sourceforge dot net

You can't argue that a function is broken and needs fixing if you feed it 
broken data and expect good output. Use valid URLs in your tests if you want 
to show that the function is broken.

------------------------------------------------------------------------
[2011-04-03 18:36:42] tomas dot brastavicius at quantum dot lt

One more comment about this issue: 
http://marc.info/?l=php-internals&m=130183094107548&w=2

------------------------------------------------------------------------
[2011-04-03 18:09:08] tomas dot brastavicius at quantum dot lt

Another comment about this issue: 
http://marc.info/?l=php-internals&m=130183032307080&w=2


@Peter
Yes, according to RFC 1738 the test URLs are not valid. But:

1. Nowhere is it defined that parse_url() parses URLs according to RFC 1738.

2. parse_url() "is not meant to validate given URL". See 
http://php.net/manual/en/function.parse-url.php

3. Why is it better to return an invalid hostname ("#" and "/" are invalid 
characters; this is what the current parse_url() does) than an invalid query 
or fragment (what the patched parse_url() does)?


@tokul at users dot sourceforge dot net
Checked


My arguments for the patch acceptance are as follows:

1. The "Return Values" section of the parse_url() documentation clearly states 
that the query and fragment components start after the "?" and "#" characters 
respectively.

2. I don't know of any specification that allows "#" and "?" in hostnames 
(does anyone?), but I do know of at least RFC 3986 (which, unfortunately, I am 
working with), and it allows the "/" character in both the query and fragment 
parts. See 
http://tools.ietf.org/html/rfc3986.html#section-3.4 and 
http://tools.ietf.org/html/rfc3986.html#section-3.5

3. It has already been stated (although in a different context) that 
parse_url() parses URLs according to RFC 3986. See 
http://bugs.php.net/bug.php?id=50484. Maybe Adam Harvey knows more?
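
For reference, RFC 3986 (Appendix B) gives a regular expression that splits a
URI reference into its components: the query runs from the first "?" to the
first "#", and the fragment is everything after the first "#". A minimal
sketch of applying it in PHP follows. The function name split_uri_rfc3986()
and the sample URL are made up for illustration, and this is not how
parse_url() is implemented.

```php
<?php
// Hypothetical helper: split a URI reference with the regular expression
// from RFC 3986, Appendix B. It performs no validation.
function split_uri_rfc3986(string $uri): array
{
    $pattern = '~^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?~';
    preg_match($pattern, $uri, $m);

    // Trailing groups that did not take part in the match are absent
    // from $m, hence the ?? defaults.
    return [
        'scheme'    => $m[2] ?? '',
        'authority' => $m[4] ?? '',
        'path'      => $m[5] ?? '',
        'query'     => $m[7] ?? '',
        'fragment'  => $m[9] ?? '',
    ];
}

// Sample URL made up for illustration: "/" appears in query and fragment.
$parts = split_uri_rfc3986('http://example.com/dir?q=a/b#top/section');
print_r($parts);
```

With this splitting, a "/" inside the query or fragment stays in that
component and never ends up in the authority.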

------------------------------------------------------------------------
[2011-04-03 14:10:58] tokul at users dot sourceforge dot net

Check the URL encoding documentation first.
http://en.wikipedia.org/wiki/Percent-encoding

Then fix your $url value. You are using a reserved character for another 
purpose.
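
A small sketch of what that means, assuming the reserved characters are meant
as data inside a query value (the value and URL here are made up):

```php
<?php
// Made-up value containing reserved characters ("/" and "#").
$value = 'a/b#c';

// rawurlencode() percent-encodes everything except letters, digits
// and the characters "-", "_", "." and "~".
$url = 'http://example.com/?q=' . rawurlencode($value);

echo $url, "\n"; // http://example.com/?q=a%2Fb%23c
```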

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=54369


-- 
Edit this bug report at https://bugs.php.net/bug.php?id=54369&edit=1
