Yassine ABOUKIR added the comment:

"Following the syntax specifications in RFC 1808, urlparse recognizes a netloc
only if it is properly introduced by ‘//’. Otherwise the input is presumed to
be a relative URL and thus to start with a path component."

https://docs.python.org/2/library/urlparse.html
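
The rule quoted above can be checked directly. The following is a minimal
sketch, assuming Python 3, where the urlparse module lives in urllib.parse;
"evil.com/path" is only an illustrative input.

```python
from urllib.parse import urlparse

# Introduced by '//': the host is recognized as the netloc.
with_slashes = urlparse("//evil.com/path")
print(repr(with_slashes.netloc), repr(with_slashes.path))

# No leading '//': the whole string is treated as a relative path.
without_slashes = urlparse("evil.com/path")
print(repr(without_slashes.netloc), repr(without_slashes.path))
```

In the second call the netloc is empty, so any check that relies on netloc
alone would not see "evil.com" as a host.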

2015-03-03 22:16 GMT+00:00 Paul McMillan <>:

    Yeah. I agree the lack of round trip is surprising, and I agree we
    should fix it.

    I think the underlying issue here is that urlparse has a pretty
    different view of the world when compared with the browsers. I know
    that bit me when I first started using Python, and it periodically
    surfaces in cases like this, where the browser thinks that
    "//evil.com" is a URL, but we've parsed it as part of a path.
    Backwards compatibility makes it hard to update urlparse to precisely
    match browser behavior, but there's probably room for a new library
    designed with browser compatibility as a primary feature.

    -Paul
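
To see the round trip discussed here, the sketch below parses a URL and
reassembles it with urlunparse. It assumes Python 3's urllib.parse; the helper
name round_trip is hypothetical, and the reassembled string for the
"////evil.com" case may vary between Python releases, so it is printed rather
than assumed.

```python
from urllib.parse import urlparse, urlunparse


def round_trip(url):
    """Parse url and rebuild it (hypothetical helper, for illustration only)."""
    return urlunparse(urlparse(url))


url = "////evil.com"
parts = urlparse(url)
# No netloc is recognized; the path keeps two leading slashes.
print(repr(parts.netloc), repr(parts.path))

rebuilt = round_trip(url)
# If rebuilt differs from url, the round trip is lossy. A rebuilt value of
# "//evil.com" is what a browser would read as a reference to host evil.com.
print(repr(rebuilt), rebuilt == url)
```

Comparing rebuilt with the original string is a cheap way to detect inputs
for which parse/unparse is not an identity.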

    On Tue, Mar 3, 2015 at 10:07 PM, Antoine Pitrou <> wrote:
    >
    > Hi Paul,
    >
    > On 03/03/2015 23:01, Paul McMillan wrote:
    >> I understand how this works. You don't need to paste the example again.
    >>
    >> The documentation makes no guarantee that parse/unparse will do what
    >> you want them to do, and does explicitly lay out the specific rules
    >> used for separating the parts.
    >
    > Well, I don't know if it's a security issue, but failure to roundtrip
    > *is* surprising (and IMHO dangerous for that reason) behaviour to say
    > the least.
    >
    > Moreover, the urlunparse() documentation (in 3.x) says:
    > """
    > Construct a URL from a tuple as returned by urlparse(). [...] This may
    > result in a slightly different, but equivalent URL, if the URL that was
    > parsed originally had unnecessary delimiters
    > """
    > (https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlunparse)
    >
    > which implies that any divergence when roundtripping should only consist
    > in cosmetic, not essential, differences ("equivalent URL").
    >
    > Regards
    >
    > Antoine.
    > -----------------------------
    > Python Security Response Team
    > Unsubscribe: https://mail.python.org/mailman/options/psrt/paul%40mcmillan.ws

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23505>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
