[issue30500] [security] urllib connects to a wrong host

STINNER Victor Tue, 20 Jun 2017 07:41:36 -0700

STINNER Victor added the comment:

I tested my system python2 (Python 2.7.13 on Fedora 25):


haypo@selma$ python2
Python 2.7.13 (default, May 10 2017, 20:04:28) 
>>> urllib.splithost('//hostname/url')
('hostname', '/url')
>>> urllib.splithost('//host\nname/url')  # newline in hostname, accepted
('host\nname', '/url')
>>> print(urllib.splithost('//host\nname/url')[0])  # newline in hostname, 
>>> accepted
host
name
>>> urllib.splithost('//hostname/ur\nl')  # newline in URL, rejected
(None, '//hostname/ur\nl')

=> Newline is accepted in the hostname, but not in the URL path.


With my change (adding DOTALL), newlines are accepted in the hostname and in 
the URL:

haypo@selma$ ./python
Python 2.7.13+ (heads/2.7:b39a748, Jun 19 2017, 18:07:19) 
>>> import urllib
>>> urllib.splithost("//hostname/url")
('hostname', '/url')
>>> urllib.splithost("//host\nname/url")  # newline in hostname, accepted
('host\nname', '/url')
>>> urllib.splithost("//hostname/ur\nl")  # newline in URL, accepted
('hostname', '/ur\nl')


More generally, it seems like the urllib module doesn't try to validate URL 
content. It just try to "split" them.

Who is responsible to validate URLs? Is it the responsability of the 
application developer to implement a whitelist?

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue30500>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue30500] [security] urllib connects to a wrong host

Reply via email to