[issue36338] urlparse of urllib returns wrong hostname

2021-12-02 Thread STINNER Victor
Change by STINNER Victor: -- nosy: -vstinner

[issue36338] urlparse of urllib returns wrong hostname

2021-12-02 Thread Irit Katriel
Irit Katriel added the comment: It produces a deprecation warning on 3.11, but still does the same.
>>> urlparse('http://benign.com\[attacker.com]').hostname
<stdin>:1: DeprecationWarning: invalid escape sequence '\['
'attacker.com'
-- nosy: +iritkatriel versions: +Python 3.10, Python 3.11

[issue36338] urlparse of urllib returns wrong hostname

2019-10-24 Thread STINNER Victor
STINNER Victor added the comment: OMG parsing an URL is a can of worms... There are so many open issues related to URL parsing!
* bpo-18191: urllib.parse.splitport("::1")
* bpo-20271: urllib.parse.urlparse('http://[::1]spam:80')
* bpo-28841: urlparse.urlparse() parses invalid URI without

[issue36338] urlparse of urllib returns wrong hostname

2019-10-15 Thread STINNER Victor
STINNER Victor added the comment: I modified my PR 16780 to also fix bpo-33342: "urllib IPv6 parsing fails with special characters in passwords".

[issue36338] urlparse of urllib returns wrong hostname

2019-10-14 Thread STINNER Victor
STINNER Victor added the comment: I proposed PR 16780, which makes the urllib.parse module much stricter:
* the IPv6 address is validated by the ipaddress.IPv6Address() parser
* invalid characters are rejected in the IPv6 zone: "%", "[" and "]"
* the port number is now validated when parsing
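The checks listed in this comment can be sketched roughly as follows. This is only an illustration of the idea, not the code from PR 16780: the helper names check_ipv6_host and check_port are hypothetical, and the 0-65535 port range is an assumption, since the comment is cut off before it says how the port is validated.

```python
import ipaddress


def check_ipv6_host(bracketed):
    """Validate the text found between '[' and ']' (hypothetical helper).

    Returns the text unchanged, or raises ValueError.
    """
    # Split off an optional zone identifier such as "%eth0".
    address, has_zone, zone = bracketed.partition("%")
    # Reject an empty zone or a zone containing "%", "[" or "]".
    if has_zone and (not zone or any(ch in zone for ch in "%[]")):
        raise ValueError(f"invalid IPv6 zone: {zone!r}")
    # AddressValueError is a subclass of ValueError.
    ipaddress.IPv6Address(address)
    return bracketed


def check_port(text):
    """Validate a port string (assumed rule: ASCII digits, 0..65535)."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"port is not a decimal number: {text!r}")
    port = int(text)
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```

With these helpers, check_ipv6_host("attacker.com") raises ValueError, while check_ipv6_host("::1") and check_port("80") succeed.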

[issue36338] urlparse of urllib returns wrong hostname

2019-10-14 Thread STINNER Victor
Change by STINNER Victor: -- pull_requests: +16342 pull_request: https://github.com/python/cpython/pull/16780

[issue36338] urlparse of urllib returns wrong hostname

2019-09-10 Thread STINNER Victor
STINNER Victor added the comment: Python 3.5 and newer are impacted, but Python 2.7 behaves differently:
vstinner@apu$ python2
Python 2.7.16 (default, Apr 30 2019, 15:54:43)
>>> from urlparse import urlparse
>>> urlparse('http://demo.com[attacker.com]').hostname
'emo.com[attacker.com'

[issue36338] urlparse of urllib returns wrong hostname

2019-09-10 Thread STINNER Victor
STINNER Victor added the comment: To be clear, the \ in 'http://benign.com\[attacker.com]' is not needed to reproduce the bug:
vstinner@apu$ python3
Python 3.7.4 (default, Jul 9 2019, 16:32:37)
>>> from urllib.parse import urlparse;
>>>

[issue36338] urlparse of urllib returns wrong hostname

2019-09-09 Thread Christian Heimes
Christian Heimes added the comment: The guidelines https://url.spec.whatwg.org/#host-parsing make a lot of sense to me. Python should refuse hostnames with "[" unless
* the hostname starts with "["
* the hostname ends with "]"
* the string between [] is a valid IPv6 address (full or
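The rule proposed in this comment could look roughly like the sketch below. The function name host_is_acceptable is hypothetical, and treating a stray "]" the same way as a stray "[" is an assumption added here for symmetry; the comment itself only mentions "[".

```python
import ipaddress


def host_is_acceptable(host):
    """Return True if the host passes the bracket rule (hypothetical helper)."""
    # Hosts without any bracket are outside the scope of this rule.
    if "[" not in host and "]" not in host:
        return True
    # Brackets are only allowed as the first and last character.
    if not (host.startswith("[") and host.endswith("]")):
        return False
    # The text between the brackets must parse as an IPv6 address.
    try:
        ipaddress.IPv6Address(host[1:-1])
    except ValueError:
        return False
    return True
```

Under this rule "[::1]" and "benign.com" are accepted, while "benign.com[attacker.com]", "[attacker.com]" and "[::1]spam" are refused.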

[issue36338] urlparse of urllib returns wrong hostname

2019-09-09 Thread Christian Heimes
Change by Christian Heimes: -- priority: normal -> high versions: +Python 3.9

[issue36338] urlparse of urllib returns wrong hostname

2019-08-07 Thread Xianbo Wang
Xianbo Wang added the comment: Python2 urlparse.urlparse and urllib2.urlparse.urlparse have a similar IPv6 hostname parsing bug.
>>> urlparse.urlparse('http://nevil.com[]').hostname
>>> 'evil.com['
This is less practical to exploit since the parsed domain contains a '[' in the end. Do I

[issue36338] urlparse of urllib returns wrong hostname

2019-07-21 Thread jpic
Change by jpic: -- nosy: +jpic

[issue36338] urlparse of urllib returns wrong hostname

2019-07-21 Thread jpic
Change by jpic: -- pull_requests: +14677 pull_request: https://github.com/python/cpython/pull/14896

[issue36338] urlparse of urllib returns wrong hostname

2019-05-15 Thread Inada Naoki
Change by Inada Naoki: -- pull_requests: -13146

[issue36338] urlparse of urllib returns wrong hostname

2019-05-10 Thread Pierre Glaser
Change by Pierre Glaser: -- pull_requests: +13146

[issue36338] urlparse of urllib returns wrong hostname

2019-03-27 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: I found this page to be useful: https://url.spec.whatwg.org/#host-parsing. Following the steps, it seems that this should raise an error, since the 7th step states that asciiDomain shouldn't contain a forbidden host code point, including

[issue36338] urlparse of urllib returns wrong hostname

2019-03-27 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan: -- pull_requests: -12526

[issue36338] urlparse of urllib returns wrong hostname

2019-03-27 Thread Ronald Oussoren
Ronald Oussoren added the comment: Given a quick scan of RFC 3986[1] I'd say that the behaviour of Ruby seems to be the most correct. That said, I'd also check what the major browsers do in this case (FWIW both FF and Safari use 'benign.com' as the hostname in this case). [1]

[issue36338] urlparse of urllib returns wrong hostname

2019-03-27 Thread Pierre Glaser
Change by Pierre Glaser: -- pull_requests: +12526

[issue36338] urlparse of urllib returns wrong hostname

2019-03-27 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan: -- pull_requests: -12525

[issue36338] urlparse of urllib returns wrong hostname

2019-03-27 Thread Pierre Glaser
Change by Pierre Glaser: -- pull_requests: +12525 stage: -> patch review

[issue36338] urlparse of urllib returns wrong hostname

2019-03-22 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: See also issue20271, which discusses the other format http://[::1]spam, where ::1 is returned as the hostname. urlparse tries to parse the hostname as an IPv6 address when there is a [ and parses until ] at [0], thus "benign.com\[attacker.com]" is treated as a
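The bracket handling described in this comment can be re-created in a few lines. The sketch below is a simplified illustration of that behaviour, not the code in urllib.parse, and the name naive_hostinfo is hypothetical.

```python
def naive_hostinfo(netloc):
    """Split a netloc into (hostname, port) the lenient way (hypothetical)."""
    # Drop any "user:password@" prefix.
    _, _, hostinfo = netloc.rpartition("@")
    # Anything before the first "[" is discarded without being checked.
    _, has_open_bracket, bracketed = hostinfo.partition("[")
    if has_open_bracket:
        # The text up to the first "]" becomes the hostname.
        hostname, _, rest = bracketed.partition("]")
        # Anything between "]" and ":" is silently ignored as well.
        _, _, port = rest.partition(":")
    else:
        hostname, _, port = hostinfo.partition(":")
    return hostname, (port or None)
```

Calling naive_hostinfo("benign.com\\[attacker.com]") returns ("attacker.com", None), and naive_hostinfo("[::1]spam:80") returns ("::1", "80"): in both cases the text outside the brackets is dropped, which matches the two reports.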

[issue36338] urlparse of urllib returns wrong hostname

2019-03-21 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan: -- pull_requests: -12435

[issue36338] urlparse of urllib returns wrong hostname

2019-03-21 Thread Pierre Glaser
Change by Pierre Glaser: -- keywords: +patch pull_requests: +12435 stage: -> patch review

[issue36338] urlparse of urllib returns wrong hostname

2019-03-18 Thread Stéphane Wirtel
Change by Stéphane Wirtel: -- versions: +Python 3.8

[issue36338] urlparse of urllib returns wrong hostname

2019-03-18 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan: -- nosy: +martin.panter

[issue36338] urlparse of urllib returns wrong hostname

2019-03-18 Thread Stéphane Wirtel
Stéphane Wirtel added the comment: Here is a unittest where you can test this issue and the result on Python 3.8.0a2 and 3.7.2
>>> 3.8.0a2
./python /tmp/test_bug_36338.py
/tmp/test_bug_36338.py:8: SyntaxWarning: invalid escape sequence \[
url = 'http://demo.com\[attacker.com]'
3.8.0a2+

[issue36338] urlparse of urllib returns wrong hostname

2019-03-18 Thread Stéphane Wirtel
Stéphane Wirtel added the comment: I can confirm with 3.7.2 on Fedora 29 -- nosy: +matrixise, orsenthil

[issue36338] urlparse of urllib returns wrong hostname

2019-03-18 Thread Xianbo Wang
New submission from Xianbo Wang: The urlparse function in Python's urllib returns the wrong hostname when parsing a URL crafted by a malicious user. This may be caused by incorrect handling of IPv6 addresses. The bug could lead to open redirects in web applications which rely on urlparse to
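To make the open-redirect risk concrete, here is a minimal sketch of a redirect check that does not trust the parsed hostname alone. The allow-list ALLOWED_HOSTS and the function is_safe_redirect are hypothetical and not taken from the report. Refusing every netloc that contains a bracket or a backslash is a deliberately blunt assumption: it also rules out legitimate IPv6 literals.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of hosts the application may redirect to.
ALLOWED_HOSTS = frozenset({"benign.com"})


def is_safe_redirect(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for http(s) URLs whose host is on the allow-list."""
    try:
        parts = urlsplit(url)
        hostname = parts.hostname
    except ValueError:
        # Some Python versions reject malformed bracketed hosts outright.
        return False
    # Refuse bracket and backslash tricks instead of trusting the parser.
    if any(ch in parts.netloc for ch in "[]\\"):
        return False
    return parts.scheme in ("http", "https") and hostname in allowed_hosts
```

With this check, "http://benign.com/path" is accepted, while "http://benign.com[attacker.com]" and "http://attacker.com" are refused whether or not the parser itself raises an error.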