New submission from STINNER Victor <vstin...@python.org>:

David Schütz reported the following urllib vulnerability to the PSRT on 2020-03-29.
He wrote an article about a similar vulnerability in Closure (JavaScript):
https://bugs.xdavidhu.me/google/2020/03/08/the-unexpected-google-wide-domain-check-bypass/

David was able to bypass a wildcard domain check in Closure by using the "\" character in the URL, like this:

https://xdavidhu.me\test.corp.google.com

Example in Python:

>>> from urllib.parse import urlparse
>>> urlparse("https://xdavidhu.me\\test.corp.google.com")
ParseResult(scheme='https', netloc='xdavidhu.me\\test.corp.google.com', path='', params='', query='', fragment='')

urlparse() currently accepts "\" in the netloc. This could cause problems for applications that rely on server-side checks to validate a URL's authority.

The problem arises because the RFC and the WHATWG specifications differ, and the RFC does not mention "\":

* RFC: https://tools.ietf.org/html/rfc3986#appendix-B
* WHATWG: https://url.spec.whatwg.org/#relative-state

This specification difference might cause issues: as David understands it, the parser is implemented according to the RFC, but the WHATWG spec is what browsers use, and browsers will mainly be the ones opening the URL.

----------
components: Library (Lib)
messages: 366832
nosy: vstinner
priority: normal
severity: normal
status: open
title: [Security] urllib and anti-slash (\) in the hostname
type: security
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40338>
_______________________________________
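
To illustrate the server-side check the report is concerned about, here is a minimal sketch of a wildcard domain check that refuses any URL whose netloc contains "\". The helper name is_allowed_host and its suffix-matching policy are hypothetical, chosen only for illustration. They are not part of urllib and not a fix proposed in this issue.

```python
from urllib.parse import urlsplit


def is_allowed_host(url, allowed_domain):
    """Hypothetical helper: True if the URL's host is allowed_domain or a subdomain of it."""
    parts = urlsplit(url)
    # A parser following the WHATWG spec treats "\" like "/", so a browser
    # would end the host at the backslash. Reject such URLs outright
    # instead of trusting the netloc as parsed here.
    if "\\" in parts.netloc:
        return False
    host = parts.hostname or ""
    return host == allowed_domain or host.endswith("." + allowed_domain)


# The URL from the report is rejected, while a genuine subdomain passes.
print(is_allowed_host("https://xdavidhu.me\\test.corp.google.com", "corp.google.com"))
print(is_allowed_host("https://test.corp.google.com/path", "corp.google.com"))
```

Without the backslash test, a plain suffix comparison on the parsed host would accept the first URL, because the netloc shown in the report ends in "test.corp.google.com".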