New submission from STINNER Victor <vstin...@python.org>:

David Schütz reported the following urllib vulnerability to the PSRT on 
2020-03-29.

He wrote an article about a similar vulnerability in Closure (JavaScript):
https://bugs.xdavidhu.me/google/2020/03/08/the-unexpected-google-wide-domain-check-bypass/

David was able to bypass a wildcard domain check in Closure by using the "\" 
character in the URL like this:

  https://xdavidhu.me\test.corp.google.com

Example in Python:

>>> from urllib.parse import urlparse
>>> urlparse("https://xdavidhu.me\\test.corp.google.com")
ParseResult(scheme='https', netloc='xdavidhu.me\\test.corp.google.com', 
path='', params='', query='', fragment='')

urlparse() currently accepts "\" in the netloc.

This could present issues if applications use server-side checks to 
validate a URL's authority.
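
For illustration, here is a minimal sketch of such a server-side check. The 
helper name naive_is_trusted and the trusted suffix are hypothetical and are 
not taken from David's report; they only mirror the example URL above.

```python
from urllib.parse import urlparse

# Hypothetical wildcard suffix, chosen to match the example URL above.
TRUSTED_SUFFIX = ".corp.google.com"


def naive_is_trusted(url):
    """Hypothetical check: trust any URL whose netloc ends with the suffix."""
    netloc = urlparse(url).netloc
    return netloc.endswith(TRUSTED_SUFFIX)


# urlparse() keeps the backslash inside netloc, so the suffix test passes,
# whereas a WHATWG-style parser would treat xdavidhu.me as the host.
print(naive_is_trusted("https://xdavidhu.me\\test.corp.google.com"))
```

With a "/" in place of the "\", the same check fails, because the netloc then 
ends at xdavidhu.me.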

The problem emerges from the fact that the RFC and the WHATWG specifications 
differ, and the RFC does not mention the "\":

* RFC: https://tools.ietf.org/html/rfc3986#appendix-B
* WHATWG: https://url.spec.whatwg.org/#relative-state

This difference between the specifications might cause issues: David 
understands that the parser follows the RFC, but browsers, which will mainly 
be the ones opening the URL, follow the WHATWG spec.
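
Until the parser behaviour is settled, an application could guard against the 
mismatch itself. The following is only a sketch of one possible approach, 
assuming it is acceptable to reject every URL that contains a "\"; the helper 
name strict_is_trusted and the trusted suffix are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical wildcard suffix, used only for this illustration.
TRUSTED_SUFFIX = ".corp.google.com"


def strict_is_trusted(url):
    """Hypothetical check that refuses URLs containing a backslash."""
    # WHATWG parsers treat "\" like "/" for special schemes such as https,
    # so a backslash makes the RFC-style and browser-style hosts disagree.
    if "\\" in url:
        return False
    host = urlsplit(url).hostname
    return host is not None and host.endswith(TRUSTED_SUFFIX)


for candidate in (
    "https://xdavidhu.me\\test.corp.google.com",
    "https://test.corp.google.com/path",
):
    print(candidate, strict_is_trusted(candidate))
```

Rejecting the character outright is stricter than either specification 
requires, but it avoids having to guess which parser will open the URL.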

----------
components: Library (Lib)
messages: 366832
nosy: vstinner
priority: normal
severity: normal
status: open
title: [Security] urllib and anti-slash (\) in the hostname
type: security
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40338>
_______________________________________