Steven D'Aprano <[email protected]> added the comment:

> The “urllib.parse” module generally follows RFC 3986, which does not
> allow a literal backslash in the “userinfo” part:
And yet the urlparse() function seems to allow arbitrary unescaped characters. This is from 3.8.0a0:

py> from urllib.parse import urlparse
py> urlparse(r'http://spam\eggs!cheese&[email protected]').netloc
'spam\\eggs!cheese&[email protected]'
py> urlparse(r'http://spam\eggs!cheese&[email protected]').hostname
'evil.com'

If that's a bug, it is a separate bug to this issue.

Backslash doesn't seem relevant to the security issue of userinfo being used to mislead:

py> urlparse('http://[email protected]').netloc
'[email protected]'
py> urlparse('http://[email protected]').hostname
'evil.com'

If it is relevant, can somebody explain to me how?

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue35748>
_______________________________________
