I have been looking into CVE-2019-9636 and I'm not sure that Python code that works with bytes is vulnerable to it.
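For anyone who has not followed the CVE, here is a minimal sketch of the normalization step it depends on. The host name is made up for illustration, and the sketch only shows what NFKC does to the string and how the normalized string would split. It does not reproduce how any particular patched or unpatched urlsplit() treats the original unicode input.

```python
import unicodedata

try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit      # Python 2

# Hypothetical host containing U+FF03 FULLWIDTH NUMBER SIGN.
host = u"google.com\uff03bing.com"

# NFKC folds the fullwidth sign into an ASCII '#'.
normalized = unicodedata.normalize("NFKC", host)
print(normalized)

# Once the '#' is ASCII, a URL parser treats it as the fragment delimiter,
# so the apparent host shrinks to the part before it.
parts = urlsplit(u"http://" + normalized + u"/fred")
print(parts.netloc)
print(parts.fragment)
```

The point is that the host seen before normalization and the host seen after it are different, which is what a check on netloc followed by a later IDNA encode could trip over.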
The "trick" that makes the CVE dangerous assumes that you have a unicode string containing \uff03 (FULLWIDTH NUMBER SIGN), which under NFKC turns into '#'.

In the discussion at https://bugs.python.org/issue36216, all the explanations start with a unicode string.

What I'm interested in is what happens if you get the URL as part of an HTML page or by some other mechanism.

To that end I made a URL that, if the vulnerability is triggered, will change which search engine is visited:

    b'http://google.xn--combing-xr93b.com/fred'

When I use urlsplit() I get this:

    print( urlparse.urlsplit('http://google.xn--combing-xr93b.com/fred') )
    SplitResult(scheme='http', netloc='google.xn--combing-xr93b.com', path='/fred', query='', fragment='')

The netloc is still IDNA encoded, so the "trick" did not trigger.

If code then uses that netloc, it's going to fail to return anything, as no domain name registrar should have registered a name with an illegal \uff03 in it.

Also, this raises an exception:

    'google.xn--combing-xr93b.com'.decode('idna')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib64/python2.7/encodings/idna.py", line 193, in decode
        result.append(ToUnicode(label))
      File "/usr/lib64/python2.7/encodings/idna.py", line 139, in ToUnicode
        raise UnicodeError("IDNA does not round-trip", label, label2)
    UnicodeError: ('IDNA does not round-trip', 'xn--combing-xr93b', 'com#bing')

The conclusion I reached is that the CVE only applies to client code that allows a URL to be entered as unicode.

Have I missed something important in the analysis?

Barry

--
https://mail.python.org/mailman/listinfo/python-list