New submission from Chihiro Ito <hokou...@sourcewalker.com>:

urllib.parse.urlsplit raises an exception for a URL that includes a non-ASCII 
hostname in NFKD form and a port number.

example:
>>> from urllib.parse import urlsplit
>>> urlsplit('http://\u30d5\u309a:80')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ito/.maltybrew/deen/lib/python3.7/urllib/parse.py", line 437, in urlsplit
    _checknetloc(netloc)
  File "/Users/ito/.maltybrew/deen/lib/python3.7/urllib/parse.py", line 407, in _checknetloc
    "characters under NFKC normalization")
ValueError: netloc 'プ:80' contains invalid characters under NFKC normalization
>>> urlsplit('http://\u30d5\u309a')
SplitResult(scheme='http', netloc='プ', path='', query='', fragment='')
>>> import unicodedata
>>> urlsplit(unicodedata.normalize('NFKC', 'http://\u30d5\u309a:80'))
SplitResult(scheme='http', netloc='プ:80', path='', query='', fragment='')

I believe this behavior was introduced in Python 3.7.3. Python 3.7.2 doesn't 
raise any exception for these lines.
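
The following is only a rough sketch of why the three calls above might behave 
differently. It is not the code in urllib/parse.py. It assumes, based on the 
error message alone, that the check normalizes the netloc with NFKC and rejects 
it when the normalized form differs from the input and contains a URL 
delimiter. The function name and the delimiter set are hypothetical.

```python
import unicodedata

# Assumed set of URL delimiters, chosen for illustration only.
DELIMITERS = "/?#@:"

def flagged_by_naive_check(netloc):
    """Hypothetical stand-in for the check named in the traceback."""
    normalized = unicodedata.normalize("NFKC", netloc)
    if normalized == netloc:
        # Already in NFKC form: nothing new can appear, so accept.
        return False
    # The port separator ':' was already in the input, yet it still
    # trips a check that only inspects the normalized string.
    return any(c in normalized for c in DELIMITERS)

nfkd_host = "\u30d5\u309a"  # decomposed form that NFKC composes to U+30D7

print(flagged_by_naive_check(nfkd_host + ":80"))
print(flagged_by_naive_check(nfkd_host))
print(flagged_by_naive_check(unicodedata.normalize("NFKC", nfkd_host + ":80")))
```

Under that assumption, only the first call is flagged, which matches the 
session above: the ':' that triggers the error comes from the original input, 
not from normalization.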

----------
components: Unicode
messages: 340983
nosy: ezio.melotti, hokousya, vstinner
priority: normal
severity: normal
status: open
title: urlsplit doesn't accept a NFKD hostname with a port number
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36742>
_______________________________________