Martin Panter added the comment:

In general, HTTP URLs are supposed to be ASCII only. Newer protocols (e.g. RTSP, which is based on HTTP) specifically allow UTF-8 encoding, but it would be wrong for Python’s HTTP library to assume UTF-8 is wanted everywhere. This applies especially to a domain name (e.g. in the full-URL request to a proxy), which should not be UTF-8 encoded.
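To illustrate why one encoding cannot be applied to the whole URL, here is a minimal sketch of converting a Unicode URL to an ASCII-only one before it reaches the HTTP layer. The helper name `iri_to_uri` is hypothetical and not part of the standard library. The sketch assumes the host should go through Python's built-in "idna" codec (which implements the older IDNA 2003 rules) and that the path, query and fragment should be percent-encoded as UTF-8. It drops any `user:password@` part of the URL for brevity.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding: existing "%" escapes
# plus the reserved characters allowed in paths, queries and fragments.
_SAFE = "/?%:@!$&'()*+,;="


def iri_to_uri(iri):
    """Hypothetical helper: return an ASCII-only form of a Unicode URL."""
    parts = urlsplit(iri)

    # Host: encode with IDNA (Punycode), not UTF-8 percent-encoding.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"

    # Path, query, fragment: percent-encode the UTF-8 bytes of any
    # non-ASCII character. quote() uses UTF-8 by default.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


uri = iri_to_uri("http://bücher.example/straße?q=ü")
print(uri)  # http://xn--bcher-kva.example/stra%C3%9Fe?q=%C3%BC
```

The result can be encoded as ASCII without error, so a lower-level library that only accepts ASCII request targets never has to guess which encoding the caller wanted.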
I suggest working on handling IRIs (<https://tools.ietf.org/html/rfc3987>, basically Unicode URLs) in higher-level places like “urllib”. See Issue 3991.

----------
nosy: +martin.panter
resolution: -> rejected
status: open -> closed
superseder: -> urllib.request.urlopen does not handle non-ASCII characters
title: encoding to ascii in client.py -> encoding to ascii in http/client.py
type: compile error -> enhancement

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29305>
_______________________________________