New submission from Toshio Kuratomi <[EMAIL PROTECTED]>:

Tested on python-3.0rc1 -- Linux Fedora 9

I wanted to make sure that python3.0 would handle URLs in different encodings. So I created two files on an apache server, both named ½ñ.html. One of the filenames was encoded in utf-8 and the other in latin-1. Then I tried the following::

    >>> from urllib.request import urlopen
    >>> url = 'http://localhost/u/½ñ.html'
    >>> urlopen(url.encode('utf-8')).read()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.0/urllib/request.py", line 122, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python3.0/urllib/request.py", line 350, in open
        req.timeout = timeout
    AttributeError: 'bytes' object has no attribute 'timeout'

The same thing happens if I give None for the two optional arguments (data and timeout).

Next I tried passing the Unicode string directly::

    >>> urlopen(url).read()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.0/urllib/request.py", line 122, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python3.0/urllib/request.py", line 359, in open
        response = self._open(req, data)
      File "/usr/lib/python3.0/urllib/request.py", line 377, in _open
        '_open', req)
      File "/usr/lib/python3.0/urllib/request.py", line 337, in _call_chain
        result = func(*args)
      File "/usr/lib/python3.0/urllib/request.py", line 1082, in http_open
        return self.do_open(http.client.HTTPConnection, req)
      File "/usr/lib/python3.0/urllib/request.py", line 1068, in do_open
        h.request(req.get_method(), req.get_selector(), req.data, headers)
      File "/usr/lib/python3.0/http/client.py", line 843, in request
        self._send_request(method, url, body, headers)
      File "/usr/lib/python3.0/http/client.py", line 860, in _send_request
        self.putrequest(method, url, **skips)
      File "/usr/lib/python3.0/http/client.py", line 751, in putrequest
        self._output(request.encode('ascii'))
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)

So, in python-3.0rc1, urlopen fails on non-ASCII URLs whether they are passed as bytes or as str.

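A possible workaround, sketched here rather than tested against the server above, is to percent-encode the path before calling urlopen so that only ASCII reaches http.client. The helper name percent_encode_url is hypothetical, and the sketch assumes a current Python 3 in which urllib.parse.quote accepts an encoding argument and that the path has not already been percent-escaped.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_url(url, encoding="utf-8"):
    """Hypothetical helper: percent-encode the path of url using encoding."""
    parts = urlsplit(url)
    # quote() encodes the path with the chosen codec and escapes every
    # byte outside the safe set as %XX; "/" is kept as the separator.
    path = quote(parts.path, safe="/", encoding=encoding)
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


# Same URL as in the report, written with escapes: \u00bd is "½", \u00f1 is "ñ".
url = "http://localhost/u/\u00bd\u00f1.html"

utf8_url = percent_encode_url(url, "utf-8")
latin1_url = percent_encode_url(url, "latin-1")

print(utf8_url)
print(latin1_url)
```

Each result is a plain ASCII str, so it can be handed to urlopen without hitting the encode('ascii') step in putrequest. The two codecs give different escaped paths, which is how the utf-8 and latin-1 filenames on the server would be addressed separately.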
----------
components: Unicode
messages: 73982
nosy: a.badger
severity: normal
status: open
title: urllib.request.urlopen does not handle non-ASCII characters
versions: Python 3.0

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3991>
_______________________________________