New submission from Tomas Groth:

Running this simple test script produces the traceback shown below.
import urllib.request
page = urllib.request.urlopen('http://legacy.biblegateway.com/versions/?vid=DN1933&action=getVersionInfo#books')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 461, in open
    response = meth(req, response)
  File "/usr/lib/python3.4/urllib/request.py", line 571, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.4/urllib/request.py", line 493, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 676, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 455, in open
    response = self._open(req, data)
  File "/usr/lib/python3.4/urllib/request.py", line 473, in _open
    '_open', req)
  File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 1258, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/lib/python3.4/urllib/request.py", line 1232, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "/usr/lib/python3.4/http/client.py", line 1065, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.4/http/client.py", line 1093, in _send_request
    self.putrequest(method, url, **skips)
  File "/usr/lib/python3.4/http/client.py", line 957, in putrequest
    self._output(request.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 31-32: ordinal not in range(128)

Using curl, we can see that the server redirects to a URL containing a non-ASCII character:

$ curl -vs "http://legacy.biblegateway.com/versions/?vid=DN1933&action=getVersionInfo#books" >DN1933
* Hostname was NOT found in DNS cache
* Trying 23.23.93.211...
* Connected to legacy.biblegateway.com (23.23.93.211) port 80 (#0)
> GET /versions/?vid=DN1933&action=getVersionInfo HTTP/1.1
> User-Agent: curl/7.35.0
> Host: legacy.biblegateway.com
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
* Server nginx/1.4.7 is not blacklisted
< Server: nginx/1.4.7
< Date: Fri, 22 Aug 2014 08:35:30 GMT
< Content-Type: text/html; charset=UTF-8
< Content-Length: 0
< Connection: keep-alive
< X-Powered-By: PHP/5.5.7
< Set-Cookie: bg_id=1b9a80d5e6d545487cfd153d6df65c4e; path=/; domain=.biblegateway.com
< Set-Cookie: a9gl=0; path=/; domain=.biblegateway.com
< Location: http://legacy.biblegateway.com/versions/Dette-er-Biblen-på-dansk-1933/
<
* Connection #0 to host legacy.biblegateway.com left intact

When the redirect URL contains only ASCII characters, everything works as expected, as with this URL:
"http://legacy.biblegateway.com/versions/?vid=DNB1930&action=getVersionInfo#books"

----------
components: Library (Lib)
messages: 225651
nosy: tomasgroth
priority: normal
severity: normal
status: open
title: urllib.request.urlopen raises exception when 30X-redirect url contains non-ascii chars
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22248>
_______________________________________
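
One possible workaround for the report above is to percent-encode the Location value before urllib re-requests it. The following is only a minimal sketch, not presented as the library's own fix. The names quote_location and QuotingRedirectHandler are hypothetical. It assumes the header value reaches the redirect handler decoded as ISO-8859-1 (the codec http.client uses for headers), so that encoding with the same codec recovers the original bytes; a string containing characters above U+00FF would make the helper raise UnicodeEncodeError.

```python
import string
import urllib.parse
import urllib.request


def quote_location(url):
    """Percent-encode the non-ASCII characters of a redirect URL.

    Assumption: `url` was decoded from the raw header bytes as ISO-8859-1,
    so encoding it with the same codec yields the bytes the server sent.
    ASCII punctuation (including '%') is left untouched, so an already
    percent-encoded URL is not encoded a second time.
    """
    return urllib.parse.quote(url, safe=string.punctuation,
                              encoding="iso-8859-1")


class QuotingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler that quotes the target before following it."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Delegate to the stock implementation with an ASCII-only URL.
        return super().redirect_request(
            req, fp, code, msg, headers, quote_location(newurl))


# build_opener() replaces the default redirect handler with the subclass.
opener = urllib.request.build_opener(QuotingRedirectHandler)
# page = opener.open('http://legacy.biblegateway.com/versions/'
#                    '?vid=DN1933&action=getVersionInfo#books')
```

With this opener, the request that follows the redirect would use the path /versions/Dette-er-Biblen-p%C3%A5-dansk-1933/, which encodes to ASCII without error. Whether the server accepts the percent-encoded form has not been verified here.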