Charles-Francois Natali <neolo...@free.fr> added the comment:

Alright, what happens is the following:
- the file you're trying to retrieve is actually redirected, so the server sends an HTTP/1.X 302 Moved Temporarily
- in urllib, when we get a redirection, we call redirect_internal:

    def redirect_internal(self, url, fp, errcode, errmsg, headers, data):
        if 'location' in headers:
            newurl = headers['location']
        elif 'uri' in headers:
            newurl = headers['uri']
        else:
            return
        void = fp.read()
        fp.close()
        # In case the server sent a relative URL, join with original:
        newurl = basejoin(self.type + ":" + url, newurl)
        return self.open(newurl)

The fp.read() is there to wait for the remote end to close the connection.

The problem, in this case, is that with Python 3.1, httplib uses HTTP/1.1, whereas version 2.6 used HTTP/1.0, and with HTTP/1.1 the server doesn't close the connection after sending the redirect (as shown by tcpdump). So the process remains stuck on fp.read().

Now, in version 3.1, if we simply change Lib/http/client.py:628 from

    class HTTPConnection:

        _http_vsn = 11
        _http_vsn_str = 'HTTP/1.1'

to

    class HTTPConnection:

        _http_vsn = 11
        _http_vsn_str = 'HTTP/1.0'

to use HTTP/1.0 instead, the retrieval works fine. Obviously, this is not a good solution.

Since the RFC doesn't seem to require the server to close the connection after sending a redirect, we'd probably better close the connection ourselves. That's what the attached patch does: it simply removes the call to fp.read() before closing the connection. It also removes that call from http_error_default, since if an error occurs, we probably want to close the connection as soon as possible instead of waiting for the server to do so.

----------
keywords: +patch
nosy: +neologix
Added file: http://bugs.python.org/file16758/urllib_redirect.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8035>
_______________________________________
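
For illustration only, the idea described in the comment above (close the response on a redirect instead of draining it first) can be sketched roughly as below. This is not the attached patch. The names FakeResponse and redirect_without_drain are hypothetical, and urllib.parse.urljoin is used here in place of basejoin so that the snippet runs on its own.

```python
from urllib.parse import urljoin


class FakeResponse:
    """Hypothetical stand-in for the response object passed as fp.

    It only records which methods were called, so the sketch can be
    exercised without opening a network connection.
    """

    def __init__(self):
        self.read_called = False
        self.closed = False

    def read(self):
        # On a real keep-alive connection this is the call that can block
        # when the server never closes its end.
        self.read_called = True
        return b""

    def close(self):
        self.closed = True


def redirect_without_drain(base_url, fp, headers):
    """Return the redirect target, closing fp without reading it first.

    Assumption: headers behaves like a dict with lower-case keys.
    """
    if 'location' in headers:
        newurl = headers['location']
    elif 'uri' in headers:
        newurl = headers['uri']
    else:
        return None
    # Close right away rather than waiting for the remote end to close.
    fp.close()
    # In case the server sent a relative URL, join with the original.
    return urljoin(base_url, newurl)
```

Called with a base URL of "http://example.com/a/b" and a 'location' header of "/c", the function closes the response and returns the joined absolute URL without ever calling read().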