Hello,
I've recently been playing around with urllib.FancyURLopener and noticed
that under certain conditions it can block after calling open() on a
URL. It only happens with specific servers and only when the "Range" HTTP
header is in use. The server doesn't close the connection, and
redirect_internal gets stuck trying to read the response body (it blocks
at the line containing "void = fp.read()").
Here is a simple example that demonstrates this problem:
#!/usr/bin/env python
import urllib
# A url which causes a 302 on the server.
url = 'http://chkpt.zdnet.com/chkpt/1pcast.bole/http://podcast'
url += '-files.cnet.com/podcast/cnet_buzzoutloud_060209.mp3'
d = urllib.FancyURLopener()
# This header causes this particular server (most servers behave
# normally) to keep the connection open for whatever reason...
d.addheader('Range', 'bytes=100-')
# The program will block here as we wait for redirect_internal to
# do its "void = fp.read()", but since the server doesn't close the
# connection we end up waiting for the connection to time out.
d.open(url)
To work around this, I subclass FancyURLopener and define my own version
of redirect_internal that has the "void = fp.read()" line commented out.
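For reference, the subclass looks roughly like this (the class name is
mine, and the method body is a copy of redirect_internal from the 2.x
urllib source as I read it, with only the read() left out):

import urllib

class NoDrainOpener(urllib.FancyURLopener):
    # Same as FancyURLopener.redirect_internal, minus the
    # "void = fp.read()" that drains the 302 body and blocks when the
    # server keeps the connection open.
    def redirect_internal(self, url, fp, errcode, errmsg, headers, data):
        if 'location' in headers:
            newurl = headers['location']
        elif 'uri' in headers:
            newurl = headers['uri']
        else:
            return
        fp.close()
        # Resolve relative redirects against the original URL.
        newurl = urllib.basejoin(self.type + ":" + url, newurl)
        return self.open(newurl)

Using NoDrainOpener() in place of FancyURLopener() in the example above,
the open() call returns without waiting for the server to close the
connection.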
What I'd like to know is: what's the point of doing the read() if the
result is never used? Is this a bug in urllib, or am I simply doing
something wrong?
Thanks,
nick