Wells Oliver wrote:
Writing a class which essentially spiders a site and saves the files locally. On a URLError exception, it sleeps for a second and tries again (on 404 it just moves on). The relevant bit of code, including the offending method:

class Handler(threading.Thread):
        def __init__(self, url):
                threading.Thread.__init__(self)
                self.url = url

        def save(self, uri, location):
                try:
                        handler = urllib2.urlopen(uri)
                except urllib2.HTTPError, e:
                        if e.code == 404:
                                return
                        else:
                                print "retrying %s (HTTPError)" % uri
                                time.sleep(1)
                                self.save(uri, location)
                except urllib2.URLError, e:
                        print "retrying %s" % uri
                        time.sleep(1)
                        self.save(uri, location)

                if not os.path.exists(os.path.dirname(location)):
                        os.makedirs(os.path.dirname(location))

                file = open(location, "w")
                file.write(handler.read())
                file.close()

...

But what I am seeing is that after a retry (on catching a URLError exception), I get lots of "UnboundLocalError: local variable 'handler' referenced before assignment" errors on line 38, which is the "file.write(handler.read())" line.

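Your code binds the name handler only if urllib2.urlopen succeeds. When it raises, the except branch calls self.save() again, and once that retry returns, the original call carries on to the file-writing code, where handler was never assigned. Accessing it there unconditionally fails, of course.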

You need to put the file stuff right after the urlopen, inside the try block, so that it only runs when the fetch actually worked.
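
To see the effect in isolation, here is a minimal sketch without any network access. FetchError, fetch, save_broken and save_fixed are made-up names for illustration only: fetch fails on its first call to stand in for urlopen raising URLError, and appending to a list stands in for writing the file.

```python
class FetchError(Exception):
    """Hypothetical stand-in for urllib2.URLError."""


attempts = {"count": 0}


def fetch(uri):
    """Stand-in for urllib2.urlopen: fails on the first call only."""
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise FetchError(uri)
    return "content of %s" % uri


def save_broken(uri, saved):
    try:
        handler = fetch(uri)
    except FetchError:
        # The retry binds 'handler' in its own frame, not in this one.
        save_broken(uri, saved)
    # After a retry, the outer call still reaches this line with
    # 'handler' unbound, which raises UnboundLocalError.
    saved.append(handler)


def save_fixed(uri, saved):
    try:
        handler = fetch(uri)
        # 'handler' is used only on the path where it was assigned.
        saved.append(handler)
    except FetchError:
        save_fixed(uri, saved)
```

Note that in save_broken the retry itself succeeds and stores the content; the error comes from the outer call afterwards, which matches what you are seeing.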

Also note that Python has no tail-call optimization, so every retry adds another stack frame, and with enough consecutive errors your method will exhaust the stack.
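
A tiny sketch of that failure mode, assuming a fetch that never succeeds. The function name retry is made up for illustration, and the sleep is left out so that it fails quickly.

```python
def retry(uri, attempt=0):
    # Stand-in for a fetch that always fails: each retry is a new
    # stack frame, and Python never reuses the caller's frame.
    return retry(uri, attempt + 1)


try:
    retry("http://example.invalid/")
    exhausted = False
except RuntimeError:
    # Python 3 raises RecursionError here, a subclass of RuntimeError.
    exhausted = True
```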

You should consider writing it as a while loop instead, breaking out of it once the page has been fetched.
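
Something along these lines might do, written as a plain function rather than a method to keep it short. This is only a sketch: the opener, delay and max_tries parameters and the True/False return value are my additions, not part of your code, and the file is opened in binary mode because read() returns raw bytes.

```python
import os
import time

try:                     # Python 2, as in the original post
    from urllib2 import urlopen, HTTPError, URLError
except ImportError:      # Python 3 moved these names
    from urllib.request import urlopen
    from urllib.error import HTTPError, URLError


def save(uri, location, opener=urlopen, delay=1, max_tries=None):
    """Fetch uri and write it to location, retrying in a loop.

    opener, delay and max_tries are illustrative parameters.
    Returns True if the file was written, False if the page was
    skipped (404) or the retries ran out.
    """
    tries = 0
    while True:
        tries += 1
        try:
            data = opener(uri).read()
            break                      # fetched, leave the retry loop
        except HTTPError as e:         # subclass of URLError, so it goes first
            if e.code == 404:
                return False
            print("retrying %s (HTTPError)" % uri)
        except URLError:
            print("retrying %s" % uri)
        if max_tries is not None and tries >= max_tries:
            return False
        time.sleep(delay)

    # Only reached after a successful fetch, so 'data' is always bound.
    directory = os.path.dirname(location)
    if directory and not os.path.exists(directory):
        os.makedirs(directory)

    with open(location, "wb") as f:
        f.write(data)
    return True
```

With max_tries left at None this retries forever, like your original, but without growing the stack.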

Diez
--
http://mail.python.org/mailman/listinfo/python-list
