I'm still working through Chun's "Core Python Applications
Programming". I got the web crawler (Example 9-2) working after I
found a ':' I had mistyped. Now I'm trying to convert it into a
program that checks for broken links, which is not in the book. The
problem I'm stuck on is how to tell whether a link is working.
I've written an example that I hope illustrates my problem:
#!/usr/bin/env python
import urllib2

sites = ('http://www.catb.org', 'http://ons-sa.org', 'www.notasite.org')

for site in sites:
    try:
        page = urllib2.urlopen(site)
        print page.geturl(), "didn't return error on open"
        # A KeyError here (no Server header) would also land in the except
        print 'Reported server is', page.info()['Server']
    except:
        print site, 'generated an error on open'
        continue    # nothing was opened, so there is nothing to close
    try:
        page.close()
        print site, 'successfully closed'
    except:
        print site, 'generated error on close'
Site 1 is alive; the other two are dead. Yet this code only reports an
error for site three (I assume because it has no 'http://' scheme for
urllib2 to work with). Notice that I print page.geturl() after a
successful open, which I thought would reveal a redirection, and that
didn't help with site two.
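For instance, I wondered whether catching urllib2's specific
exceptions and looking at the HTTP status code would be more reliable
than my bare excepts. Here is an untested sketch of what I mean (the
check() function is just for illustration, and I believe getcode()
needs Python 2.6 or newer):

import urllib2

def check(site):
    try:
        page = urllib2.urlopen(site)
    except urllib2.HTTPError as e:
        # The server answered, but with an error status (404, 500, ...)
        print site, 'returned HTTP status', e.code
        return
    except urllib2.URLError as e:
        # No server answered at all (DNS failure, connection refused, ...)
        print site, 'failed to open:', e.reason
        return
    print site, 'returned status', page.getcode()
    if page.geturl() != site:
        print site, 'was redirected to', page.geturl()
    page.close()

But if a dead domain has been parked and answers with a 200 page, I
don't see how this would catch it either.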
Is there an unambiguous way to determine if a link has died -- knowing
nothing about the link in advance?
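One other idea I had, again untested: check whether the host name even
resolves before attempting HTTP, using the standard urlparse and
socket modules:

import socket
import urlparse

def host_resolves(url):
    # urlparse gives hostname None when the URL has no scheme,
    # as with my 'www.notasite.org' example above
    host = urlparse.urlparse(url).hostname
    if host is None:
        return False
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

Though I suspect a registrar's wildcard DNS could still make a dead
domain look alive, which may be exactly what's happening with site
two.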
Ed