I have been working my way through Chun's book Core Python Applications.
In chapter 9 he has a web crawler program that essentially copies all
the files from a web site by finding and downloading the links on that
domain.
One of the classes has a method definition, and I'm having trouble
finding documentation for the functions it calls. The code is:

def parse_links(self):
    'Parse out the links found in downloaded HTML file'
    f = open(self.file, 'r')
    data = f.read()
    f.close()
    parser = HTMLParser(formatter.AbstractFormatter(
        formatter.DumbWriter(cStringIO.StringIO())))
    parser.feed(data)
    parser.close()
    return parser.anchorlist

HTMLParser is from htmllib.
I'm having trouble finding clear documentation for what the calls
on the 'parser =' line do and return. The three modules (htmllib,
formatter, and cStringIO) are all imported, but I can't seem to
find much info on how they work and what they do. What that line
actually does and what it produces is completely obscure to me.
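
For what it's worth, my working guess (which may well be wrong) is that
the formatter/writer/StringIO chain is only there because htmllib's
HTMLParser requires a formatter argument, and that the one thing the
method actually uses is parser.anchorlist, the list of href values
collected from <a> tags. Below is a small sketch of that guess. It is
NOT the book's code: it swaps in Python 3's html.parser, since htmllib,
formatter, and cStringIO are Python 2 modules, and the class name
AnchorCollector and the sample HTML are my own inventions.

```python
from html.parser import HTMLParser  # Python 3 stand-in for htmllib.HTMLParser


class AnchorCollector(HTMLParser):
    """Hypothetical helper (not from the book): gathers href values from <a> tags."""

    def __init__(self):
        super().__init__()
        # Same attribute name that the book's parse_links() returns.
        self.anchorlist = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names and passes attrs as (name, value) pairs.
        if tag != 'a':
            return
        for name, value in attrs:
            if name == 'href' and value:
                self.anchorlist.append(value)


def parse_links(data):
    """Feed an HTML string to the parser and return the links it saw."""
    parser = AnchorCollector()
    parser.feed(data)
    parser.close()
    return parser.anchorlist


# Made-up sample page: two real links and one anchor without an href.
sample = ('<html><body><a href="/a.html">A</a> <a name="x">no link</a> '
          '<a href="http://example.com/b">B</a></body></html>')
print(parse_links(sample))  # prints ['/a.html', 'http://example.com/b']
```

If that guess is right, nothing the formatter writes into the StringIO
buffer is ever read back, and only the list of links matters.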
Any help would be appreciated. Any links to clear documentation and
examples?
_______________________________________________
Tutor maillist - Tutor@python.org