On 07/01/13 14:48, Ed Owens wrote: [...]
parser = HTMLParser(formatter.AbstractFormatter( formatter.DumbWriter(cStringIO.StringIO())))
HTMLParser is from htmllib. I'm having trouble finding clear documentation for what the functions on the 'parser =' line do and return. The three modules (htmllib, formatter, & cStringIO) are all imported, but I can't seem to find much info on how they work and what they do. What this actually does and what it produces is completely obscure to me. Any help would be appreciated. Any links to clear documentation and examples?
I'll start with the easiest: cStringIO, and its slower cousin StringIO, are modules for creating fake in-memory file-like objects. Basically, they create an object that holds a string in memory but behaves as if it were a file object.

http://docs.python.org/2/library/stringio.html

The formatter module is used to produce an object that you can use for creating formatted text. It's quite abstract, and to be honest I have never used it and don't know how it works.

http://docs.python.org/2/library/formatter.html

The htmllib module is a library for parsing HTML code. Essentially, you use it to read the content of webpages (.html files).

http://docs.python.org/2/library/htmllib.html

Unfortunately, there is not a lot of documentation for the htmllib.HTMLParser object, so I can't help you with that.

-- Steven
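
P.S. To make the "in-memory file" idea concrete, here is a minimal sketch of a StringIO object being written to and read back like a file. The variable names and sample strings are made up for illustration, and the fallback to io.StringIO is my assumption for anyone running Python 3, where cStringIO no longer exists. It covers only the cStringIO part of the quoted line, not formatter or htmllib.

```python
try:
    from cStringIO import StringIO  # Python 2, as in the quoted code
except ImportError:
    from io import StringIO  # assumed Python 3 equivalent

# Create an empty in-memory "file" and write to it as you would
# to a file opened with open(..., "w").
buf = StringIO()
buf.write("first line\n")
buf.write("second line\n")

# getvalue() returns everything written so far as one string.
contents = buf.getvalue()

# The object also supports seeking and reading, like a real file.
buf.seek(0)
first = buf.readline()
rest = buf.read()

print(repr(contents))
print(repr(first))
print(repr(rest))
```

In the quoted 'parser =' line, the StringIO object is simply what gets passed to DumbWriter in place of a real file.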