Thorsten Kampe wrote:
> For simple things like that "BeautifulSoup" might be overkill.
[HTMLParser example]

I've used SGMLParser with some success before, although the SAX-style processing is objectionable to many people. One alternative is to use libxml2dom [1] and to parse documents as HTML:

  import libxml2dom, urllib
  url = 'http://www.python.org'
  doc = libxml2dom.parse(urllib.urlopen(url), html=1)
  anchors = doc.xpath("//a")

Currently, the parseURI function in libxml2dom doesn't do HTML parsing, mostly because I haven't yet figured out what combination of parsing options has to be set to make it happen, but a combination of urllib and libxml2dom should perform adequately. In the above example, you'd process the nodes in the anchors list to get the desired results.

Paul

[1] http://www.python.org/pypi/libxml2dom
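
As a rough illustration of the "process the anchors to get the desired results" step, here is a minimal sketch that collects the href of every anchor. Note that it swaps in the standard library's HTMLParser (the SAX-style approach mentioned at the top, under its Python 3 module name html.parser) rather than libxml2dom, so that it runs without extra packages. The names AnchorCollector, extract_links and sample are hypothetical, chosen only for this example, and the HTML is an inline string rather than a fetched page.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href attribute of every <a> start tag that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                self.hrefs.append(value)


def extract_links(html_text):
    """Return the href values of all anchors in html_text, in document order."""
    parser = AnchorCollector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


# Hypothetical sample document; a real page could be fetched with urllib instead.
sample = (
    '<p><a href="http://www.python.org">Python</a> and '
    '<a name="top">no link here</a> and '
    '<A HREF="/pypi">PyPI</A></p>'
)
links = extract_links(sample)
```

With the libxml2dom approach, the corresponding step would be a loop over the anchors list reading each node's href attribute, assuming the nodes expose the usual DOM getAttribute method.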