On Feb 12, 9:20 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> HTML: htmllib and HTMLParser (both in the Python library),
> BeautifulSoup (again GIYF)
>
> XML: xml.* in the Python library. ElementTree (recommended) is
> included in Python 2.5; use xml.etree.cElementTree.
>
> The source of HTMLParser and xmllib uses regular expressions for
> parsing out the data. htmllib calls sgmllib at the beginning of its
> code--sgmllib starts off with a bunch of regular expressions used to
> parse data. So the only real difference there I see is that someone
> saved me the work of writing them ;0). I haven't looked at the source
> for Beautiful Soup, though I have the sneaking suspicion that most
> processing of html/xml is all based on regexes.

That's right. Those modules use regexes. You don't. You call functions
& classes in the modules. Someone has written those modules, tested
them and documented them, and they've had a fair old thrashing by quite
a few people over the years. That may be the only difference as you see
it, but it's quite a large difference from you opening up the re docs
and getting stuck in single-handedly :-)
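
To make "you call functions & classes in the modules" concrete, here is a rough sketch rather than anything definitive. It assumes a current Python 3, where the old HTMLParser module lives at html.parser, htmllib and sgmllib are gone, and xml.etree.ElementTree stands in for cElementTree. The names LinkCollector, extract_links and extract_titles are made up for illustration.

```python
from html.parser import HTMLParser  # Python 3 location of the old HTMLParser module
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser hands us lowercased tag and attribute names,
        # with attrs as a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href of every <a> tag, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


def extract_titles(xml_text):
    """Return the text of every <title> element in a well-formed XML string."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter("title")]


if __name__ == "__main__":
    page = '<p><A HREF="a.html">x</A> <a class=c href=\'b.html\'>y</a></p>'
    print(extract_links(page))

    doc = "<lib><book><title>A</title></book><book><title>B</title></book></lib>"
    print(extract_titles(doc))
```

Note what you never had to write: the patterns for upper-case tags, unquoted attributes, or single versus double quotes. Whatever regexes sit underneath stay inside the module.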