Re: Regular Expressions

John Machin Mon, 12 Feb 2007 03:21:13 -0800

On Feb 12, 9:20 pm, "[EMAIL PROTECTED]"
<[EMAIL PROTECTED]> wrote:
> HTML: htmllib and HTMLParser (both in the Python library),
> BeautifulSoup (again GIYF)
>
> XML: xml.* in the Python library. ElementTree (recommended) is
> included in Python 2.5; use xml.etree.cElementTree.
>
> The source of HTMLParser and xmllib uses regular expressions for
> parsing out the data. htmllib calls sgmllib at the beginning of its
> code; sgmllib starts off with a bunch of regular expressions used to
> parse data. So the only real difference I see there is that someone
> saved me the work of writing them ;0). I haven't looked at the source
> for Beautiful Soup, though I have the sneaking suspicion that most
> processing of HTML/XML is based on regexes.


That's right. Those modules use regexes. You don't. You call functions
& classes in the modules.

Someone has written those modules, tested them and documented them,
and they've had a fair old thrashing by quite a few people over the
years. That may be the only difference to your way of thinking, but
it's quite a large difference from you opening up the re docs and
getting stuck in single-handedly :-)
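
To make "call functions & classes in the modules" concrete, here is a
minimal sketch of pulling links out of HTML with the library parser
instead of a hand-written regex. Assumptions: it uses the Python 3
location of the module (html.parser; in the Python 2 of this thread it
was the top-level HTMLParser module), and the names LinkCollector,
extract_links and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser.

    Hypothetical helper for illustration; not part of the library.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser hands us lowercased tag and attribute names,
        # with attrs as a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href values found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Illustrative input: mixed case, single quotes and an unquoted attribute.
sample = (
    '<p>See <A HREF="http://example.com/a">one</a> and '
    "<a class=x href='/b'>two</a>.</p>"
)
print(extract_links(sample))
```

The tag case, quoting styles and attribute order are the library's
problem here, which is the kind of detail a single-handed regex tends
to get wrong.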

-- 
http://mail.python.org/mailman/listinfo/python-list
