On Wed, Jan 6, 2010 at 1:59 PM, Tim Chase <python.l...@tim.thechases.com> wrote:
> Victor Subervi wrote:
>
>> On Wed, Jan 6, 2010 at 1:27 PM, Tim Chase <python.l...@tim.thechases.com> wrote:
>>
>>> But if you're using it on HTML form text, regexps are usually the wrong
>>> tool, and you should be using an HTML parser (such as BeautifulSoup) that
>>> knows how to handle odd text and escapings better and more robustly than
>>> regexps will
>>
>> I have an automatically generated HTML form from which I need to extract
>> data to the script which this form calls (to which the information is
>> sent).
>> I believe BeautifulSoup is geared to scraping pages that exist permanently
>> on the web. By the time BeautifulSoup was called, this page would be gone.
>
> BeautifulSoup takes string data fed to it, and builds a structure that can
> be neatly navigated. That string data can come from a web page, from a
> disk, or even a serial port, a random-character-generator, or just from HTML
> that's built up in memory and never sees a network or a disk. It's worth
> reading its documentation[1] and trying its examples to get familiar with
> it.

k. Thanks.
beno
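
To make the quoted point concrete, that an HTML parser only needs a string and
does not care where it came from, here is a minimal sketch. It uses the
standard library's html.parser rather than BeautifulSoup so that it runs with
no extra installs; BeautifulSoup is likewise handed the HTML as a string. The
form markup, the field names, and the FormFieldCollector class are all
hypothetical, made up for illustration, and are not the form discussed in this
thread.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from <input> tags in an HTML string."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples; entity references in
        # attribute values have already been unescaped by the parser.
        if tag == "input":
            attr_map = dict(attrs)
            name = attr_map.get("name")
            if name is not None:
                self.fields[name] = attr_map.get("value", "")


# Hypothetical form markup built up in memory; it is never written to
# disk or fetched over a network.
html_text = (
    '<form action="/process" method="post">'
    '<input type="text" name="item" value="Fish &amp; Chips">'
    '<input type="hidden" name="qty" value="2">'
    '</form>'
)

collector = FormFieldCollector()
collector.feed(html_text)
collector.close()
fields = collector.fields
print(fields)
```

The parser turns the escaped "&amp;" in the value back into a plain "&", which
is the kind of detail a hand-written regexp tends to miss.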
-- http://mail.python.org/mailman/listinfo/python-list