Re: getfirst and re

Victor Subervi Wed, 06 Jan 2010 10:12:53 -0800

On Wed, Jan 6, 2010 at 1:59 PM, Tim Chase <[email protected]> wrote:


> Victor Subervi wrote:
>
>> On Wed, Jan 6, 2010 at 1:27 PM, Tim Chase <[email protected]> wrote:
>>
>>> But if you're using it on HTML form text, regexps are usually the wrong
>>> tool, and you should be using an HTML parser (such as BeautifulSoup) that
>>> knows how to handle odd text and escapings better and more robustly than
>>> regexps will.
>>>
>>
>> I have an automatically generated HTML form from which I need to extract
>> data to the script which this form calls (to which the information is
>> sent).
>> I believe BeautifulSoup is geared to scraping pages that exist permanently
>> on the web. By the time BeautifulSoup was called, this page would be gone.
>>
>
> BeautifulSoup takes string data fed to it, and builds a structure that can
> be neatly navigated.  That string data can come from a web page, from a
> disk, or even a serial port, a random-character-generator, or just from HTML
> that's built up in memory and never sees a network or a disk.  It's worth
> reading its documentation[1] and trying its examples to get familiar with
> it.
>
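
A minimal sketch of the point above: an HTML parser works on any string, including form markup that exists only in memory. To stay dependency-free, this uses the standard library's html.parser as a stand-in for BeautifulSoup. The form markup, the InputCollector class, and the extract_fields helper are all hypothetical names made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical form markup built in memory; it never touches a disk or network.
FORM_HTML = """
<form action="/cgi-bin/process.py" method="post">
  <input type="text" name="title" value="Tom &amp; Jerry">
  <input type="hidden" name="id" value="42">
</form>
"""


class InputCollector(HTMLParser):
    """Collect name/value pairs from <input> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples with entities
        # such as &amp; already unescaped by the parser.
        if tag == "input":
            attr_map = dict(attrs)
            if "name" in attr_map:
                self.fields[attr_map["name"]] = attr_map.get("value", "")


def extract_fields(html):
    """Parse an HTML string and return its input fields as a dict."""
    parser = InputCollector()
    parser.feed(html)
    parser.close()
    return parser.fields


if __name__ == "__main__":
    print(extract_fields(FORM_HTML))
```

The parser handles the escaped ampersand itself, which is the kind of detail a hand-written regexp tends to miss. BeautifulSoup accepts a string in the same way and offers a more convenient navigation API on top.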

k. Thanks.
beno

-- 
http://mail.python.org/mailman/listinfo/python-list
