On Fri, Jul 20, 2001 at 02:45:52PM +0200, Jos I. Boumans wrote:
> ### $res->content is the return value of a LWP request
> my @images = $res->content =~ |<img.+?src=\"?(.+?)\"|sig;

Rather than go the broken regex route (in the above regex the opening quote
is optional but the closing quote isn't; it doesn't handle single quotes or
unquoted values; it matches '<img foosrc="'), I would suggest using
HTML::SimpleLinkExtor.

    use HTML::SimpleLinkExtor;

    my $extor = HTML::SimpleLinkExtor->new();
    $extor->parse($res->content);

    foreach my $img ($extor->img) {
        # retrieve $img, store it locally
    }


I'm not trying to be insulting, Jos, but simple regexes shouldn't be used to
parse HTML.
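
To make the failure cases listed above concrete, here is a minimal sketch
that runs the quoted regex against a few hand-written tags. It assumes the
leading "m" was intended (m|...|sig); the sub name regex_images and the
sample strings are made up for illustration.

```perl
use strict;
use warnings;

# The regex from the quoted message, assuming m|...|sig was meant.
# In list context with /g it returns every captured src value.
sub regex_images {
    my ($html) = @_;
    return $html =~ m|<img.+?src=\"?(.+?)\"|sig;
}

# Hypothetical sample tags, one per failure case.
my @samples = (
    '<img src="a.gif">',                  # double quotes: works
    "<img src='b.gif'>",                  # single quotes: no match
    '<img src=c.gif> <a href="x.html">',  # no quotes: runs on to the next "
    '<img foosrc="d.gif">',               # bogus attribute: still matches
);

for my $html (@samples) {
    my @found = regex_images($html);
    my $shown = @found ? join(', ', map { "[$_]" } @found) : '(nothing)';
    printf "%-36s => %s\n", $html, $shown;
}
```

Only the first sample gives the right answer. The unquoted case is the nasty
one: the capture keeps going until it hits the next double quote anywhere in
the document.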


Michael
--
Administrator                      www.shoebox.net
Programmer, System Administrator   www.gallanttech.com
--
