This has been asked at least as often as it has been answered, and so far the
most flexible solution I have found, though not the simplest, is:

WWW::Mechanize<http://search.cpan.org/~petdance/WWW-Mechanize-1.34/lib/WWW/Mechanize.pm>
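As a minimal sketch of what you describe below (the URL and the patterns are just placeholders, and this assumes WWW::Mechanize is installed from CPAN):

```perl
use strict;
use warnings;
use WWW::Mechanize;  # assumed installed from CPAN

my $mech = WWW::Mechanize->new();
$mech->get('http://www.example.com/index.html');  # placeholder URL
die "Fetch failed\n" unless $mech->success;

my $code;
for my $line ( split /\n/, $mech->content ) {
    next if $line =~ /bleh/;          # skip lines you don't care about
    if ( $line =~ /^From: (.*)/ ) {   # pattern from the question below
        $code = $1;
    }
}
print "code: $code\n" if defined $code;
```

Mechanize also handles cookies, redirects, links, and forms for you, which is where it really pays off compared to fetching the raw source yourself.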

There are a lot of others out there, and a lot of them have been built on top
of this little gem. If you are looking to scrape a major website like
Hotmail, Yahoo, IMDb, Gmail, etc., you will most likely find modules dedicated
to that website that do all the dirty work for you.

Be careful though: the more specific the implementation, the more likely it is
that a small layout change on the website can break your scraping module.
But I assume that you figured that one out already.
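If all you need is to fetch the raw source and run regexes over it, plain LWP::Simple from the libwww-perl distribution (covered in the perl.com article you found) is enough. A sketch, again with the URL as a placeholder:

```perl
use strict;
use warnings;
use LWP::Simple qw(get);  # part of the libwww-perl distribution

# get() returns the page body as one string, or undef on failure.
my $html = get('http://www.example.com/index.html')
    or die "no way: could not fetch page\n";

# Pull the capture out of every line matching the pattern.
my @matches = map { /^From: (.*)/ ? $1 : () } split /\n/, $html;
print "$_\n" for @matches;
```

That replaces the pipe-open in your example: `open` cannot fetch URLs, so you need an HTTP client module to get the page into a variable first.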

Regards,

Rob Coops

On Wed, May 14, 2008 at 6:05 PM, Richard Lee <[EMAIL PROTECTED]> wrote:

> Hi guys again!
>
> I am sure this question has been around for a while, but I am not sure where
> to begin.
>
> I am trying to grep an HTML page given a URL and then extract some
> information from the source code.
> So, something like:
>
> open FH, "www.example.com/index.html |" or die "no way : $!\n";
> my @array = <FH>;
>
> my $code;
> for (@array) {
>     next if /bleh/;
>     if ( /^From: (.*)/ ) {
>         $code = $1;
>     }
> }
>
> You get the idea... so anyway, I did a search
> on Google:
>
> 'how to grep a web page source using perl'   -- no luck
> web perl modules   ---> reading
> http://www.perl.com/pub/a/2002/08/20/perlandlwp.html
>
> I guess the reason I wrote this out is to see if anyone else beginning Perl
> web work can use my search, or perhaps someone can tell me I am doing
> something stupid,
> as Perl and the web seem to be a pretty common combination, but this is the
> only way I know how to do it at this point.
>
> anyway, just sharing on this... and also looking for feedback
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> http://learn.perl.org/
>
>
>
