> I pulled the search.html file as follows:
> I went to http://srch.overture.com and searched for the word "help",
> then saved the result as a file named search.html.
> Then I wrote the script below to extract the URLs from this saved
> web page (which is not working very well).
>
> Part II
> I now want to pipe the saved file to the script from STDIN.
> The URLs found should be printed to the command line, each on a separate line.
> The script should be general enough that it can be tested with a
> different query, e.g. "help them".
This will produce a list of URLs:

#!/usr/local/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;   # the original loaded HTML::SimpleLinkExtor but called HTML::LinkExtor->new
use LWP::Simple;

# Single quotes keep Perl from trying to interpolate $sessionid and
# $LME... as (undefined) variables inside the URL.
my $URL = 'http://srch.overture.com/d/search/;$sessionid$LME54CYADCNCWCQCBGWAPUQ?type=home&tm=1&Keywords=help';

my %seen = ();
my @LinksToProcess;

my $parser = HTML::LinkExtor->new(undef, $URL);
$parser->parse(get($URL))->eof;   # offending line if not connected

my @links = $parser->links;
foreach my $linkarray (@links) {
    my @element = @$linkarray;
    shift @element;               # drop the tag name, keep attr/value pairs
    while (@element) {
        my ($attr_name, $attr_value) = splice(@element, 0, 2);
        push @LinksToProcess, $attr_value unless $seen{$attr_value}++;
    }
}

for (@LinksToProcess) {
    print "$_\n";
}
__END__

Gary
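For Part II of the question (reading the saved page from STDIN rather than fetching it), here is one possible sketch. It reuses HTML::LinkExtor, but with a callback so links are collected as they are parsed; the script name extract_urls.pl and the helper sub extract_links are my own inventions, not anything from the original post.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;

# Collect every link URL found in an HTML string, in document order,
# skipping duplicates.
sub extract_links {
    my ($html) = @_;
    my (%seen, @urls);
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attrs) = @_;            # e.g. ('a', href => 'http://...')
        for my $url (values %attrs) {
            push @urls, $url unless $seen{$url}++;
        }
    });
    $parser->parse($html);
    $parser->eof;
    return @urls;
}

unless (caller) {
    # Slurp the whole saved page from STDIN, then print one URL per line.
    my $html = do { local $/; <STDIN> };
    print "$_\n" for extract_links($html);
}
```

Because the page is piped in, nothing in the script depends on the query, so the same command works for "help", "help them", or any other saved result:

    perl extract_urls.pl < search.html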