G'day,

I've got a problem with a script I'm writing which will run under mod_perl. It
accepts a list of URLs and a list of strings and reports which URLs
contain which strings. That's not complicated, and I got it working fairly
quickly without any CGI involved.

Then I tried to convert it to a web page. It takes two form parameters, PAGES
and STRINGS. I'm finding that no matches occur where they did before. I
found the http://perl.apache.org/dist/mod_perl_traps.html document, which
talks about regexps only being compiled once, and I've tried placing eval q{ }
around various parts of the loops as it suggests, but to no effect.
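
For reference, the kind of eval q{ } wrapping described above looks roughly like the sketch below. It is a cut-down stand-in, not the exact code tried: the sample page text, the search strings and the @found array are made up for illustration. Because the quoted string is compiled each time the eval runs, the regexp inside it is rebuilt on every pass.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a fetched page and the search strings.
my %content = ('http://example.com/' => 'foo (bar) baz');
my @strings = ('(bar)', 'qux');
my @found;

# q{ } is a single-quoted string, so nothing is interpolated here; the code
# (and the regexp in it) is compiled afresh every time the eval is executed.
eval q{
    foreach my $page (keys %content) {
        foreach my $string (@strings) {
            # \Q...\E quotes metacharacters such as ( and ) in the search string
            push @found, "$page: $string" if $content{$page} =~ /\Q$string\E/;
        }
    }
    1;
} or die "eval failed: $@";

print "$_\n" for @found;
```

If the matching still fails with the loop wrapped like this, the regexp compilation is probably not the cause.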

Can anyone see why this wouldn't be working?

I know this script uses a very inefficient way of doing things, but it's
"fast enough" and it's simple.

I've spent a couple of hours looking through the various docs but I'm not 
sure exactly what it is I'm looking for.

Thanks,
Len

**************

#!/usr/bin/perl -w

# search for each of a number of strings in a number of web pages

use CGI;
require LWP::UserAgent;

$q = new CGI;

print $q->header(-expires=>'-1d');
print <<EOH;
<html>
<title>Search results</title>
<body bgcolor=ffffff>
<h1>Search results</h1>
EOH

$textstr = $q->param('STRINGS');
$pages = $q->param('PAGES');

my $i = 0;
foreach $line (split /^/, $textstr) {
    chomp $line;
    $strings[$i] = $line;
    $i++;
}

$ua = new LWP::UserAgent;

foreach $line (split /^/, $pages) {
    chomp $line;
    $request = new HTTP::Request(GET => $line);
    print "Loading $line<br>";
    $response = $ua->request($request);
    if ($response->is_success) {
        $content{$line} = $response->content;
    } else {
        print "<b>Error: $line".$response->status_line."</b><br>";
    }
}

print "<br>Searching<br>";

foreach $page (keys %content) {
    print $page."<br>";
    for ($i=0; $i <= $#strings; $i++) {
        # \Q deals with () in pattern
        if ($content{$page} =~ /\Q$strings[$i]/) {
            print '<blockquote>'.$strings[$i].' found</blockquote>';
        }
    }
}

print <<EOF;
</body>
</html>
EOF
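
One thing that may be worth ruling out, assuming PAGES and STRINGS come from textarea fields (the form itself isn't shown here): browsers submit textarea line breaks as CRLF, and chomp only removes the trailing "\n", so each string would keep a "\r" on the end and never match. The sketch below uses made-up sample values to compare the chomp approach with splitting on /\r?\n/; it is a guess at the cause, not a confirmed diagnosis.

```perl
use strict;
use warnings;

# Hypothetical value as a browser would submit it from a textarea:
# lines are separated by CRLF ("\r\n"), not a bare "\n".
my $textstr = "first string\r\nsecond (string)\r\n";

# Same approach as the script: split on line starts, then chomp.
# chomp strips the "\n" but leaves the "\r" behind.
my @chomped = map { chomp(my $l = $_); $l } split /^/, $textstr;

# Alternative: split on an optional CR followed by LF, dropping empty lines.
my @cleaned = grep { length } split /\r?\n/, $textstr;

# Hypothetical page content containing both search strings.
my $page = 'text with first string and second (string) in it';

# grep in scalar context returns the number of matching elements.
my $hits_chomped = grep { $page =~ /\Q$_\E/ } @chomped;
my $hits_cleaned = grep { $page =~ /\Q$_\E/ } @cleaned;

print "chomp only: $hits_chomped, CR stripped: $hits_cleaned\n";
# prints "chomp only: 0, CR stripped: 2"
```

This would also explain why the non-CGI version worked, since input read from a file or the command line on Unix normally has bare "\n" line endings.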
