Hi Tatiana,

2007/1/22, Tatiana Lloret Iglesias <[EMAIL PROTECTED]>:
I've realized that for each link, I spend most of the time in the following
Perl script

foreach my $url (@lines){ # I read my 1-row URL file
        $contador=2;
        $test=0;
        while(!$test){

            $browser2->get($url);
            $content = $browser2->content();

-- In these 2 steps I spend 6 seconds for an 86 KB HTML page. Is that OK? Can I
perform these 2 steps faster?


Are you using the domain name or the IP address in the link (e.g.
http://www.google.com/ or http://1.2.3.4/)? If you are using the domain
name, Perl first has to resolve it through your DNS server or cache, and
only then can it connect and retrieve the contents you want. If you are
not using a DNS cache, you can build one using Net::DNS for the lookups
and Memoize for the caching.

Check the example:

<code>
#!/usr/bin/env perl

use strict;
use warnings;

use Benchmark::Timer;
use Carp;
use Memoize;
use Net::DNS;

# used by get_ip_from_hostname
my $resolver = Net::DNS::Resolver->new;

# Return the first A record (IPv4 address) found for $hostname.
# Croaks if the query fails; returns nothing if the answer has no A record.
sub get_ip_from_hostname {
   my ($hostname) = @_;
   my $query = $resolver->search($hostname)
       or croak "Query failed: ", $resolver->errorstring;
   foreach my $rr ( $query->answer ) {
       next unless $rr->type eq 'A';
       return $rr->address;
   }
   return;
}

my $t = Benchmark::Timer->new();

for ( 1 .. 1000 ) {
   $t->start('get_ip_from_hostname without memoize');
   my $ip = get_ip_from_hostname("www.google.com");
   $t->stop('get_ip_from_hostname without memoize');
}
print $t->report();

$t->reset();

memoize('get_ip_from_hostname');
for ( 1 .. 1000 ) {
   $t->start('get_ip_from_hostname memoize');
   my $ip = get_ip_from_hostname("www.google.com");
   $t->stop('get_ip_from_hostname memoize');
}

print $t->report();

</code>
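
To use a cached lookup in your fetch loop, you could rewrite each URL so
that it carries the IP address instead of the host name. Below is a minimal
sketch of that idea, not a tested drop-in. The names resolve_ipv4 and
url_with_ip are made up for this example, it uses the core Socket module
rather than Net::DNS to keep it short, and it only handles plain http://
URLs without user:password parts.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Memoize;
use Socket qw(inet_aton inet_ntoa);

# Resolve a host name to a dotted-quad IPv4 address with the system
# resolver. Returns undef if the name cannot be resolved.
sub resolve_ipv4 {
    my ($hostname) = @_;
    my $packed = inet_aton($hostname);
    return defined $packed ? inet_ntoa($packed) : undef;
}

# Cache results so each host name is looked up only once per run.
memoize('resolve_ipv4');

# Hypothetical helper: replace the host part of an http:// URL with its
# IP address. Returns the (possibly rewritten) URL and the original host
# name. An optional second argument supplies a different resolver.
sub url_with_ip {
    my ( $url, $resolver ) = @_;
    $resolver ||= \&resolve_ipv4;

    # Split into scheme, host and everything after the host.
    my ( $scheme, $host, $rest ) = $url =~ m{^(http://)([^/:?#]+)(.*)$}s
        or return ( $url, undef );

    my $ip = $resolver->($host);
    return ( $url, $host ) unless defined $ip;    # leave URL untouched
    return ( "$scheme$ip$rest", $host );
}
```

In your loop this would look roughly like:

    my ($ip_url, $host) = url_with_ip($url);
    $browser2->get($ip_url, Host => $host);

I am assuming $browser2 is an LWP::UserAgent (or WWW::Mechanize), where
get() accepts extra header fields after the URL. The Host header matters
because many servers host several sites on one IP address.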

HTH!

--
Igor Sutton Lopes <[EMAIL PROTECTED]>
