THX! I always knew it was my n00b scripting causing the problem. It's
now down to 4 lines inside the while loop, and I'm flying! Processing
time is down from ~15 minutes to 4 secs! But for a longer date range (a
2-week window) it clocked in at 25 wallclock secs (processing ~350k
MySQL result rows). Is there any way to make this beast faster still?
Any good reading or references on this? I'm just afraid that if this
rolls out to production, a report over months of activity will take
minutes to generate!

<snipped>
        $query = "SELECT src, sent, rcvd FROM no_name WHERE start_date
>= '$fromdate' AND start_date <= '$todate';";
        $sth = $dbh->prepare($query);
        $sth -> execute() || &errorhere("SQL Error: $DBI::errstr");
        #print "$query<br>";

        my %src = ();
        my ($totalsent, $totalrcvd);

        while(@row = $sth->fetchrow_array){
                my $key = $row[0];
                $src{$key} += $row[2];
                $totalsent = $totalsent + $row[1];
                $totalrcvd = $totalrcvd + $row[2];
        }
</snipped>
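
Would it also help to push the summing into MySQL itself, so only one
row per source crosses the wire? Untested sketch, same table and
columns as above (the ? placeholders just avoid quoting the dates by
hand):

        # let MySQL aggregate: one row per src instead of ~350k raw rows
        my $query = "SELECT src, SUM(sent), SUM(rcvd) FROM no_name
                     WHERE start_date >= ? AND start_date <= ?
                     GROUP BY src";
        my $sth = $dbh->prepare($query);
        $sth->execute($fromdate, $todate)
                || errorhere("SQL Error: $DBI::errstr");

        my (%src, $totalsent, $totalrcvd);
        while (my ($source, $sent, $rcvd) = $sth->fetchrow_array) {
                $src{$source} = $rcvd;      # already summed per source
                $totalsent += $sent;
                $totalrcvd += $rcvd;
        }

An index on start_date should also keep the WHERE clause quick as the
table grows.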
Cheers,
 
Toddy Prawiraharjo
 


-----Original Message-----
From: Rob Dixon [mailto:[EMAIL PROTECTED] 
Sent: Friday, 29 September 2006 10:04 AM
To: beginners@perl.org
Subject: Re: Querying very big database


Toddy Prawiraharjo wrote:
> Hello all,
> 
> I am not sure if I'm on the right mailing list, but this should be
> simple for you guys. I recently wrote a simple Perl program to catch
> all the syslog output from my Netscreen firewall and put it into a
> database, so I can do some analysis on it later (sort of like
> AWStats). The database itself is pretty big, with 600k entries (for 3
> weeks of the running firewall) worth 80 MB in MySQL. My problem is
> that during the analysis my Perl script is very slow. I know the
> query to MySQL itself is pretty quick: less than 2 seconds to return
> about 40k to 60k rows to Perl to be analysed.
> 
> Here it goes:
> <snipped>
>         $query = "SELECT src, rcvd FROM no_name WHERE start_date >=
>                   '$fromdate' AND start_date <= '$todate';";
>         $sth = $dbh->prepare($query);
>         $sth->execute() || &errorhere("SQL Error: $DBI::errstr");
>         print "$query<br>";
> 
>         my %src = ();
>         my ($totalsent, $totalrcvd, $srcchecked);
> 
>         while(@row = $sth->fetchrow_array){
>                 $srcchecked = 0;
>                 while(($source, $total) = each(%src)){
>                         if ($source eq "$row[0]"){
>                                 $srcchecked = 1;
>                                 $src{$source} = $src{$source} + $row[1];
>                         }
>                 }
>                 if ($srcchecked != 1){
>                         print "$row[0]<br>";
>                         $src{$row[0]} = $row[1];
>                 }
>                 $totalrcvd = $totalsent + $row[1];
>                 #$totalrcvd = $totalrcvd + $row[2];
>         }
> </snipped>
> 
> The while loop doing the analysis takes more than 15 minutes, and
> that is only for a query over 1 day of records ($fromdate-$todate).
> So, if I want Perl to give me a faster result, what's the solution?
> Did I take a fundamentally wrong approach?

You've missed the whole point of hashes, which is that they access a
data value directly by its key. What you've written is similar to

   my $i = 0;
   my $value;
   foreach my $v (@array) {
     if ($i++ == $n) {
       $value = $v;
     }
   }

instead of

   $value = $array[$n];
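
In terms of your own code, the inner loop over each %src and a direct
hash lookup do the same job, but the lookup takes constant time no
matter how many sources there are:

   # slow: scan every key in %src looking for a match
   while (my ($source, $total) = each %src) {
       $src{$source} += $row[1] if $source eq $row[0];
   }

   # fast: go straight to the entry for this key
   $src{$row[0]} += $row[1];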

Try this while loop instead of the one you have and see if you get an
improvement:

   while (my @row = $sth->fetchrow_array) {

     my $key = $row[0];

     print "$key<br>" unless exists $src{$key};
     $src{$key} += $row[1];

     $totalsent += $row[1];
     #$totalrcvd += $row[2];
   }
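
If you need more speed from the fetch loop itself, the DBI
documentation says that binding columns and calling fetch is the most
efficient way to retrieve data, as no new list is built for each row.
Untested, but something like this:

   my ($key, $rcvd);
   $sth->bind_columns(\$key, \$rcvd);

   while ($sth->fetch) {
     print "$key<br>" unless exists $src{$key};
     $src{$key} += $rcvd;
     $totalsent += $rcvd;
   }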

HTH,

Rob

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>

