Re: How to avoid using the slow array subscripting operator?

Xi Liu Sun, 29 May 2011 19:44:50 -0700

I guess I should provide some real code.
Below is a tiny module I grabbed from my project. It is not particularly
private, and it is sufficient for discussion.


#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Carp;
use DB_File;

{
    my %sum_cache;
    my %deviation_cache;
    tie %sum_cache => 'DB_File', "sum_cache_tmp", O_RDWR|O_CREAT, 0666
        or croak "can not create cache file!";
    tie %deviation_cache => 'DB_File', "deviation_cache_tmp", O_RDWR|O_CREAT, 0666
        or croak "can not create cache file!";


    sub get_sum
    {
        my ($i, $n, $opens_ref) = @_;
        my $key = join ',', $i, $n;
        unless (exists $sum_cache{$key})
        {
            my $inner_key = join ',', $i - 1, $n;
            my $sum = 0;
            if (exists $sum_cache{$inner_key})
            {
                # Slide the window: drop the oldest value, add the newest.
                $sum = $sum_cache{$inner_key} - ${$opens_ref}[$i - $n] + ${$opens_ref}[$i];
            }
            else
            {
                # No neighbouring sum is cached: add up the window directly.
                $sum += $_ foreach @{$opens_ref}[$i - $n + 1 .. $i];
            }
            $sum_cache{$key} = $sum;
        }
        return $sum_cache{$key};
    }

    sub get_deviation
    {
        my ($i, $n, $mean, $opens_ref) = @_;
        my $key = join ',', $i, $n;
        unless (exists $deviation_cache{$key})
        {
            # Sample standard deviation over the $n values ending at index $i.
            my $sum = 0;
            $sum += ($_ - $mean) ** 2 foreach @{$opens_ref}[$i - $n + 1 .. $i];
            $deviation_cache{$key} = sqrt($sum / ($n - 1));
        }
        return $deviation_cache{$key};
    }

    sub cal_using_simple_scheme
    {
        my ($n1, $n2, $n3, $klines_ref) = @_;
        my @opens = map { $_->{open} } @$klines_ref;
        croak "Wrong argument. Should all be positive values"
            if ($n1 <= 0 || $n2 <= 0 || $n3 <= 0);
        return [(0) x @opens] if $n2 <= 1;
        my $n = $n1 > $n2 ? $n1 : $n2;
        my @res;
        my @upper_bound;
        my @lower_bound;
        for my $i (0 .. $#opens)
        {
            # Not enough history for the longer window yet.
            if ($i < $n - 1)
            {
                push @upper_bound, undef;
                push @lower_bound, undef;
                next;
            }

            my $sum1 = get_sum($i, $n1, \@opens);
            my $mid = $sum1 / $n1;

            my $sum2 = get_sum($i, $n2, \@opens);
            my $mean = $sum2 / $n2;
            my $std_deviation = get_deviation($i, $n2, $mean, \@opens);

            push @upper_bound, $mid + $n3 * $std_deviation;
            push @lower_bound, $mid - $n3 * $std_deviation;
        }
        for my $i (0 .. $#opens)
        {
            if (!defined($upper_bound[$i]))
            {
                push @res, 0;
                next;
            }
            given ($opens[$i])
            {
                when ($_ > $upper_bound[$i]) { push @res, 1; break }
                when ($_ < $lower_bound[$i]) { push @res, -1; break }
                default { push @res, $res[-1]; break }
            }
        }
        return \@res;
    }

}

1;
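
To show the recurrence that get_sum relies on in isolation, here is a
minimal sketch. rolling_sums is a hypothetical name, not part of the module
above. It assumes one fixed window length per pass, so only the previous sum
has to be remembered and no hash is needed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module above): returns a reference to
# an array whose element $i is the sum of the $n values ending at index $i,
# or undef where fewer than $n values are available.
sub rolling_sums {
    my ($n, $values_ref) = @_;
    my @sums;
    my $sum = 0;
    for my $i (0 .. $#{$values_ref}) {
        $sum += $values_ref->[$i];                    # value entering the window
        $sum -= $values_ref->[$i - $n] if $i >= $n;   # value leaving the window
        push @sums, $i >= $n - 1 ? $sum : undef;
    }
    return \@sums;
}

my $sums = rolling_sums(3, [1, 2, 3, 4, 5]);
print join(',', map { defined $_ ? $_ : 'undef' } @$sums), "\n";
# prints: undef,undef,6,9,12
```

Each pass touches every element once, which is the same step get_sum takes
when the neighbouring key is already cached.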

The module is called like this:
my $klines_ref;
for my $n1 (1..200)
{
    for my $n2 (1..200)
    {
        for my $n3 (1,3,5,7)
        {
            my $res = cal_using_simple_scheme($n1, $n2, $n3, $klines_ref);
            # using the result to do something, omitted here.
        }
    }
}

$klines_ref is a reference to an array which contains at least 60000
elements.
The module is simple; the main time-consuming part is calculating the sum
and the standard deviation. I cached the results for both, but the input
array is so big that the program crashes with an out-of-memory error during
the first call of cal_using_simple_scheme. So I tied the two hashes to local
files and, as I predicted, the speed is unacceptable. Maybe this is the kind
of problem Perl is good at?
I still use array indexing because I really don't know how to avoid it.
Would somebody point out the wrong and un-Perlish parts of the program above?
Thank you all.
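
For a sense of scale, the sketch below estimates an upper bound on the
number of entries each cache can hold. It assumes the figures mentioned in
this post (60000 elements, window lengths 1..200). The variable names are
only illustrative.

```perl
use strict;
use warnings;

# Assumed sizes taken from the post: 60000 elements, window lengths 1..200.
my $elements   = 60_000;
my $max_window = 200;

# Each cache key is "$i,$n", so there is at most one entry per (i, n) pair.
my $max_keys = $elements * $max_window;

# Length of the longest possible key, "59999,200".
my $longest_key = length join ',', $elements - 1, $max_window;

printf "at most %d keys per cache, each up to %d characters\n",
    $max_keys, $longest_key;
# prints: at most 12000000 keys per cache, each up to 9 characters
```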
