Re: [Perldl] speeding up whichND

chm Wed, 31 Aug 2011 03:58:06 -0700

My guess is the first line is
the key optimization, at least if


  sum($im==0) >> sum($im!=0)

since that reduces workload in the
rest of the code.

--Chris

On 8/31/2011 12:46 AM, Derek Lamb wrote:

Well, after all that I'm not so sure the effort was worth it!  I just happened 
upon the code below in the same large program into which I inserted my original 
run-length-encoding algorithm.  This code goes nearly as fast (0.52 seconds vs 
0.46 seconds, 13% slower), produces identical results, and is much more 
readable (and is code I can't take credit for):

my $im_xy = whichND($im != 0);
my $im_vals = $im->indexND($im_xy)->sever;
for my $id($im_vals->uniq->list) {
$coord = $im_xy->(:,which($im_vals==$id))->qsortvec;
}

Derek

On Aug 22, 2011, at 3:42 PM, Chris Marshall wrote:

Neat, Derek!

Of course I couldn't resist trying to tweak.  I managed to
clean up the 2-D index calculation by using one2nd() with
the thought of going to an N-D version at some point.

Then using where instead of the which test seemed to
simplify the paring down of the valid RLE entries.

Finally, I noticed that the index generating ops inside
the for loop are equivalent to some simple array-wise
operations.

I didn't come up with a good way to generalize from 2-D
to N-D for the dimensionality of $im.  Nor was I able to
proceed further with $coord (although I speculate that
range might be used...).

Anyway, here is the modified code:

use PDL::LiteF;

my $im = pdl q[
               [ 0 0 1 1 0 0]
               [ 2 0 0 1 0 0]
               [ 2 2 0 0 0 3]
               [ 0 0 0 0 3 0]
             ];

my $vals = $im->flat->qsort;
my $vind = $im->flat->qsorti;  #because of the qsorts,
similarly-labeled pixels of interest don't even have to be contiguous

# apparently whichND has this helper routine exposed
#
my ($x,$y) = $im->one2nd($vind);

my ($v_r_a,$v_r_b) = rle($vals);  #run-length-encode the

# the run-length is>0 for all valid entries so a where should work for this
#
($v_r_a,$v_r_b) = where $v_r_a, $v_r_b, $v_r_a != 0;

my $vra_cumsum = cumusumover($v_r_a);

my ($val,$coord);

# the selection operations could be parallelized but we would
# need "ragged arrays" to handle the $coord.  I wonder if range
# could be used?
#
#  $val = $v_r_b->copy;
#  $end = $vra_cumusum - 1;
#  $start = $vra_cumusum->rotate(1);
#  $start((0)) .= 0;
#
for my $i ( 0..$v_r_b->nelem-1 ) {
   $val = $v_r_b->(($i));
   my ($start,$end);
   $start = $i ? $vra_cumsum(($i-1)) : 0;  #trinary op needed for the
first element because of how rle() does counting.
   $end = $vra_cumsum(($i)) - 1;
   $coord = $x($start:$end)->glue(1,$y($start:$end))->transpose->qsortvec;
}

Cheers,
Chris


On Mon, Aug 22, 2011 at 4:31 PM, Derek Lamb<[email protected]>  wrote:

The following code fragment is representative of something I am doing a lot of 
these days:

$im = rfits('im.fits');
for $val ($im->uniq->list){
        next unless $val;
        $coord = whichND($im==$val);
        ##do some stuff with $im and $coord and $val
}

where $im is a 1024 x 1024 image that is mostly zeroes, but has about 2500 
regions of positive AND negative integers, like this but obviously much bigger:
[
[ 0 0 1 1 0 0]
[ 2 0 0 1 0 0]
[ 2 2 0 0 0 3]
[ 0 0 0 0 3 0]
]

Many of those regions are small, only a few pixels, but a few are several 
hundred pixels big.  The test-for-equality inside the whichND is killing me on 
speed: the above code fragment takes nearly 30 seconds to execute, not counting 
the rfits.  I thought maybe I could speed things up by only going through the 
image once, and creating a %hash whose keys were the integers (1,2,3 etc in the 
example above) and the values were a list of coordinates, like this:

$ndc = ndcoords($im)->clump(1,2);
for $i(0..$ndc->dim(1)-1){
   ($cx,$cy) = $ndc(:,($i))->list;
   next unless $im($cx,$cy);
   push @{$hash{$im($cx,$cy)}},$cx,$cy;
}

But that is even slower, around 40 seconds.

Finally, I came up with this method involving a few stages of indirection:

###start
use PDL::Math; #not included in PDL::LiteF, needed for floor()
my $vals = $im->flat->qsort;
my $vind = $im->flat->qsorti;  #because of the qsorts, similarly-labeled pixels 
of interest don't even have to be contiguous

my $x=$vind % $im->dim(0);
my $y=floor($vind/$im->dim(0));

my ($v_r_a,$v_r_b)=rle($vals);  #run-length-encode the
#rle says only the elements up to the first 0 in $v_r_a should be considered, 
so we pare those down
$v_r_b=$v_r_b(0:which($v_r_a==0)->(0)-1);
$v_r_a=$v_r_a(0:which($v_r_a==0)->(0)-1);

my $vra_cumsum = cumusumover($v_r_a);

my ($val,$coord);
for my $i(0..$v_r_b->nelem-1){
    $val = $v_r_b->(($i));
    next unless $val;
    my ($start,$end);
    $end = $vra_cumsum(($i))-1;
    $start = $i ? $vra_cumsum(($i-1)) : 0;  #trinary op needed for the first 
element because of how rle() does counting.
    $coord = $x($start:$end)->glue(1,$y($start:$end))->transpose->qsortvec;
}
###end

The qsortvec at the end there is probably extraneous, but was useful for 
comparing the output of this method with my original.  Notice there is no check 
for equality inside the loop (or anywhere at all except the $v_r_a and _b 
pare-down, and I suppose implicitly in the qsorts).  Everything is accomplished 
using slices.  This code fragment takes less than 0.5 seconds, a factor of 60 
improvement in speed. Obviously this will only work for sparse images like my 
sample above.  I haven't been careful about generalizing it to more than 2 
dimensions, but that shouldn't be too hard.

I'm posting this in the PDL Cookbook on the wiki.  If you think it's useful let 
me know and I'll add it to the distribution, calling it which_all or something 
like that.  The trick will be how to return ALL of the instances of $coord to 
the user--maybe the hash implementation I tried earlier will work, but I'm open 
to other suggestions.

Derek


_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl

Re: [Perldl] speeding up whichND

Reply via email to