My guess is the first line is the key optimization, at least if
sum($im==0) >> sum($im!=0) since that reduces workload in the rest of the code. --Chris On 8/31/2011 12:46 AM, Derek Lamb wrote:
Well, after all that I'm not so sure the effort was worth it! I just happened upon the code below in the same large program into which I inserted my original run-length-encoding algorithm. This code goes nearly as fast (0.52 seconds vs 0.46 seconds, 13% slower), produces identical results, and is much more readable (and is code I can't take credit for): my $im_xy = whichND($im != 0); my $im_vals = $im->indexND($im_xy)->sever; for my $id($im_vals->uniq->list) { $coord = $im_xy->(:,which($im_vals==$id))->qsortvec; } Derek On Aug 22, 2011, at 3:42 PM, Chris Marshall wrote:Neat, Derek! Of course I couldn't resist trying to tweak. I managed to clean up the 2-D index calculation by using one2nd() with the thought of going to an N-D version at some point. Then using where instead of the which test seemed to simplify the paring down of the valid RLE entries. Finally, I noticed that the index generating ops inside the for loop are equivalent to some simple array-wise operations. I didn't come up with a good way to generalize from 2-D to N-D for the dimensionality of $im. Nor was I able to proceed further with $coord (although I speculate that range might be used...). Anyway, here is the modified code: use PDL::LiteF; my $im = pdl q[ [ 0 0 1 1 0 0] [ 2 0 0 1 0 0] [ 2 2 0 0 0 3] [ 0 0 0 0 3 0] ]; my $vals = $im->flat->qsort; my $vind = $im->flat->qsorti; #because of the qsorts, similarly-labeled pixels of interest don't even have to be contiguous # apparently whichND has this helper routine exposed # my ($x,$y) = $im->one2nd($vind); my ($v_r_a,$v_r_b) = rle($vals); #run-length-encode the # the run-length is>0 for all valid entries so a where should work for this # ($v_r_a,$v_r_b) = where $v_r_a, $v_r_b, $v_r_a != 0; my $vra_cumsum = cumusumover($v_r_a); my ($val,$coord); # the selection operations could be parallelized but we would # need "ragged arrays" to handle the $coord. I wonder if range # could be used? # # $val = $v_r_b->copy; # $end = $vra_cumusum - 1; # $start = $vra_cumusum->rotate(1); # $start((0)) .= 0; # for my $i ( 0..$v_r_b->nelem-1 ) { $val = $v_r_b->(($i)); my ($start,$end); $start = $i ? $vra_cumsum(($i-1)) : 0; #trinary op needed for the first element because of how rle() does counting. $end = $vra_cumsum(($i)) - 1; $coord = $x($start:$end)->glue(1,$y($start:$end))->transpose->qsortvec; } Cheers, Chris On Mon, Aug 22, 2011 at 4:31 PM, Derek Lamb<[email protected]> wrote:The following code fragment is representative of something I am doing a lot of these days: $im = rfits('im.fits'); for $val ($im->uniq->list){ next unless $val; $coord = whichND($im==$val); ##do some stuff with $im and $coord and $val } where $im is a 1024 x 1024 image that is mostly zeroes, but has about 2500 regions of positive AND negative integers, like this but obviously much bigger: [ [ 0 0 1 1 0 0] [ 2 0 0 1 0 0] [ 2 2 0 0 0 3] [ 0 0 0 0 3 0] ] Many of those regions are small, only a few pixels, but a few are several hundred pixels big. The test-for-equality inside the whichND is killing me on speed: the above code fragment takes nearly 30 seconds to execute, not counting the rfits. I thought maybe I could speed things up by only going through the image once, and creating a %hash whose keys were the integers (1,2,3 etc in the example above) and the values were a list of coordinates, like this: $ndc = ndcoords($im)->clump(1,2); for $i(0..$ndc->dim(1)-1){ ($cx,$cy) = $ndc(:,($i))->list; next unless $im($cx,$cy); push @{$hash{$im($cx,$cy)}},$cx,$cy; } But that is even slower, around 40 seconds. Finally, I came up with this method involving a few stages of indirection: ###start use PDL::Math; #not included in PDL::LiteF, needed for floor() my $vals = $im->flat->qsort; my $vind = $im->flat->qsorti; #because of the qsorts, similarly-labeled pixels of interest don't even have to be contiguous my $x=$vind % $im->dim(0); my $y=floor($vind/$im->dim(0)); my ($v_r_a,$v_r_b)=rle($vals); #run-length-encode the #rle says only the elements up to the first 0 in $v_r_a should be considered, so we pare those down $v_r_b=$v_r_b(0:which($v_r_a==0)->(0)-1); $v_r_a=$v_r_a(0:which($v_r_a==0)->(0)-1); my $vra_cumsum = cumusumover($v_r_a); my ($val,$coord); for my $i(0..$v_r_b->nelem-1){ $val = $v_r_b->(($i)); next unless $val; my ($start,$end); $end = $vra_cumsum(($i))-1; $start = $i ? $vra_cumsum(($i-1)) : 0; #trinary op needed for the first element because of how rle() does counting. $coord = $x($start:$end)->glue(1,$y($start:$end))->transpose->qsortvec; } ###end The qsortvec at the end there is probably extraneous, but was useful for comparing the output of this method with my original. Notice there is no check for equality inside the loop (or anywhere at all except the $v_r_a and _b pare-down, and I suppose implicitly in the qsorts). Everything is accomplished using slices. This code fragment takes less than 0.5 seconds, a factor of 60 improvement in speed. Obviously this will only work for sparse images like my sample above. I haven't been careful about generalizing it to more than 2 dimensions, but that shouldn't be too hard. I'm posting this in the PDL Cookbook on the wiki. If you think it's useful let me know and I'll add it to the distribution, calling it which_all or something like that. The trick will be how to return ALL of the instances of $coord to the user--maybe the hash implementation I tried earlier will work, but I'm open to other suggestions. Derek
_______________________________________________ Perldl mailing list [email protected] http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
