Gurunath - The proper way to handle this is to apply the "stats" method to the result of the "diagonal" slice. It is fast, succinct, and statistically correct:
my ($mean, $stdev) = $my_huge_matrix->diagonal(0, 1)->stats; In slightly more detail from the PDL shell (ok, from prima-repl, but who's keeping score?): > $matrix = pdl q[ 2 2 2 ; 3 1 0 ; 2 5 1 ] > $diagonal = $matrix->diagonal(0, 1) > print "Matrix is $matrix and diagonal is $diagonal" Matrix is [ [2 2 2] [3 1 0] [2 5 1] ] and diagonal is [2 1 1] > ($mean, $rms) = $diagonal->stats > print "Diagonal's mean is $mean and diagonal's standard deviation is $rms" Diagonal's mean is 1.33333333333333 and diagonal's standard deviation is 0.577350269189626 David P. S. There is something important to note about the $rms value returned as the second argument by stats. A calculation of the standard deviation would begin by summing the squares of the differences between the mean and the values: my $sq_diff_sum = sum( ($piddle - $piddle->average)**2 ); A navie calculation would then divide by the *number of elements*, and take the square root: my $st_dev = sqrt( $sq_diff_sum / $sq_diff_sum->nelem); That is the so-called population standard deviation, which assumes that you got the average value from the *population*, not the *sample*. For large data sets, the discrepancy is not important. For small data sets, this is only accurate if you have the entire population of interest. If you only have a subset of the population, then the statistically correct value comes by dividing instead by the *number of elements minus 1*, i.e. my $st_dev = sqrt( $sq_diff_sum / ($sq_diff_sum->nelem - 1) ); In this case---finding the mean and standard deviation of the elements along the diagonal---either method is acceptable, and you should report which value you calculated when you show your results. For sample-based statistical calculations, where you only take a sample of your whole population, the latter calculation is correct. I say all of this just to point out that the second return value of the stats and statsover methods is the sample RMS deviation from the mean, i.e. it divides by N-1. (The docs call this the population RMS deviation for reasons not clear to me.) If you want to use the population RMS, you should get the 7th argument, like so: # Perl array slice notation my ($mean, $rms) = ($diagonal->stats)[0, 6]; # Using undef to discard unwanted returned values: my ($mean, undef, undef, undef, undef, undef, $rms) = $diagonal->stats; -- "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." -- Brian Kernighan
_______________________________________________ Perldl mailing list [email protected] http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
