I'm not sure why the sort was so slow. On another note, t_test threads, meaning you can have the t-tests performed in a PDL loop instead of a Perl loop, which will be much faster. You can then sort the results with PDL's qsort, or use qsorti to get the sorted indices.
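For the sorting step, a minimal sketch might look like the following. The values in $t are made up to stand in for what t_test returns, and the variable names ($sorted, $desc, $ix) are only for illustration; qsort, qsorti, slice and list are regular PDL methods.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical t statistics, standing in for the $t returned by t_test
my $t = pdl(8.6965655, -13.324451, 0.36983525);

# Sort the absolute values inside PDL; qsort returns ascending order
my $sorted = $t->abs->qsort;

# Reverse with a slice to get descending order, like sort {$b <=> $a}
my $desc = $sorted->slice('-1:0');

# qsorti returns the indices that sort the piddle (ascending),
# which tells you which test each statistic came from
my $ix = $t->abs->qsorti;

# Convert to a plain Perl list only at the end, if the rest of the code needs one
my @s_t_stats = $desc->list;

print "descending: $desc\n";
print "indices:    $ix\n";
```

This keeps the comparison work inside PDL, so the Perl-level sort block is never called on individual elements.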
Here's an example with 3 t-tests to perform, each between two lists of 10 observations:

pdl> p $a = sequence 10, 3

[
 [ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
]

pdl> p $b = ushort random( 10, 3 ) * 10

[
 [2 1 2 9 4 7 7 1 6 1]
 [3 8 6 4 3 6 2 2 5 6]
 [9 3 8 3 0 9 5 2 9 7]
]

pdl> ($t, $df) = t_test($a, $b)

pdl> p $t
[0.36983525 8.6965655 13.324451]

pdl> p $df
[18 18 18]

Best,
Maggie

On Thu, Feb 2, 2012 at 11:01 AM, Adam Russell <[email protected]> wrote:

> This thread reminded me of a pdl sort issue I was having.
> I hope I'm not too off topic...
> So, I am using pdl from within a larger body of Perl code.
> I am using PDL::Stats to perform a t_test on a bunch of data I store in
> pdls.
> Once I get done with this I sort the t-statistics and then throw away the
> pdls.
> I noticed my code was running somewhat slower than I would have thought it
> should.
> So, I ran the code under the NYT profiler. My code was spending the
> majority of its time (~80% of total execution time!)
> on the last line below (yeah, I uncreatively named the pdls "pdls"):
>
> foreach my $dim_n (0..$self->{dimension}-1){
>     my ($t, $df) = t_test($pdls[$dim_n][$cat_n][0],
>                           $pdls[$dim_n][$cat_n][1]);
>     $t_stats[$dim_n]=$t->abs->sclr;
> }
> @s_t_stats=sort {$b <=> $a} @t_stats;
>
> Here is what profiler output for that line looks like:
> @s_t_stats=sort {$b <=> $a} @t_stats;
> # spent 5.18s making 4 calls to CORE:sort, avg 1.29s/call
> # spent 3.43s making 91451 calls to PDL::string, avg 38µs/call
> # spent 939ms making 91451 calls to PDL::spaceship, avg 10µs/call
> # spent 44µs making 15 calls to PDL::DESTROY, avg 3µs/call
>
> 1.29 seconds for each call to sort is a very long time! From the calls made
> when that line is executed it seems that for some reason
> it is doing some sort of string conversion? But why? Surely $t->abs->sclr
> is returning a numeric, right?
> The code currently takes about 10 seconds to run.
> If I take care of this sort problem I could probably get runs in
> under 3 seconds.
>
> Any advice on why this sort is so slow?
> Best Regards,
> Adam
>
> > Date: Thu, 2 Feb 2012 07:07:24 -0600
> > From: [email protected]
> > To: [email protected]
> > CC: [email protected]
> > Subject: Re: [Perldl] how to sort a piddle ???
> >
> > I agree with Matt that you are probably looking for `qsort`.
> >
> > As to what
> >
> > @e = pdl(3,2,6,4,8,6);
> > @r = sort{$a <=> $b} @e;
> >
> > is doing, it's working perfectly; it's just not doing what you mean. @e
> > is a one element Perl-level array, its one element is a PDL object.
> > Any sort on a one element array will return the same order, what else
> > could it do.
> >
> > You have to remember that a PDL object is just another scalar in
> > Perl's eyes, as are all objects.
> >
> > Here is another example
> >
> > @e = (pdl(3,2,6,4,8,6), pdl(5,6,2,1));
> > @r = sort{$a <=> $b} @e;
> >
> > Here @e has two PDL objects. When you sort objects numerically ( using
> > <=> ), what you will actually sort on is not their contents, but their
> > address in memory.
> >
> > The take-away message is this: PDL overloads many of the Perl
> > operators, and it can feel like PDL and Perl are fully integrated, but
> > in truth a PDL object is still an object, that is a scalar reference
> > with methods and overloads. PDL tries to Do What You Mean when it can,
> > this is not one of those times.
> >
> _______________________________________________
> Perldl mailing list
> [email protected]
> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
