I modified Peter J. Holzer's code to test both the case where
$i is set to 1 (Holzer's case) and the case where $i is 0 (which
forces more assignments). I also added a straight assignment
statement to each set. The program and results are below.
This test was run three times on a Sun Ultra 5, with Solaris 8
and Perl 5.6.1. The first run was when I left for lunch -- I do
not know what else might have been running during that test.
The other tests were run while I was at my desk; top showed
that the test program was getting 98+% of the processor during
those times. The second and third tests each took about
22 minutes according to time.
Questions:
-Why does Benchmark show negative times?
-Why is there a several-fold timing difference between runs of
the same code?
-How can any test in the second set ($i set to 0) be faster than
a straight assignment without a preceding test?
I have not looked at the code for Benchmark, but the only way
I can reconcile the output is:
1) The timer used by Benchmark uses wall clock time, not CPU
time, and one or more other processes were competing with
the first test, inflating the times.
2) Benchmark computes times according to the following
simplified pseudo-code:
    $total_time = 0;
    while ($loop_count-- > 0) {
        $start_time = get_the_time_somehow();
        do_a_test();
        $end_time = get_the_time_somehow();
        $total_time += $end_time - $start_time;
    }
    print "total_time=$total_time\n";
3) The "get_the_time_somehow()" routine used by Benchmark
is not getting the time in an atomic way, i.e., on rare occasions
part of the time value is updated while the time is being
retrieved. If the high-order portion is retrieved first, then the
low-order portion may roll over to zero before it is read, thus
sometimes understating the true elapsed time. If the low-order
portion is retrieved first, then the high-order part could be
incremented before retrieval, thus overstating the time. (My
system is running in 32-bit mode; maybe 64-bit mode would
not have this problem, but I have not tested it.)
4) Benchmark is thus unreliable for measuring trivial pieces of
code, such as that used in these tests. Increasing the loop
counter for the tests does not necessarily increase accuracy,
because the chance of getting an inaccurate time value is
increased as well.
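
To make hypothesis 3 concrete, here is a small simulation of a torn
read. It is only a sketch: the two-part clock, the factor of 1000, and
the combine() helper are invented for illustration and say nothing
about how Benchmark or Solaris actually read the time.

```perl
use strict;
use warnings;

# Hypothetical two-part clock: full value = high * 1000 + low.
sub combine {
    my ($high, $low) = @_;
    return $high * 1000 + $low;
}

# Clock state just before and just after a tick that rolls the
# low-order part over to zero.
my @before = (5, 999);    # true time 5999
my @after  = (6, 0);      # true time 6000

# Torn read, high part first: old high part, new low part.
my $high_first = combine($before[0], $after[1]);    # 5000

# Torn read, low part first: old low part, new high part.
my $low_first = combine($after[0], $before[1]);     # 6999

print "true time before tick: ", combine(@before), "\n";
print "true time after tick:  ", combine(@after),  "\n";
print "high part read first:  $high_first (understated)\n";
print "low part read first:   $low_first (overstated)\n";
```

An understated end time or an overstated start time would shrink the
measured interval, which is the direction needed to produce a negative
result.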
Suggestions:
1) The "get_the_time_somehow()" routine needs to be examined
and changed to get an atomic time value. (I do not mean from
an atomic clock, just that the value be sampled as a whole at
some instant in time, not extracted in pieces.)
2) Benchmark could be changed to compute the time in a manner
similar to the following:
    $counter = $loop_count;
    $start_time1 = get_the_time_somehow();
    while ($counter-- > 0) {
        do_a_test();
    }
    $end_time1 = get_the_time_somehow();
    $counter = $loop_count;    # reset, or the second loop never runs
    $start_time2 = get_the_time_somehow();
    while ($counter-- > 0) {
        # measure loop overhead without the code to be tested
    }
    $end_time2 = get_the_time_somehow();
    $total_time = ($end_time1 - $start_time1) - ($end_time2 - $start_time2);
    print "total_time=$total_time\n";
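
A runnable version of suggestion 2 might look like the following. This
is a sketch, not how Benchmark is implemented: it assumes Perl's
built-in times function is an acceptable source of CPU time, and the
helper names cpu_seconds() and time_loop() and the loop count are my
own choices for illustration.

```perl
use strict;
use warnings;

# CPU seconds (user + system) used so far by this process.
sub cpu_seconds {
    my ($user, $system) = times;
    return $user + $system;
}

# Run $code $count times and return the CPU seconds consumed.
sub time_loop {
    my ($count, $code) = @_;
    my $start = cpu_seconds();
    $code->() for 1 .. $count;
    return cpu_seconds() - $start;
}

my $loop_count = 1_000_000;
my $i = 1;

# Time the loop with the code under test, then an identical loop
# around an empty sub to estimate the loop and call overhead.
my $with_test = time_loop($loop_count, sub { $i ||= 1 });
my $overhead  = time_loop($loop_count, sub { });
my $net       = $with_test - $overhead;

printf "with test: %.2f  overhead: %.2f  net: %.2f CPU secs\n",
    $with_test, $overhead, $net;
```

Note that the net figure is the difference of two noisy measurements,
so for code this trivial it can still come out near zero or slightly
negative on any given run.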
Results:
To get back to the original question about the speed of the various
alternatives, I threw out the first set of tests, because I am assuming
that some other process was running in the background and inflating
the time estimates dramatically. Next, to account for the apparent
negative bias of the timing errors, I took the larger of the two
timings for each test from the second and third runs. The results are
as follows:
$i = 1 (minimal assignments)
1) ||= do
2) post unless
3) ||=
3) pre unless
5) assign
6) ?:
The first four are in the range 4-6 seconds, probably equal within the
precision of the timing. The last two are 10 and 12 seconds. The first
four test, but do not assign. Given ideal optimization, they should all
do exactly the same thing in this test. The fifth always assigns, but
does not test. The last and slowest always both tests and assigns.
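
The selection rule used for this ranking (keep the larger of the two
timings per test, then sort) can be sketched as follows. The numbers
are the wallclock seconds from the complete second and third runs of
the "$i set to 1" tests shown in the output below; the variable names
and the tie-break by name are my own additions for illustration.

```perl
use strict;
use warnings;

# Wallclock seconds for each test: [second run, third run].
my %secs = (
    '?:'          => [12, 12],
    'assign'      => [ 8, 10],
    'post unless' => [ 5,  0],
    'pre unless'  => [ 6,  1],
    '||='         => [-1,  6],
    '||= do'      => [ 2,  4],
);

# Keep the larger of the two timings for each test.
my %larger;
for my $name (keys %secs) {
    my ($run2, $run3) = @{ $secs{$name} };
    $larger{$name} = $run2 > $run3 ? $run2 : $run3;
}

# Fastest first; ties are broken by name only to keep the order stable.
my @ranked = sort { $larger{$a} <=> $larger{$b} or $a cmp $b } keys %larger;

printf "%-11s %2d\n", $_, $larger{$_} for @ranked;
```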
$i = 0 (maximal assignments)
1) assign
2) post unless
3) ||= do
3) ||=
5) ?:
6) pre unless
The fastest is "assign" (which assigns but never tests) at 12 seconds.
All of the rest always both test and assign. "Post unless" is 14
seconds. The rest all took 16-18 seconds. It is difficult to prove a
difference among the last four or maybe five, given the wide variance
of timings for these tests.
Conclusions:
1) These timings are not inconsistent with the expected behaviour of
the code generated by these varying code constructs.
2) Do not believe the timings from Benchmark for trivial code examples
such as these without carefully analyzing the results from different
runs (each run under identical conditions) to try to account for the
negative bias introduced by timing errors. Benchmark should be more
useful for testing longer code fragments, where the time to run each
test is much longer than the timing errors.
Jim White
--------------------Program--------------------
#!/usr/local/bin/perl -w
use strict;
use Benchmark qw/cmpthese/;

my $i = 1;
print "test with \$i set to 1\n";
cmpthese(10_000_000, {
        'assign'      => sub { $i = 1 },
        '||='         => sub { $i ||= 1 },
        '||= do'      => sub { $i ||= do { 1 } },
        'post unless' => sub { $i = 1 unless $i },
        'pre unless'  => sub { unless ($i) { $i = 1 } },
        '?:'          => sub { $i = $i ? $i : 1 },
    }
);

print "\ntest with \$i set to 0\n";
$i = 0;
cmpthese(10_000_000, {
        'assign'      => sub { $i = 0 },
        '||='         => sub { $i ||= 0 },
        '||= do'      => sub { $i ||= do { 0 } },
        'post unless' => sub { $i = 0 unless $i },
        'pre unless'  => sub { unless ($i) { $i = 0 } },
        '?:'          => sub { $i = $i ? $i : 0 },
    }
);
--------------------Output1--------------------
test with $i set to 1
Benchmark: timing 10000000 iterations of ?:, assign, post unless, pre unless, ||=,
||= do...
?:: 30 wallclock secs (30.77 usr + 0.00 sys = 30.77 CPU) @ 324991.88/s
(n=10000000)
assign: 51 wallclock secs (51.14 usr + 0.00 sys = 51.14 CPU) @ 195541.65/s
(n=10000000)
post unless: -4 wallclock secs (-4.62 usr + 0.00 sys = -4.62 CPU) @ -2164502.16/s
(n=10000000)
(warning: too few iterations for a reliable count)
pre unless: 22 wallclock secs (22.76 usr + 0.00 sys = 22.76 CPU) @ 439367.31/s
(n=10000000)
||=: 7 wallclock secs ( 7.44 usr + 0.00 sys = 7.44 CPU) @ 1344086.02/s
(n=10000000)
||= do: 11 wallclock secs (10.52 usr + 0.00 sys = 10.52 CPU) @ 950570.34/s
(n=10000000)
                   Rate post unless assign    ?: pre unless ||= do   ||=
post unless -2.16e+06/s          -- -1207% -766%      -593%  -328% -261%
assign         195542/s       -109%     --  -40%       -55%   -79%  -85%
?:             324992/s       -115%    66%    --       -26%   -66%  -76%
pre unless     439367/s       -120%   125%   35%         --   -54%  -67%
||= do         950570/s       -144%   386%  192%       116%     --  -29%
||=           1344086/s       -162%   587%  314%       206%    41%    --
test with $i set to 0
Benchmark: timing 10000000 iterations of ?:, assign, post unless, pre unless, ||=,
||= do...
?:: 67 wallclock secs (67.43 usr + 0.00 sys = 67.43 CPU) @ 148301.94/s
(n=10000000)
assign: 51 wallclock secs (50.63 usr + 0.00 sys = 50.63 CPU) @ 197511.36/s
(n=10000000)
post unless: 26 wallclock secs (26.02 usr + 0.00 sys = 26.02 CPU) @ 384319.75/s
(n=10000000)
pre unless: 46 wallclock secs (44.48 usr + 0.00 sys = 44.48 CPU) @ 224820.14/s
(n=10000000)
||=: 48 wallclock secs (47.47 usr + 0.00 sys = 47.47 CPU) @ 210659.36/s
(n=10000000)
||= do: 68 wallclock secs (66.96 usr + 0.00 sys = 66.96 CPU) @ 149342.89/s
(n=10000000)
                Rate   ?: ||= do assign  ||= pre unless post unless
?:          148302/s   --    -1%   -25% -30%       -34%        -61%
||= do      149343/s   1%     --   -24% -29%       -34%        -61%
assign      197511/s  33%    32%     --  -6%       -12%        -49%
||=         210659/s  42%    41%     7%   --        -6%        -45%
pre unless  224820/s  52%    51%    14%   7%         --        -42%
post unless 384320/s 159%   157%    95%  82%        71%          --
--------------------Output2--------------------
test with $i set to 1
Benchmark: timing 10000000 iterations of ?:, assign, post unless, pre unless, ||=,
||= do...
?:: 20 wallclock secs (17.57 usr + 0.01 sys = 17.58 CPU) @ 568828.21/s
(n=10000000)
assign: 15 wallclock secs (14.64 usr + 0.00 sys = 14.64 CPU) @ 683060.11/s
(n=10000000)
dna2.chem.ou.edu% time !!
time OTPerl_benchmark.pl
test with $i set to 1
Benchmark: timing 10000000 iterations of ?:, assign, post unless, pre unless, ||=,
||= do...
?:: 12 wallclock secs (11.78 usr + 0.00 sys = 11.78 CPU) @ 848896.43/s
(n=10000000)
assign: 8 wallclock secs ( 8.78 usr + 0.00 sys = 8.78 CPU) @ 1138952.16/s
(n=10000000)
post unless: 5 wallclock secs ( 4.20 usr + 0.00 sys = 4.20 CPU) @ 2380952.38/s
(n=10000000)
pre unless: 6 wallclock secs ( 5.34 usr + 0.00 sys = 5.34 CPU) @ 1872659.18/s
(n=10000000)
||=: -1 wallclock secs (-1.45 usr + 0.00 sys = -1.45 CPU) @ -6896551.72/s
(n=10000000)
(warning: too few iterations for a reliable count)
||= do: 2 wallclock secs ( 2.37 usr + 0.00 sys = 2.37 CPU) @ 4219409.28/s
(n=10000000)
                   Rate   ||=    ?: assign pre unless post unless ||= do
||=         -6.90e+06/s    -- -912%  -706%      -468%       -390%  -263%
?:             848896/s -112%    --   -25%       -55%        -64%   -80%
assign        1138952/s -117%   34%     --       -39%        -52%   -73%
pre unless    1872659/s -127%  121%    64%         --        -21%   -56%
post unless   2380952/s -135%  180%   109%        27%          --   -44%
||= do        4219409/s -161%  397%   270%       125%         77%     --
test with $i set to 0
Benchmark: timing 10000000 iterations of ?:, assign, post unless, pre unless, ||=,
||= do...
?:: 11 wallclock secs (10.98 usr + 0.00 sys = 10.98 CPU) @ 910746.81/s
(n=10000000)
assign: 9 wallclock secs ( 8.33 usr + 0.00 sys = 8.33 CPU) @ 1200480.19/s
(n=10000000)
post unless: 14 wallclock secs (13.37 usr + 0.00 sys = 13.37 CPU) @ 747943.16/s
(n=10000000)
pre unless: 18 wallclock secs (18.28 usr + 0.00 sys = 18.28 CPU) @ 547045.95/s
(n=10000000)
||=: 10 wallclock secs (10.24 usr + 0.00 sys = 10.24 CPU) @ 976562.50/s
(n=10000000)
||= do: 9 wallclock secs ( 9.86 usr + 0.00 sys = 9.86 CPU) @ 1014198.78/s
(n=10000000)
                 Rate pre unless post unless   ?:  ||= ||= do assign
pre unless    547046/s         --        -27% -40% -44%   -46%   -54%
post unless   747943/s        37%          -- -18% -23%   -26%   -38%
?:            910747/s        66%         22%   --  -7%   -10%   -24%
||=           976562/s        79%         31%   7%   --    -4%   -19%
||= do       1014199/s        85%         36%  11%   4%     --   -16%
assign       1200480/s       119%         61%  32%  23%    18%     --
--------------------Output3--------------------
test with $i set to 1
Benchmark: timing 10000000 iterations of ?:, assign, post unless, pre unless, ||=,
||= do...
?:: 12 wallclock secs (11.84 usr + 0.00 sys = 11.84 CPU) @ 844594.59/s
(n=10000000)
assign: 10 wallclock secs (10.47 usr + 0.00 sys = 10.47 CPU) @ 955109.84/s
(n=10000000)
post unless: 0 wallclock secs (-0.12 usr + 0.00 sys = -0.12 CPU) @ -83333333.33/s
(n=10000000)
(warning: too few iterations for a reliable count)
pre unless: 1 wallclock secs ( 1.65 usr + 0.00 sys = 1.65 CPU) @ 6060606.06/s
(n=10000000)
||=: 6 wallclock secs ( 5.09 usr + 0.00 sys = 5.09 CPU) @ 1964636.54/s
(n=10000000)
||= do: 4 wallclock secs ( 3.95 usr + 0.00 sys = 3.95 CPU) @ 2531645.57/s
(n=10000000)
                   Rate post unless     ?: assign    ||= ||= do pre unless
post unless -8.33e+07/s          -- -9967% -8825% -4342% -3392%     -1475%
?:             844595/s       -101%     --   -12%   -57%   -67%       -86%
assign         955110/s       -101%    13%     --   -51%   -62%       -84%
||=           1964637/s       -102%   133%   106%     --   -22%       -68%
||= do        2531646/s       -103%   200%   165%    29%     --       -58%
pre unless    6060606/s       -107%   618%   535%   208%   139%         --
test with $i set to 0
Benchmark: timing 10000000 iterations of ?:, assign, post unless, pre unless, ||=,
||= do...
?:: 17 wallclock secs (15.22 usr + 0.00 sys = 15.22 CPU) @ 657030.22/s
(n=10000000)
assign: 12 wallclock secs (11.28 usr + 0.00 sys = 11.28 CPU) @ 886524.82/s
(n=10000000)
post unless: 10 wallclock secs (10.62 usr + 0.00 sys = 10.62 CPU) @ 941619.59/s
(n=10000000)
pre unless: 13 wallclock secs (12.83 usr + 0.00 sys = 12.83 CPU) @ 779423.23/s
(n=10000000)
||=: 16 wallclock secs (15.34 usr + 0.00 sys = 15.34 CPU) @ 651890.48/s
(n=10000000)
||= do: 16 wallclock secs (16.21 usr + 0.00 sys = 16.21 CPU) @ 616903.15/s
(n=10000000)
                Rate ||= do ||=  ?: pre unless assign post unless
||= do      616903/s     -- -5% -6%       -21%   -30%        -34%
||=         651890/s     6%  -- -1%       -16%   -26%        -31%
?:          657030/s     7%  1%  --       -16%   -26%        -30%
pre unless  779423/s    26% 20% 19%         --   -12%        -17%
assign      886525/s    44% 36% 35%        14%     --         -6%
post unless 941620/s    53% 44% 43%        21%     6%          --
--------------------Perl -V Output--------------------
% perl -V
Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
Platform:
osname=solaris, osvers=2.8, archname=sun4-solaris
uname='sunos dna2.chem.ou.edu 5.8 generic_108528-08 sun4u sparc sunw,ultra-5_10
'
config_args='-Dcc=gcc'
hint=previous, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
Compiler:
cc='gcc', ccflags ='-fno-strict-aliasing -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O',
cppflags='-fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64'
ccversion='', gccversion='2.95.2 19991024 (release)', gccosandvers='solaris2.7'
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, usemymalloc=n, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/usr/local/lib '
libpth=/usr/local/lib /usr/lib /usr/ccs/lib
libs=-lsocket -lnsl -ldl -lm -lc
perllibs=-lsocket -lnsl -ldl -lm -lc
libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'
Characteristics of this binary (from libperl):
Compile-time options: USE_LARGE_FILES
Built under solaris
Compiled at Jul 26 2001 10:53:59
@INC:
/usr/local/lib/perl5/5.6.1/sun4-solaris
/usr/local/lib/perl5/5.6.1
/usr/local/lib/perl5/site_perl/5.6.1/sun4-solaris
/usr/local/lib/perl5/site_perl/5.6.1
/usr/local/lib/perl5/site_perl
.
--------------------End of Output--------------------
"Peter J . Holzer" wrote:
> On 2001-09-13 17:01:20 -0400, Rob Ransbottom wrote:
> > On Thu, 13 Sep 2001 [EMAIL PROTECTED] wrote:
> >
> > > I have a minor optimization suggestion. Instead of this:
> >
> > > unless ($sth_routine_name) {
> > > #setup statement handle
> > > }
> >
> > > Do this:
> > >
> > > $sth_routine_name ||= $dbh->prepare(....);
> >
> > > pretty sure this is more efficient than setting up the unless block.
>
> First, I should say that I think that the speed differences between
> these methods are IMHO negligible in almost all situations and that one
> should use the most readable version (I like ||=, btw. It looks neat).
>
> > True, these are about the same, but in order of speed:
> >
> > $i ||= 1;
> > $i = 1 unless $i;
> > $i = $i ? 1 : 0;
> >
> > All faster than:
> >
> > unless ( $i) { $i = 0;}
>
> Just for fun I tried that with perl 5.6.0 under Linux on a Pentium
> II/233, and unless ( $i) { $i = 1; } came out fastest (yes, I changed 0
> to 1 here. Otherwise it would not be equivalent to $i ||= 1):
>
>                  Rate   ?: post unless  ||= ||= do pre unless
> ?:          1440922/s   --        -53% -57%   -76%       -80%
> post unless 3086420/s 114%          --  -7%   -49%       -57%
> ||=         3333333/s 131%          8%   --   -45%       -53%
> ||= do      6024096/s 318%         95%  81%     --       -16%
> pre unless  7142857/s 396%        131% 114%    19%         --
>
> here is the script:
>
> #!/usr/local/bin/perl -w
> use strict;
> use Benchmark qw/cmpthese/;
>
> my $i = 1;
> cmpthese(5000_000, {
> '||=' => sub { $i ||= 1 },
> '||= do' => sub { $i ||= do { 1 } },
> 'post unless' => sub { $i = 1 unless $i },
> 'pre unless' => sub { unless ($i) { $i = 1 } },
> '?:' => sub { $i = $i ? $i : 1 },
> }
> );
>
> I don't trust the Benchmark module, though. It reported one negative
> "wallclock secs" value at every run, and the wallclock and usr times
> differ too much for a test which should essentially take 100% user time.
>
> hp
>
> --
> _ | Peter J. Holzer | My definition of a stupid question is
> |_|_) | Sysadmin WSR / LUGA | "a question that if you're embarassed to
> | | | [EMAIL PROTECTED] | ask it, you stay stupid."
> __/ | http://www.hjp.at/ | -- Tim Helck on dbi-users, 2001-07-30
>
--
James D. White ([EMAIL PROTECTED])
Department of Chemistry and Biochemistry
University of Oklahoma
620 Parrington Oval, Room 313
Norman, OK 73019-3051
Phone: (405) 325-4912, FAX: (405) 325-7762