Re: [performance/benchmark] printing techniques
Stas Bekman <[EMAIL PROTECTED]> writes:

> And the results are:
>
> single_print:  1 wallclock secs ( 1.74 usr +  0.05 sys =  1.79 CPU)
> here_print:    3 wallclock secs ( 1.79 usr +  0.07 sys =  1.86 CPU)
> list_print:    7 wallclock secs ( 6.57 usr +  0.01 sys =  6.58 CPU)
> multi_print:  10 wallclock secs (10.72 usr +  0.03 sys = 10.75 CPU)
>
> Numbers tell it all, 'single_print' is the fastest, 'here_print' is
> almost of the same speed,

'single_print' and 'here_print' compile down to exactly the same code, so there should not be any real difference between them.

-- Gisle Aas
Re: [OT] Re: [performance/benchmark] printing techniques
> It's not slower in 5.6. "$x and $y" in 5.6 gets turned into $x . ' and '
> . $y (in perl bytecode terms).

That's not new to 5.6.0; variable interpolation in ""'s has always turned into a concat tree, though 5.005_03 is the oldest version I have handy to check with. And this "$feature" can be quite expensive.
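As a small illustration of the point above, this sketch checks that an interpolated string and the explicit concatenation it is described as turning into give the same result. The variable values are made up for the example, and it only demonstrates equivalence of the result, not how a particular perl compiles either form.

```perl
use strict;
use warnings;

# Illustrative values only; they do not come from the thread.
my ($x, $y) = ('foo', 'bar');

# Interpolated form, as it would normally be written.
my $interpolated = "$x and $y";

# Explicit concatenation, the form the interpolation is said to become.
my $concatenated = $x . ' and ' . $y;

print $interpolated eq $concatenated ? "same\n" : "different\n";
```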
Re: [performance/benchmark] printing techniques
On Thu, 8 Jun 2000, Stas Bekman wrote:

> Stephen Zander wrote:
> >
> > > "Stas" == Stas Bekman <[EMAIL PROTECTED]> writes:
> >     Stas> Ouch :( Someone to explain this phenomena? and it's just
> >     Stas> fine under the handler puzzled, what can I say...
> >
> > Continuous array growth and copying?
>
> Is this a question or a suggestion? but in both cases (mod_perl and perl
> benchmark) the process doesn't exit, so the allocated datastructure is
> reused... anyway it should be the same. But it's not.

Only the @array length remains allocated; its elements are freed each time, so you're copying the "strings" into a new SV every time. push @array, \"string" should make a big difference (remember, $r->print dereferences refs to strings).
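A minimal sketch of the suggestion above, runnable outside mod_perl. Plain print does not dereference scalar refs, so print_derefed here is a hypothetical stand-in for what $r->print is described as doing; the strings and the in-memory filehandle are also assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $r->print: dereference any scalar refs in
# the argument list before printing. Not part of mod_perl itself.
sub print_derefed {
    my $fh = shift;
    print {$fh} map { ref($_) eq 'SCALAR' ? $$_ : $_ } @_;
}

sub aggrlist_ref_print {
    my $fh = shift;
    my @buffer;
    # Push references to the constant strings rather than copies of them.
    push @buffer, \"line one\n";
    push @buffer, \"line two\n";
    print_derefed($fh, @buffer);
}

# Capture the output in a scalar so the result can be inspected.
open my $fh, '>', \my $out or die "cannot open in-memory handle: $!";
aggrlist_ref_print($fh);
close $fh;

print $out;
```

Whether this actually beats pushing plain strings would need to be measured with the same benchmark as the other variants.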
Re: [OT] Re: [performance/benchmark] printing techniques
On Thu, 8 Jun 2000, Perrin Harkins wrote: > On Thu, 8 Jun 2000, Matt Sergeant wrote: > > > > The one that bugs me is when I see people doing this: > > > > > > $hash{"$key"} > > > > > > instead of this: > > > > > > $hash{$key} > > > > Those two now also result in the same code. ;-) > > > > But the former is just ugly. > > Sometimes it's worse than just ugly. See the entry in the Perl FAQ: > >http://www.perl.com/pub/doc/manual/html/pod/perlfaq4.html#What_s_wrong_with_always_quoting > > Not likely that anyone would be using something as a hash key that would > suffer from being stringified, but possible. It's definitely a bit slower > as well, but that's below the noise level. It's not slower in 5.6. "$x and $y" in 5.6 gets turned into $x . ' and ' . $y (in perl bytecode terms). -- Fastnet Software Ltd. High Performance Web Specialists Providing mod_perl, XML, Sybase and Oracle solutions Email for training and consultancy availability. http://sergeant.org http://xml.sergeant.org
Re: [OT] Re: [performance/benchmark] printing techniques
> Sometimes it's worse than just ugly. See the entry in the Perl FAQ:
> http://www.perl.com/pub/doc/manual/html/pod/perlfaq4.html#What_s_wrong_with_always_quoting
>
> Not likely that anyone would be using something as a hash key that would
> suffer from being stringified, but possible. It's definitely a bit slower
> as well, but that's below the noise level.

Actually, when you use a reference as a hash key, it is automatically stringified anyway.

http://www.perl.com/pub/doc/manual/html/pod/perlfaq4.html#How_can_I_use_a_reference_as_a_h

So that means that:

    $hash{"$key"}

and

    $hash{$key}

differ only in the relative merits of their beauty. :)

Mike Lambert
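The stringification described above can be checked with a short sketch. The array reference and hash are made up for the example.

```perl
use strict;
use warnings;

my $key = [1, 2, 3];          # an array reference used as a hash key
my %hash;

$hash{$key}   = 'plain';      # key is stringified implicitly
$hash{"$key"} = 'quoted';     # key is stringified explicitly

my @keys = keys %hash;

# Both stores land in the same slot, and the stored key is a plain
# string that can no longer be dereferenced.
printf "entries: %d, key is a reference: %s\n",
    scalar(@keys), ref($keys[0]) ? 'yes' : 'no';
```

This is specific to hash keys. The FAQ entry quoted earlier in the thread is about other places, such as function arguments, where quoting a variable changes what gets passed.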
Re: [OT] Re: [performance/benchmark] printing techniques
On Thu, 8 Jun 2000, Matt Sergeant wrote: > > The one that bugs me is when I see people doing this: > > > > $hash{"$key"} > > > > instead of this: > > > > $hash{$key} > > Those two now also result in the same code. ;-) > > But the former is just ugly. Sometimes it's worse than just ugly. See the entry in the Perl FAQ: http://www.perl.com/pub/doc/manual/html/pod/perlfaq4.html#What_s_wrong_with_always_quoting Not likely that anyone would be using something as a hash key that would suffer from being stringified, but possible. It's definitely a bit slower as well, but that's below the noise level. - Perrin
Re: [performance/benchmark] printing techniques
[Sorry for the delay: didn't notice this since it was sent only to the list] Eric Cholet wrote, in part: > > I never advocated optimizing at the expense of the above criteria, we > were discussing optimizations only. I certainly believe a program is a > compromise, and have often chosen some of those criteria as being > more important than performance savings. Sorry: I took your statement at face value. I'm well aware that you're not that shallow :-). - Barrie
Re: [performance/benchmark] printing techniques
On 8 Jun 2000, Stephen Zander wrote: > As Matt has already commented, in the handler the method call > overheads swamps all the other activities. so concat_print & > aggrlist_print (yes, method invocation in perl really is that bad). > When you remove that overhead the extra OPs in aggrlist_print become > the dominating factor. Perhaps it would be worth testing the horribly ugly: Apache::print($r, ); Rather than plain print(). -- Fastnet Software Ltd. High Performance Web Specialists Providing mod_perl, XML, Sybase and Oracle solutions Email for training and consultancy availability. http://sergeant.org http://xml.sergeant.org
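One way to get a feel for the dispatch cost outside mod_perl is to compare a method call with the equivalent fully qualified function call. This is only a sketch: My::Out and emit are hypothetical stand-ins for the request object and Apache::print, and the iteration count is arbitrary.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; under mod_perl the object would be $r
# and the function would be Apache::print.
package My::Out;
sub new  { my $class = shift; return bless { buf => '' }, $class }
sub emit { my $self = shift; $self->{buf} .= join '', @_; return 1 }

package main;
use Benchmark qw(cmpthese);

my $out = My::Out->new;

# Same work in both cases; only the way the sub is reached differs.
cmpthese(500_000, {
    method_call   => sub { $out->emit("foo\n") },
    function_call => sub { My::Out::emit($out, "foo\n") },
});
```

The numbers will vary by perl version and machine, so this shows how to measure rather than what the result is.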
Re: [performance/benchmark] printing techniques
> "Stas" == Stas Bekman <[EMAIL PROTECTED]> writes:

    Stas> Is this a question or a suggestion? but in both cases
    Stas> (mod_perl and perl benchmark) the process doesn't exit, so
    Stas> the allocated datastructure is reused... anyway it should be
    Stas> the same. But it's not.

It was a suggestion.

Examining the optrees produced by aggrlist_print and the following two routines which should be equivalent to concat_print and multi_print from your original posting

    sub concat_print{
        my $buffer;
        $buffer .= "\n";
        $buffer .= "\n";
        $buffer .= " \n";
        $buffer .= "\n";
        print $buffer;
    }

    sub aggrlist_print{
        my @buffer = ();
        push @buffer,"\n";
        push @buffer,"\n";
        push @buffer," \n";
        push @buffer,"\n";
        print @buffer;
    }

    sub multi_print{
        print "\n";
        print "\n";
        print " \n";
        print "\n";
    }

shows that aggrlist_print performs 25% more OPs than concat_print and 43% more OPs than multi_print.

    Stas> handler:
    Stas> concat_print    |    111       5000       0  876
    Stas> aggrlist_print  |    113       5000       0  862
    Stas> multi_print     |    118       5000       0  820

    Stas> buffered benchmark:
    Stas> concat_print:    8 wallclock secs ( 8.23 usr +  0.05 sys =  8.28 CPU)
    Stas> multi_print:    10 wallclock secs (10.70 usr +  0.01 sys = 10.71 CPU)
    Stas> aggrlist_print: 30 wallclock secs (31.06 usr +  0.04 sys = 31.10 CPU)

    Stas> Watch the aggrlist_print gives such a bad perl benchmark,
    Stas> but very good handler benchmark...

As Matt has already commented, in the handler the method call overhead swamps all the other activities, so concat_print and aggrlist_print come out about even (yes, method invocation in perl really is that bad). When you remove that overhead, the extra OPs in aggrlist_print become the dominating factor.

-- Stephen

"So if she weighs the same as a duck, she's made of wood."... "And therefore?"... "A witch!"
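For anyone wanting to repeat this kind of comparison, here is a rough sketch that counts the nodes in a sub's optree using the core B module. It is not necessarily how the figures above were obtained. The helper names count_op and op_count are made up, the routines are shortened, and the count includes ops that perl has optimized to null, so the totals need not match the percentages quoted.

```perl
use strict;
use warnings;
use B qw(svref_2object walkoptree);

my $ops = 0;

# walkoptree calls the named method on every op it visits, so the
# counter is installed as a method on the B::OP base class.
sub B::OP::count_op { $ops++ }

# Return the number of op nodes in the optree of a code reference.
sub op_count {
    my ($code) = @_;
    $ops = 0;
    walkoptree(svref_2object($code)->ROOT, 'count_op');
    return $ops;
}

# Shortened versions of the three routines being compared.
sub concat_print   { my $buffer; $buffer .= "a\n"; $buffer .= "b\n"; print $buffer }
sub aggrlist_print { my @buffer = (); push @buffer, "a\n"; push @buffer, "b\n"; print @buffer }
sub multi_print    { print "a\n"; print "b\n" }

printf "%-15s %d ops\n", $_->[0], op_count($_->[1])
    for [ concat_print   => \&concat_print ],
        [ aggrlist_print => \&aggrlist_print ],
        [ multi_print    => \&multi_print ];
```

A tree walk counts the ops that were compiled, not the ops executed at run time, so it is only a proxy for the work done per call.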
Re: [performance/benchmark] printing techniques
Stephen Zander wrote:
>
> > "Stas" == Stas Bekman <[EMAIL PROTECTED]> writes:
>     Stas> Ouch :( Someone to explain this phenomena? and it's just
>     Stas> fine under the handler puzzled, what can I say...
>
> Continuous array growth and copying?

Is this a question or a suggestion? But in both cases (mod_perl and perl benchmark) the process doesn't exit, so the allocated data structure is reused... anyway it should be the same. But it's not.

Just to recall the context (please quote the relevant parts, or it's impossible to understand what you are talking about. Thanks!):

handler:

single_print    |    108       5000       0  890
here_print      |    110       5000       0  887
concat_print    |    111       5000       0  876
aggrlist_print  |    113       5000       0  862
list_print      |    113       5000       0  861
multi_print     |    118       5000       0  820

unbuffered benchmark:

single_print:    2 wallclock secs ( 2.29 usr +  0.46 sys =  2.75 CPU)
here_print:      2 wallclock secs ( 2.42 usr +  0.50 sys =  2.92 CPU)
list_print:      7 wallclock secs ( 7.26 usr +  0.53 sys =  7.79 CPU)
concat_print:    9 wallclock secs ( 8.90 usr +  0.60 sys =  9.50 CPU)
aggrlist_print: 32 wallclock secs (32.37 usr +  0.71 sys = 33.08 CPU)
multi_print:    21 wallclock secs (16.47 usr +  5.84 sys = 22.31 CPU)

buffered benchmark:

single_print:    3 wallclock secs ( 1.69 usr +  0.02 sys =  1.71 CPU)
here_print:      3 wallclock secs ( 1.76 usr +  0.01 sys =  1.77 CPU)
list_print:      7 wallclock secs ( 6.41 usr +  0.03 sys =  6.44 CPU)
concat_print:    8 wallclock secs ( 8.23 usr +  0.05 sys =  8.28 CPU)
multi_print:    10 wallclock secs (10.70 usr +  0.01 sys = 10.71 CPU)
aggrlist_print: 30 wallclock secs (31.06 usr +  0.04 sys = 31.10 CPU)

Watch the aggrlist_print gives such a bad perl benchmark, but very good handler benchmark...

    sub aggrlist_print{
        my @buffer = ();
        push @buffer,"\n";
        push @buffer,"\n";
        push @buffer," \n";
        [snip]
        push @buffer,"\n";
        print @buffer;
    }

Stas Bekman              JAm_pH    --   Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide  http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] http://perl.org http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org
Re: [performance/benchmark] printing techniques
> "Stas" == Stas Bekman <[EMAIL PROTECTED]> writes: Stas> Ouch :( Someone to explain this phenomena? and it's just Stas> fine under the handler puzzled, what can I say... Continuous array growth and copying? -- Stephen "So if she weighs the same as a duck, she's made of wood."... "And therefore?"... "A witch!"
Re: [performance/benchmark] printing techniques
From: "Matt Sergeant" <[EMAIL PROTECTED]>
To: "Stas Bekman" <[EMAIL PROTECTED]>
Cc: "___cliff rayman___" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: 08 June 2000 09:23
Subject: Re: [performance/benchmark] printing techniques

: On Wed, 7 Jun 2000, Stas Bekman wrote:
:
: > On Wed, 7 Jun 2000, ___cliff rayman___ wrote:
: >
: > > Stas Bekman wrote:
: > >
: > > > Per your request:
: > > >
: > > > The handler:
: > > >
: > > > query         | avtime  completed  failed  rps
: > > > ---
: > > > single_print  |    110       5000       0  881
: > > > here_print    |    111       5000       0  881
: > > > list_print    |    111       5000       0  880
: > > > concat_print  |    111       5000       0  873
: > > > multi_print   |    119       5000       0  820
: > > > ---
: > >
: > > not very much difference once stuck in a handler.
: > > obviously multi_print is both ugly and slow, but the rest should be used by the
: > > discretion of the programmer based on the one that is easiest to maintain in
: > > the code.
: >
: > absolutely. I'd also love to know why is it different under the handler.
: > (talking about relative performance!)
:
: Because as I said - the method dispatch and the overhead of the mod_perl
: handler takes over. multi-print is the only one that has to call methods
: several times. The rest are almost equal.
:
: This also demonstrates some of the value in template systems that send all
: their output at once, however often these template systems use method
: calls too, so it all gets messed up.

This may be veering off topic, but it's been on my mind for a while now.

Apart from thanking Stas for his benchmark work, which I find very interesting (does he sleep ;-), this and a few other benchmarks have all touched on the area of including mod_perl output within HTML.

I have always wondered what everyone else is doing on this front. I usually suck a template into memory (one long line), usually at startup. I then create all the content by either pushing onto an array or using .= string concatenation. Finally I regex the template, looking for my tags and replacing those with output. Needless to say, one page can consist of many templates (page or inside of table (bits from ) etc ...).

From Stas's previous benchmarks I've preloaded the mysql driver and now usually use "push" onto an array to prepare content. Thanks Stas.

How does everyone else do it? Can this type of operation (that everyone must do at some time) be optimised as aggressively as some of the others, yet still keep the abstraction between design and content?

Greg Cope
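The approach described above might look roughly like the following sketch. The <!--name--> tag syntax, the slot names and the template text are all assumptions made for illustration; the message does not say what the real tags look like.

```perl
use strict;
use warnings;

# Hypothetical tag syntax: <!--name--> marks a slot in the template.
# In practice the template would be read from a file once at startup.
my $template = "<html><body><h1><!--title--></h1><!--content--></body></html>";

# Build the content by pushing pieces onto an array, then join once.
my @content;
push @content, "<p>first</p>";
push @content, "<p>second</p>";

my %slots = (
    title   => 'Test page',
    content => join('', @content),
);

# Replace every known tag in a single pass over the template;
# tags with no matching slot are left untouched.
(my $page = $template) =~
    s{(<!--(\w+)-->)}{ exists $slots{$2} ? $slots{$2} : $1 }ge;

print $page, "\n";
```

Doing the substitution in one pass with a hash lookup avoids rescanning the template once per tag, which matters more as the number of tags grows.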
Re: [OT] Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, Perrin Harkins wrote: > On Wed, 7 Jun 2000, Matt Sergeant wrote: > > > On Wed, 7 Jun 2000, Eric Cholet wrote: > > > > > This said, i hurry back to s/"constant strings"/'constant strings'/g; > > > > Those two are equal. > > Yes, although it's counter-intutive there's no real performance hit > from double-quoting constant strings. > > The one that bugs me is when I see people doing this: > > $hash{"$key"} > > instead of this: > > $hash{$key} Those two now also result in the same code. ;-) But the former is just ugly. -- Fastnet Software Ltd. High Performance Web Specialists Providing mod_perl, XML, Sybase and Oracle solutions Email for training and consultancy availability. http://sergeant.org http://xml.sergeant.org
Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, Stas Bekman wrote: > On Wed, 7 Jun 2000, ___cliff rayman___ wrote: > > > > > > > Stas Bekman wrote: > > > > > > > > > > > Per your request: > > > > > > The handler: > > > > > > query | avtime completed failedrps > > > --- > > > single_print |110 5000 0881 > > > here_print|111 5000 0881 > > > list_print|111 5000 0880 > > > concat_print |111 5000 0873 > > > multi_print |119 5000 0820 > > > --- > > > > not very much difference once stuck in a handler. > > obviously multi_print is both ugly and slow, but the rest should be used by the > > discretion of the programmer based on the one that is easiest to maintain in > > the code. > > absolutely. I'd also love to know why is it different under the handler. > (talking about relative performance!) Because as I said - the method dispatch and the overhead of the mod_perl handler takes over. multi-print is the only one that has to call methods several times. The rest are almost equal. This also demonstrates some of the value in template systems that send all their output at once, however often these template systems use method calls too, so it all gets messed up. -- Fastnet Software Ltd. High Performance Web Specialists Providing mod_perl, XML, Sybase and Oracle solutions Email for training and consultancy availability. http://sergeant.org http://xml.sergeant.org
Re: [performance/benchmark] printing techniques
> What the other programmer here and I do is setup an array and push()
> our lines of output onto it throughout all our code, and print it at
> the very end. I'd be interested in seeing benchmarks of this vs.
> the other methods. I'll try to find the time to run them.

handler:

query           | avtime  completed  failed  rps
------------------------------------------------
single_print    |    108       5000       0  890
here_print      |    110       5000       0  887
concat_print    |    111       5000       0  876
aggrlist_print  |    113       5000       0  862
list_print      |    113       5000       0  861
multi_print     |    118       5000       0  820
------------------------------------------------

unbuffered benchmark

single_print:    2 wallclock secs ( 2.29 usr +  0.46 sys =  2.75 CPU)
here_print:      2 wallclock secs ( 2.42 usr +  0.50 sys =  2.92 CPU)
list_print:      7 wallclock secs ( 7.26 usr +  0.53 sys =  7.79 CPU)
concat_print:    9 wallclock secs ( 8.90 usr +  0.60 sys =  9.50 CPU)
aggrlist_print: 32 wallclock secs (32.37 usr +  0.71 sys = 33.08 CPU)
multi_print:    21 wallclock secs (16.47 usr +  5.84 sys = 22.31 CPU)

buffered benchmark

single_print:    3 wallclock secs ( 1.69 usr +  0.02 sys =  1.71 CPU)
here_print:      3 wallclock secs ( 1.76 usr +  0.01 sys =  1.77 CPU)
list_print:      7 wallclock secs ( 6.41 usr +  0.03 sys =  6.44 CPU)
concat_print:    8 wallclock secs ( 8.23 usr +  0.05 sys =  8.28 CPU)
multi_print:    10 wallclock secs (10.70 usr +  0.01 sys = 10.71 CPU)
aggrlist_print: 30 wallclock secs (31.06 usr +  0.04 sys = 31.10 CPU)

Ouch :( Someone to explain this phenomenon? And it's just fine under the handler. Puzzled, what can I say...

Here is the code delta...

    sub aggrlist_print{
        my @buffer = ();
        push @buffer,"\n";
        push @buffer,"\n";
        push @buffer," \n";
        push @buffer,"\n";
        push @buffer," Test page\n";
        push @buffer,"\n";
        push @buffer," \n";
        push @buffer," \n";
        push @buffer," \n";
        push @buffer," Test page \n";
        push @buffer,"\n";
        push @buffer,"foo\n";
        push @buffer,"\n";
        push @buffer," \n";
        push @buffer,"\n";
        print @buffer;
    }

Stas Bekman
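For readers who want to reproduce figures of this kind, a harness along the following lines should work with the core Benchmark module. This is a sketch under assumptions, not the script used for the numbers above: the routine bodies are shortened, output goes to the null device, and the iteration count is arbitrary.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use File::Spec;

# Send output to the null device so terminal speed does not skew timings.
open my $fh, '>', File::Spec->devnull
    or die "cannot open null device: $!";

# Shortened versions of the printing techniques being compared.
my %tests = (
    multi_print    => sub { print {$fh} "a\n"; print {$fh} "b\n"; print {$fh} "c\n" },
    list_print     => sub { print {$fh} "a\n", "b\n", "c\n" },
    single_print   => sub { print {$fh} "a\nb\nc\n" },
    concat_print   => sub {
        my $buffer = '';
        $buffer .= "a\n"; $buffer .= "b\n"; $buffer .= "c\n";
        print {$fh} $buffer;
    },
    aggrlist_print => sub {
        my @buffer = ();
        push @buffer, "a\n"; push @buffer, "b\n"; push @buffer, "c\n";
        print {$fh} @buffer;
    },
);

for my $unbuffered (1, 0) {
    # Setting $| on the selected handle turns autoflush on or off.
    my $old = select($fh); $| = $unbuffered; select($old);
    print $unbuffered ? "unbuffered benchmark\n" : "buffered benchmark\n";
    timethese(100_000, \%tests);
}
```

Running each set twice, with autoflush on and off, mirrors the unbuffered and buffered runs reported in the thread.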
RE: [performance/benchmark] printing techniques
What about heredoc with the magical @{} technique for interpolating functions? Or Text::iPerl? I'd be interested in knowing how they stack up.

Jerrad Pierce
http://pthbb.org
[OT] Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, Matt Sergeant wrote:

> On Wed, 7 Jun 2000, Eric Cholet wrote:
>
> > This said, i hurry back to s/"constant strings"/'constant strings'/g;
>
> Those two are equal.

Yes, although it's counter-intuitive, there's no real performance hit from double-quoting constant strings.

The one that bugs me is when I see people doing this:

    $hash{"$key"}

instead of this:

    $hash{$key}

That one is actually in the perlfaq man page, but I still see it all the time. The performance difference is very small but it does exist, and you can get unintended results from stringifying some things.

- Perrin
Re: [performance/benchmark] printing techniques
.--[ Jeff Norman wrote (2000/06/07 at 14:27:29) ]--
|
| Frequently, it's hard to build up an entire output segment without
| code in-between the different additions to the output. I guess you could
| call this the "append, append, append... output" technique.
|
| I think it would be an interesting addition to the benchmark:
|
|   sub gather_print{
|     my $buffer = '';
|     $buffer .= "";
|     $buffer .= "";
|     $buffer .= " ";
|     $buffer .= "";
|     $buffer .= " Test page";
|     $buffer .= "";
|     $buffer .= " ";
|     $buffer .= " ";
|     $buffer .= " ";
|     $buffer .= " Test page ";
|     $buffer .= "";
|     $buffer .= "foo";
|     $buffer .= "";
|     $buffer .= " ";
|     $buffer .= "";
|     print $fh $buffer;
|   }
|
`-

What the other programmer here and I do is set up an array and push() our lines of output onto it throughout all our code, and print it at the very end. I'd be interested in seeing benchmarks of this vs. the other methods. I'll try to find the time to run them.

---
Frank Wiles <[EMAIL PROTECTED]>
http://frank.wiles.org
---
Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, ___cliff rayman___ wrote: > > > Stas Bekman wrote: > > > > > > > Per your request: > > > > The handler: > > > > query | avtime completed failedrps > > --- > > single_print |110 5000 0881 > > here_print|111 5000 0881 > > list_print|111 5000 0880 > > concat_print |111 5000 0873 > > multi_print |119 5000 0820 > > --- > > not very much difference once stuck in a handler. > obviously multi_print is both ugly and slow, but the rest should be used by the > discretion of the programmer based on the one that is easiest to maintain in > the code. absolutely. I'd also love to know why is it different under the handler. (talking about relative performance!) > > The benchmark unbuffered: > > single_print: 2 wallclock secs ( 2.44 usr + 0.31 sys = 2.75 CPU) > > here_print:4 wallclock secs ( 2.34 usr + 0.54 sys = 2.88 CPU) > > list_print:8 wallclock secs ( 7.06 usr + 0.43 sys = 7.49 CPU) > > concat_print: 9 wallclock secs ( 8.95 usr + 0.66 sys = 9.61 CPU) > > multi_print: 22 wallclock secs (16.94 usr + 5.74 sys = 22.68 CPU) > > > > The benchmark unbuffered: > > should this say "The benchmark buffered"?? oops, buffered of course (copy-n-paste typo) > > single_print: 1 wallclock secs ( 1.70 usr + 0.02 sys = 1.72 CPU) > > here_print:1 wallclock secs ( 1.78 usr + 0.01 sys = 1.79 CPU) > > list_print:7 wallclock secs ( 6.44 usr + 0.05 sys = 6.49 CPU) > > concat_print: 9 wallclock secs ( 8.04 usr + 0.06 sys = 8.10 CPU) > > multi_print: 10 wallclock secs (10.56 usr + 0.09 sys = 10.65 CPU) > > > > The interesting thing is that list_print and concat_print are quite bad in > > the benchmark but very good in the handler. The rest holds. > > -- > ___cliff [EMAIL PROTECTED] > > > _ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide http://perl.apache.org/guide mailto:[EMAIL PROTECTED] http://perl.org http://stason.org/TULARC http://singlesheaven.com http://perlmonth.com http://sourcegarden.org
Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, Stas Bekman wrote: > And the results are: > > single_print: 1 wallclock secs ( 1.74 usr + 0.05 sys = 1.79 CPU) > here_print:3 wallclock secs ( 1.79 usr + 0.07 sys = 1.86 CPU) > list_print:7 wallclock secs ( 6.57 usr + 0.01 sys = 6.58 CPU) > multi_print: 10 wallclock secs (10.72 usr + 0.03 sys = 10.75 CPU) Never mind the performance, that multi_print and list_print are just U-G-L-Y! Here doc rules. Great for SQL too. - Perrin
Re: [performance/benchmark] printing techniques
Stas Bekman wrote: > > > Per your request: > > The handler: > > query | avtime completed failedrps > --- > single_print |110 5000 0881 > here_print|111 5000 0881 > list_print|111 5000 0880 > concat_print |111 5000 0873 > multi_print |119 5000 0820 > --- not very much difference once stuck in a handler. obviously multi_print is both ugly and slow, but the rest should be used by the discretion of the programmer based on the one that is easiest to maintain in the code. > > > The benchmark unbuffered: > single_print: 2 wallclock secs ( 2.44 usr + 0.31 sys = 2.75 CPU) > here_print:4 wallclock secs ( 2.34 usr + 0.54 sys = 2.88 CPU) > list_print:8 wallclock secs ( 7.06 usr + 0.43 sys = 7.49 CPU) > concat_print: 9 wallclock secs ( 8.95 usr + 0.66 sys = 9.61 CPU) > multi_print: 22 wallclock secs (16.94 usr + 5.74 sys = 22.68 CPU) > > The benchmark unbuffered: should this say "The benchmark buffered"?? > > single_print: 1 wallclock secs ( 1.70 usr + 0.02 sys = 1.72 CPU) > here_print:1 wallclock secs ( 1.78 usr + 0.01 sys = 1.79 CPU) > list_print:7 wallclock secs ( 6.44 usr + 0.05 sys = 6.49 CPU) > concat_print: 9 wallclock secs ( 8.04 usr + 0.06 sys = 8.10 CPU) > multi_print: 10 wallclock secs (10.56 usr + 0.09 sys = 10.65 CPU) > > The interesting thing is that list_print and concat_print are quite bad in > the benchmark but very good in the handler. The rest holds. -- ___cliff [EMAIL PROTECTED]
Re: [performance/benchmark] printing techniques
Frequently, it's hard to build up an entire output segment without code in-between the different additions to the output. I guess you could call this the "append, append, append... output" technique.

I think it would be an interesting addition to the benchmark:

    sub gather_print{
        my $buffer = '';
        $buffer .= "";
        $buffer .= "";
        $buffer .= " ";
        $buffer .= "";
        $buffer .= " Test page";
        $buffer .= "";
        $buffer .= " ";
        $buffer .= " ";
        $buffer .= " ";
        $buffer .= " Test page ";
        $buffer .= "";
        $buffer .= "foo";
        $buffer .= "";
        $buffer .= " ";
        $buffer .= "";
        print $fh $buffer;
    }

On Wed, 7 Jun 2000, Stas Bekman wrote:

> Following Tim's comments here is the new benchmark. (I'll address the
> buffering issue in another post)
>
Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, Jeff Norman wrote:

> Frequently, it's hard to build up an entire output segment without
> code in-between the different additions to the output. I guess you could
> call this the "append, append, append... output" technique.
>
> I think it would be an interesting addition to the benchmark:
>
>    sub gather_print{
>      my $buffer = '';
>      $buffer .= "";
>      $buffer .= "";
>      $buffer .= " ";
>      $buffer .= "";
>      $buffer .= " Test page";
>      $buffer .= "";
>      $buffer .= " ";
>      $buffer .= " ";
>      $buffer .= " ";
>      $buffer .= " Test page ";
>      $buffer .= "";
>      $buffer .= "foo";
>      $buffer .= "";
>      $buffer .= " ";
>      $buffer .= "";
>      print $fh $buffer;
>    }

Per your request:

The handler:

query         | avtime  completed  failed  rps
---
single_print  |    110       5000       0  881
here_print    |    111       5000       0  881
list_print    |    111       5000       0  880
concat_print  |    111       5000       0  873
multi_print   |    119       5000       0  820
---

The benchmark unbuffered:

single_print:  2 wallclock secs ( 2.44 usr +  0.31 sys =  2.75 CPU)
here_print:    4 wallclock secs ( 2.34 usr +  0.54 sys =  2.88 CPU)
list_print:    8 wallclock secs ( 7.06 usr +  0.43 sys =  7.49 CPU)
concat_print:  9 wallclock secs ( 8.95 usr +  0.66 sys =  9.61 CPU)
multi_print:  22 wallclock secs (16.94 usr +  5.74 sys = 22.68 CPU)

The benchmark unbuffered:

single_print:  1 wallclock secs ( 1.70 usr +  0.02 sys =  1.72 CPU)
here_print:    1 wallclock secs ( 1.78 usr +  0.01 sys =  1.79 CPU)
list_print:    7 wallclock secs ( 6.44 usr +  0.05 sys =  6.49 CPU)
concat_print:  9 wallclock secs ( 8.04 usr +  0.06 sys =  8.10 CPU)
multi_print:  10 wallclock secs (10.56 usr +  0.09 sys = 10.65 CPU)

The interesting thing is that list_print and concat_print are quite bad in the benchmark but very good in the handler. The rest holds.

> On Wed, 7 Jun 2000, Stas Bekman wrote:
>
> > Following Tim's comments here is the new benchmark. (I'll address the
> > buffering issue in another post)

Stas Bekman
Re: [performance/benchmark] printing techniques
> > These > > things add up, so don't you think that whatever can be optimized, should ? > > Wrong question, IMHO: it's what you optimize for that counts. Several things > come to mind that are often more important than performance and often mean not > optimizing for performance (these are interrelated, of course): > > Stability / reliability > Maintainability > Development time > Memory usage > Clarity of design (API, data structures, etc) I never advocated optimizing at the expense of the above criteria, we were discussing optimizations only. I certainly believe a program is a compromise, and have often chosen some of those criteria as being more important than performance savings. > There's a related rule of thumb that says don't optimize until you can test it > to see what the slow parts are. Humans are pretty bad at predicting where the > bottlenecks are. Neither did I say that optimizations should be carried out without first determining whether they're worth it or not. Run benchmarks, optimize what the benchmark shows to be slow. The point of the discussion was, is it worth it to save a few microseconds here when milliseconds are being spent there. My point was, yes it's worth it, every microsecond counts on a busy site. > I think of it this way: if your process spends 80% of it's time in 20% of your > code, then you should only be thinking of performance optimizing that 20%, and > then only if you identify a problem there. Of course, there are critical sections > that may need to operate lightening quick, but they're pretty few and far between > outside of real-time, embedded, or kernel hacking. I don't see, provided I have the time and the need (ie my server's resources are strained) why I should not, once I have optimized that 20%, turn to the other 80% and see what I can do there too. > - Barrie > -- Eric
Re: [performance/benchmark] printing techniques
> > I don't understand what you're getting at. Does this mean that something > > shouldn't be optimized because there's something else in the process that > > is taking more time? For example I have a database powered site, the slowest > > part of request processing is fetching data from the database. Should I > > disregard any optimization not dealing with the database fetches ? These > > things add up, so don't you think that whatever can be optimized, should ? > > Of course the slowest stuff should be optimized first, but that doesn't > > mean that other optimisations are useless. > > Of course you can optimize forever, but some optimizations aren't going to > make a whole lot of difference. This is one of those optimizations, > judging by these benchmarks. Let Stas re-write this benchmark test as a > handler() and see what kind of difference it makes. I'm willing to > bet: barely any between averages. > > Perhaps I was a little strong: Lets not deprecate this part of the guide, > just provide some realism in the conclusion. here we go, the benchmark holds for all but list_print!!! 
query         | avtime  completed  failed  rps
---
here_print    |    109       5000       0  894
single_print  |    110       5000       0  883
list_print    |    111       5000       0  877
multi_print   |    118       5000       0  817
---

Here is the module used in benchmarking:

    package MyPrint;
    use Apache::Constants qw(:common);
    use Apache::URI ();

    my %callbacks = (
        list_print   => \&list_print,
        multi_print  => \&multi_print,
        single_print => \&single_print,
        here_print   => \&here_print,
    );

    sub handler{
        my $r = shift;
        $r->send_http_header('text/plain');
        my $uri = Apache::URI->parse($r);
        my $query = $uri->query;
        return DECLINED unless $callbacks{$query};
        &{$callbacks{$query}};
        return OK;
    }

    sub multi_print{
        print "\n";
        print "\n";
        print " \n";
        print "\n";
        print " Test page\n";
        print "\n";
        print " \n";
        print " \n";
        print " \n";
        print " Test page \n";
        print "\n";
        print "foo\n";
        print "\n";
        print " \n";
        print "\n";
    }

    sub single_print{
        print qq{
    Test page
    Test page
    foo
    };
    }

    sub here_print{
        print <<__EOT__;
    Test page
    Test page
    foo
    __EOT__
    }

    sub list_print{
        print "\n",
              "\n",
              " \n",
              "\n",
              " Test page\n",
              "\n",
              " \n",
              " \n",
              " \n",
              " Test page \n",
              "\n",
              "foo\n",
              "\n",
              " \n",
              "\n";
    }

    1;

Stas Bekman
Re: [performance/benchmark] printing techniques
Eric Cholet wrote:

> These
> things add up, so don't you think that whatever can be optimized, should ?

Wrong question, IMHO: it's what you optimize for that counts. Several things come to mind that are often more important than performance and often mean not optimizing for performance (these are interrelated, of course):

    Stability / reliability
    Maintainability
    Development time
    Memory usage
    Clarity of design (API, data structures, etc)

There's a related rule of thumb that says don't optimize until you can test it to see what the slow parts are. Humans are pretty bad at predicting where the bottlenecks are.

I think of it this way: if your process spends 80% of its time in 20% of your code, then you should only be thinking of performance optimizing that 20%, and then only if you identify a problem there. Of course, there are critical sections that may need to operate lightning quick, but they're pretty few and far between outside of real-time, embedded, or kernel hacking.

- Barrie
Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, Eric Cholet wrote: > This said, i hurry back to s/"constant strings"/'constant strings'/g; Those two are equal. -- Fastnet Software Ltd. High Performance Web Specialists Providing mod_perl, XML, Sybase and Oracle solutions Email for training and consultancy availability. http://sergeant.org http://xml.sergeant.org
Re: [performance/benchmark] printing techniques
From: "Eric Strovink" <[EMAIL PROTECTED]>

> > Of course the slowest stuff should be optimized first...
>
> Right. Which means the Guide, if it is not already so doing, ought to
> rank-order the optimizations in their order of importance, or better, their
> relative importance. This one, it appears, should be near the bottom of the
> list.

From: "Matt Sergeant" <[EMAIL PROTECTED]>

> Of course you can optimize forever, but some optimizations aren't going to
> make a whole lot of difference. This is one of those optimizations,
> judging by these benchmarks. Let Stas re-write this benchmark test as a
> handler() and see what kind of difference it makes. I'm willing to
> bet: barely any between averages.
>
> Perhaps I was a little strong: Lets not deprecate this part of the guide,
> just provide some realism in the conclusion.

Agreed, all optimizations should be put in perspective, and the guide (and book :-) should put forward those that count most.

This said, I hurry back to s/"constant strings"/'constant strings'/g;

-- Eric
Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, Eric Cholet wrote: > > > So if you want a better performance, you know what technique to use. > > > > I think this last line is misleading. The reality is that you're doing > > 500,000 iterations here. Even for the worst case scenario of multi_print > > with no buffering you're managing nearly 22,000 outputs a second. Now > > granted, the output isn't exactly of normal size, but I think what it > > comes down to is that the way you choose to print is going to make almost > > zero difference in any real world mod_perl application. The overhead of > > URL parsing, resource location, and actually running your handler is going > > to take far more overhead by the looks of things. > > I don't understand what you're getting at. Does this mean that something > shouldn't be optimized because there's something else in the process that > is taking more time? For example I have a database powered site, the slowest > part of request processing is fetching data from the database. Should I > disregard any optimization not dealing with the database fetches ? These > things add up, so don't you think that whatever can be optimized, should ? > Of course the slowest stuff should be optimized first, but that doesn't > mean that other optimisations are useless. Of course you can optimize forever, but some optimizations aren't going to make a whole lot of difference. This is one of those optimizations, judging by these benchmarks. Let Stas re-write this benchmark test as a handler() and see what kind of difference it makes. I'm willing to bet: barely any between averages. Perhaps I was a little strong: let's not deprecate this part of the guide, just provide some realism in the conclusion. -- Fastnet Software Ltd. High Performance Web Specialists Providing mod_perl, XML, Sybase and Oracle solutions Email for training and consultancy availability. http://sergeant.org http://xml.sergeant.org
Re: [performance/benchmark] printing techniques
Eric Cholet wrote: > Of course the slowest stuff should be optimized first... Right. Which means the Guide, if it is not already so doing, ought to rank-order the optimizations in their order of importance, or better, their relative importance. This one, it appears, should be near the bottom of the list.
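As a rough illustration of ranking by relative importance, the sketch below orders the four techniques by the CPU times posted for the buffered run in this thread and prints each one's cost relative to the fastest. The figures are copied from the thread; the variable names are made up for the example, and the ranking only covers this one benchmark, not the other optimizations in the Guide.

```perl
use strict;
use warnings;

# CPU seconds for 500,000 iterations, copied from the buffered run in this thread.
my %cpu = (
    single_print => 1.79,
    here_print   => 1.86,
    list_print   => 6.58,
    multi_print  => 10.75,
);

# Fastest first, then each technique's cost relative to the fastest.
my @ranked = sort { $cpu{$a} <=> $cpu{$b} } keys %cpu;
my $best   = $cpu{ $ranked[0] };
my %factor = map { $_ => $cpu{$_} / $best } @ranked;

printf "%-13s %6.2f s  %5.2fx\n", $_, $cpu{$_}, $factor{$_} for @ranked;
```

On these numbers multi_print costs about six times as much as single_print, but a ratio says nothing about absolute cost per request, which is the point being argued in the surrounding messages.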
Re: [performance/benchmark] printing techniques
> > So if you want a better performance, you know what technique to use. > > I think this last line is misleading. The reality is that you're doing > 500,000 iterations here. Even for the worst case scenario of multi_print > with no buffering you're managing nearly 22,000 outputs a second. Now > granted, the output isn't exactly of normal size, but I think what it > comes down to is that the way you choose to print is going to make almost > zero difference in any real world mod_perl application. The overhead of > URL parsing, resource location, and actually running your handler is going > to take far more overhead by the looks of things. I don't understand what you're getting at. Does this mean that something shouldn't be optimized because there's something else in the process that is taking more time? For example I have a database powered site, the slowest part of request processing is fetching data from the database. Should I disregard any optimization not dealing with the database fetches ? These things add up, so don't you think that whatever can be optimized, should ? Of course the slowest stuff should be optimized first, but that doesn't mean that other optimisations are useless. -- Eric
Re: [performance/benchmark] printing techniques
[benchmark code snipped] > > single_print: 4 wallclock secs ( 2.28 usr + 0.47 sys = 2.75 CPU) > > here_print: 2 wallclock secs ( 2.45 usr + 0.45 sys = 2.90 CPU) > > list_print: 7 wallclock secs ( 7.17 usr + 0.45 sys = 7.62 CPU) > > multi_print: 23 wallclock secs (17.52 usr + 5.72 sys = 23.24 CPU) > > > > The results are worse by the factor of 1.5 to 2, with only > > I<'list_print'> changed by very little. > > > > So if you want a better performance, you know what technique to use. > > I think this last line is misleading. The reality is that you're doing > 500,000 iterations here. Even for the worst case scenario of multi_print > with no buffering you're managing nearly 22,000 outputs a second. Now > granted, the output isn't exactly of normal size, but I think what it > comes down to is that the way you choose to print is going to make almost > zero difference in any real world mod_perl application. The overhead of > URL parsing, resource location, and actually running your handler is going > to take far more overhead by the looks of things. > > Perhaps this section should be (re)moved into a posterity section, for it > seems fairly un-informative to me. Matt, have you seen all those scripts with hundreds of print statements? This section is meant to open the eyes of programmers who tend to use that style. Obviously, if you write normal code the choice doesn't really matter, unless you do a lot of printing. But remember that, following your suggestion, every one of the performance sections of the guide could be deleted. Each section tackles a separate feature or technique; it is only the overall approach that matters. My goal is to show programmers how to squeeze more out of their code, and I'm definitely not talking to people who run guestbook code. Take, for example, Ask and Nick from ValueClick. Let's ask them whether these techniques matter or not. With 70-80M requests served daily, each saved millisecond counts. Ask? Nick? What do you think?
Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide http://perl.apache.org/guide mailto:[EMAIL PROTECTED] http://perl.org http://stason.org/TULARC http://singlesheaven.com http://perlmonth.com http://sourcegarden.org
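Both sides of this exchange rest on arithmetic that is easy to check. The sketch below works it through from the CPU times posted in the thread: the worst-case throughput Matt refers to, and the CPU time that moving from multi_print to single_print would save at the 70-80M requests per day Stas mentions. It assumes, purely for illustration, that every request prints one page like the benchmark's on comparable hardware; the helper name `saving_per_page` is made up.

```perl
use strict;
use warnings;

my $iterations = 500_000;

# usr + sys CPU seconds from the two runs posted in this thread.
my %buffered   = (single_print => 1.79, multi_print => 10.75);
my %unbuffered = (single_print => 2.75, multi_print => 23.24);

# Worst case discussed above: unbuffered multi_print, in pages per CPU second.
my $worst_rate = $iterations / $unbuffered{multi_print};

# CPU seconds saved per page by switching from multi_print to single_print.
sub saving_per_page {
    my ($run) = @_;
    return ($run->{multi_print} - $run->{single_print}) / $iterations;
}

printf "worst case: %.0f pages per CPU second\n", $worst_rate;
for my $daily (70e6, 80e6) {
    printf "%dM requests/day: %.0f s (buffered) or %.0f s (unbuffered) of CPU saved\n",
        $daily / 1e6,
        $daily * saving_per_page(\%buffered),
        $daily * saving_per_page(\%unbuffered);
}
```

The saving works out to roughly 18 microseconds per page when buffered and 41 when unbuffered: small next to a database fetch, yet somewhere between about 20 and 55 CPU minutes a day at the request volume quoted. Both readings are consistent with the numbers.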
Re: [performance/benchmark] printing techniques
On Wed, 7 Jun 2000, Stas Bekman wrote: > Following Tim's comments here is the new benchmark. (I'll address the > buffering issue in another post) > > use Benchmark; > use Symbol; > > my $fh = gensym; > open $fh, ">/dev/null" or die; > > sub multi_print{ > print $fh ""; > print $fh ""; > print $fh " "; > print $fh ""; > print $fh " Test page"; > print $fh ""; > print $fh " "; > print $fh " "; > print $fh " "; > print $fh " Test page "; > print $fh ""; > print $fh "foo"; > print $fh ""; > print $fh " "; > print $fh ""; > } > > sub single_print{ > print $fh qq{ > > > > Test page > > > > > Test page > > foo > > > > }; > } > > sub here_print{ > print $fh <<__EOT__; > > > > > Test page > > > > > Test page > > foo > > > > __EOT__ > } > > sub list_print{ > print $fh "", > "", > " ", > "", > " Test page", > "", > " ", > " ", > " ", > " Test page ", > "", > "foo", > "", > " ", > ""; > } > > timethese > (500_000, { > list_print => \&list_print, > multi_print => \&multi_print, > single_print => \&single_print, > here_print => \&here_print, > }); > > And the results are: > > single_print: 1 wallclock secs ( 1.74 usr + 0.05 sys = 1.79 CPU) > here_print:3 wallclock secs ( 1.79 usr + 0.07 sys = 1.86 CPU) > list_print:7 wallclock secs ( 6.57 usr + 0.01 sys = 6.58 CPU) > multi_print: 10 wallclock secs (10.72 usr + 0.03 sys = 10.75 CPU) > > Numbers tell it all, I<'single_print'> is the fastest, 'here_print' is > almost of the same speed, I<'list_print'> is quite slow and > I<'multi_print'> is the slowest. 
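The markup inside the quoted benchmark did not survive the archive, so the following is a runnable sketch of the same four techniques with placeholder HTML, not a reconstruction of the original page. The tag content, the `render` helper, the `%technique` table and the iteration count are all assumptions made for the example. Each sub takes the filehandle as an argument so that the variants can also be checked for identical output.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use File::Spec;

# Placeholder markup throughout: the original page's tags were lost.

sub multi_print {
    my $fh = shift;
    print $fh "<html>\n";
    print $fh "<head><title>Test page</title></head>\n";
    print $fh "<body>\n";
    print $fh "<h1>Test page</h1>\n";
    print $fh "foo\n";
    print $fh "</body>\n";
    print $fh "</html>\n";
}

sub single_print {
    my $fh = shift;
    print $fh qq{<html>
<head><title>Test page</title></head>
<body>
<h1>Test page</h1>
foo
</body>
</html>
};
}

sub here_print {
    my $fh = shift;
    print $fh <<__EOT__;
<html>
<head><title>Test page</title></head>
<body>
<h1>Test page</h1>
foo
</body>
</html>
__EOT__
}

sub list_print {
    my $fh = shift;
    print $fh "<html>\n",
              "<head><title>Test page</title></head>\n",
              "<body>\n",
              "<h1>Test page</h1>\n",
              "foo\n",
              "</body>\n",
              "</html>\n";
}

my %technique = (
    multi_print  => \&multi_print,
    single_print => \&single_print,
    here_print   => \&here_print,
    list_print   => \&list_print,
);

# Hypothetical helper: render one page into a string via an in-memory handle.
sub render {
    my ($code) = @_;
    my $buf = '';
    open my $fh, '>', \$buf or die "in-memory open failed: $!";
    $code->($fh);
    close $fh;
    return $buf;
}

# Time each technique against the null device, as in the quoted benchmark.
open my $null, '>', File::Spec->devnull or die "cannot open null device: $!";
timethese(100_000, {
    map { my $code = $technique{$_}; ($_ => sub { $code->($null) }) } keys %technique
});
```

The wrapper closure adds the same small call overhead to every variant, so absolute times will differ from the posted ones; only the relative order is comparable, and that too depends on the perl version and the page size.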
> > If we run the same benchmark using the unbuffered prints by changing > the beginning of the code to: > > use Symbol; > my $fh = gensym; > open $fh, ">/dev/null" or die; > > # make all the calls unbuffered > my $oldfh = select($fh); > $| = 1; > select($oldfh); > > And the results are: > > single_print: 4 wallclock secs ( 2.28 usr + 0.47 sys = 2.75 CPU) > here_print:2 wallclock secs ( 2.45 usr + 0.45 sys = 2.90 CPU) > list_print:7 wallclock secs ( 7.17 usr + 0.45 sys = 7.62 CPU) > multi_print: 23 wallclock secs (17.52 usr + 5.72 sys = 23.24 CPU) > > The results are worse by the factor of 1.5 to 2, with only > I<'list_print'> changed by very little. > > So if you want a better performance, you know what technique to use. I think this last line is misleading. The reality is that you're doing 500,000 iterations here. Even for the worst case scenario of multi_print with no buffering you're managing nearly 22,000 outputs a second. Now granted, the output isn't exactly of normal size, but I think what it comes down to is that the way you choose to print is going to make almost zero difference in any real world mod_perl application. The overhead of URL parsing, resource location, and actually running your handler is going to take far more overhead by the looks of things. Perhaps this section should be (re)moved into a posterity section, for it seems fairly un-informative to me. -- Fastnet Software Ltd. High Performance Web Specialists Providing mod_perl, XML, Sybase and Oracle solutions Email for training and consultancy availability. http://sergeant.org http://xml.sergeant.org
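The jump in system time for the unbuffered run follows from what `$| = 1` does: each print is pushed to the operating system at once instead of collecting in perl's buffer. A small sketch of that effect, assuming a temporary file in place of /dev/null so the result can be observed, and using `IO::Handle`'s `autoflush` method as the per-handle equivalent of the select dance in the quoted code:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;

# A temporary file stands in for the benchmark's output handle.
my ($fh, $path) = tempfile(UNLINK => 1);

# Buffered (the default): the data waits in perl's buffer.
print $fh "hello";
my $size_buffered = -s $path || 0;

# Per-handle form of: my $oldfh = select($fh); $| = 1; select($oldfh);
# Turning it on also flushes whatever was already pending.
$fh->autoflush(1);
print $fh " world";
my $size_unbuffered = -s $path || 0;

print "on disk while buffered: $size_buffered bytes\n";
print "on disk once unbuffered: $size_unbuffered bytes\n";
close $fh;
```

On a typical system the first size should be 0 and the second 11. With fifteen separate prints per page, the unbuffered multi_print pays for fifteen writes where single_print pays for one, which fits the 5.72 seconds of system time reported for it.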