Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-08 Thread tomcat/perl

On 08.02.2021 10:09, Steven Haigh wrote:

On Sun, Feb 7, 2021 at 15:17, Chris  wrote:

Just remember to always write clean code that resets variables after doing 
tasks.


I'm a bit curious about this - whilst I'm still testing all this on a staging environment, 
how can I tell if things can leak between runs?


Is coding to normal 'use strict; use warnings;' standards good enough?


Hi. Read this, carefully :
http://perl.apache.org/docs/2.0/user/troubleshooting/troubleshooting.html#Variable__x_will_not_stay_shared_at

In particular the example below "An Easy Break-in".
(and replace
use vars qw($authenticated);
by
our $authenticated;
)

("use vars" is discouraged in favour of "our", see https://perldoc.perl.org/vars)

The point is :
- imagine an Apache Prefork starting 5 children
- each child, when it starts, contains its own fresh copy of the perl 
interpreter
- requests which come in, are directed by the main Apache, to any child that is free at 
the time.
- when a child runs your script/module for the first time (with its particular perl 
interpreter), the script/module gets compiled, and then run, and the compiled code is 
cached by the corresponding perl interpreter
- global variables (such as $authenticated here) get defined during compilation, so are 
part of the cached script/module code.
- so the first time an Apache child runs your script, $authenticated is defined, but 
"empty" (undef). Then when the script runs, it assigns a value to it, so it is no longer 
undef.
- the next time your script is run *by the same Apache child*, the cached compiled version 
of the script is used (*), which already has $authenticated defined, and the previous 
value (set by the previous run) is still in it.
However, if it happens that the next request is run by another "fresh" Apache child, that 
one (its own instance of the perl interpreter) does not yet have a pre-compiled version of 
your script, so it gets compiled again, and in that instance $authenticated gets defined 
again, empty.


Since you cannot control which Apache child runs your script the next time you issue a 
request, the result may appear random (as far as $authenticated is concerned).
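A minimal sketch of that effect in plain Perl, outside Apache. The sub handle_request() and the password check are hypothetical stand-ins for one run of a cached script; calling the sub twice in one process plays the part of two requests served by the same Apache child.

```perl
use strict;
use warnings;

# Package global: it lives as long as the interpreter (here, the "child").
our $authenticated;

# Hypothetical stand-in for one run of a cached script; the password
# check is illustrative only.
sub handle_request {
    my ($password) = @_;
    $authenticated = 1 if defined $password && $password eq 'secret';
    return $authenticated ? 'granted' : 'denied';
}

# Two requests handled by the same interpreter: only the first has the password.
my @same_child = map { handle_request($_) } ('secret', undef);
print "@same_child\n";    # prints "granted granted"
```

A freshly started child would answer "denied" to the second request, which is why the behaviour looks random from the outside.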


This "feature" can sometimes be very useful as an optimisation (for example, if you want 
to initialise a complex read-only structure only once per Apache child life), but in the 
general case, it will lead to strange things happening if you are not careful.
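As a rough illustration of the useful side, here is a sketch that builds a read-only lookup table once per interpreter and reuses it afterwards. The names %COUNTRY_NAME, build_table() and handle_request() are made up for the example, and the counter is only there to show how often the expensive step runs.

```perl
use strict;
use warnings;

my $builds = 0;       # counts how often the expensive step runs
our %COUNTRY_NAME;    # hypothetical read-only lookup table

# Stand-in for an expensive initialisation (parsing a file, a DB query, ...).
sub build_table {
    $builds++;
    return (au => 'Australia', de => 'Germany', uk => 'United Kingdom');
}

sub handle_request {
    my ($code) = @_;
    # Only the first request in this interpreter pays for the build.
    %COUNTRY_NAME = build_table() unless %COUNTRY_NAME;
    return $COUNTRY_NAME{$code} // 'unknown';
}

my @names = map { handle_request($_) } qw(au de xx);
print "@names (table built $builds time(s))\n";
# prints "Australia Germany unknown (table built 1 time(s))"
```

This is only safe because nothing modifies the table after it is built; anything that changes per request still has to be initialised per request.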


So to answer your question : "> how can I tell if things can leak between 
runs?",
a quick answer would be : just *assume* that everything "leaks" between runs, and make 
sure that you initialise every variable in a known way, before using it.
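One way to apply that rule, sketched in plain Perl with a hypothetical handle_request() standing in for a script run: put the variable into a known state at the top of every run, before any conditional assignment.

```perl
use strict;
use warnings;

our $authenticated;

# Hypothetical stand-in for one run of a cached script.
sub handle_request {
    my ($password) = @_;
    $authenticated = 0;    # known state first, whatever the last run left behind
    $authenticated = 1 if defined $password && $password eq 'secret';
    return $authenticated ? 'granted' : 'denied';
}

# Three consecutive requests in the same interpreter.
my @results = map { handle_request($_) } ('secret', undef, 'secret');
print "@results\n";    # prints "granted denied granted"
```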


mod_perl is great fun, and the ability to run perl scripts much faster is only the tip of 
the iceberg.  But like every fun thing, it has some minor quirks like that. Do not let 
them discourage you.


(*) that's the point, and that's why it is much faster



Are there other ways to confirm correct operations?

--
Steven Haigh  net...@crc.id.au  https://www.crc.id.au 





RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-08 Thread James Smith
That is a good sign – I would run with brutal at least once and see what it 
throws up

We tend to ignore a couple of the warnings – one is postfix if/unless and the 
other is multiline strings {we embed a lot of simple HTML templates in code and 
it means I can make the HTML readable when rendered rather than being one long 
string, plus SQL queries are more readable and heredocs are messy if you want 
to do concatenation or lots of printf/sprintf calls}

We have it as part of our svn commit pre-hooks so that people can’t push  
code to our repos and break things {one of the reasons we haven’t moved to git! 
as hooks are a bit messier… and people may not have the right software on the 
machines they have their repos on}



From: Steven Haigh 
Sent: 08 February 2021 09:54
To: modperl@perl.apache.org
Subject: RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

On Mon, Feb 8, 2021 at 09:13, James Smith wrote:

Use perlcritic (Perl::Critic); it will find most of the nasties that you have. The classic is:

Thanks for the tip! I have no idea how long I've been writing stuff in perl - 
and I never knew of this!

I ran it with the -3 option - which I figure is a good middle ground...

The good news, I just ran it over a lot of my code and it seems the only real 
things it picks up are not having a /x on the end of regex matches, using hard 
tabs, and multiline strings. I'd say that's a good sign.

It did pick up a couple of open statements that I didn't have a close for 
(*slaps wrist*), but I haven't seen much in the way of what looks to be major 
issues.

I was trying to find the PBP references - and was amazed that the Perl Best 
Practices *ebook* is $56.20 AUD hahahah

Amazon has a few copies listed second hand, with 3 weeks shipping... The joys 
of being on an island a long way from anything ;)

--
Steven Haigh  net...@crc.id.au  https://www.crc.id.au



-- 
 The Wellcome Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE.

RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-08 Thread Steven Haigh

On Mon, Feb 8, 2021 at 09:13, James Smith  wrote:
Use perlcritic (Perl::Critic); it will find most of the nasties that you have. 
The classic is:


Thanks for the tip! I have no idea how long I've been writing stuff in 
perl - and I never knew of this!


I ran it with the -3 option - which I figure is a good middle ground...

The good news, I just ran it over a lot of my code and it seems the 
only real things it picks up are not having a /x on the end of regex 
matches, using hard tabs, and multiline strings. I'd say that's a good 
sign.


It did pick up a couple of open statements that I didn't have a close 
for (*slaps wrist*), but I haven't seen much in the way of what looks 
to be major issues.


I was trying to find the PBP references - and was amazed that the Perl 
Best Practices *ebook* is $56.20 AUD hahahah


Amazon has a few copies listed second hand, with 3 weeks shipping... 
The joys of being on an island a long way from anything ;)


--
Steven Haigh

 net...@crc.id.au 
 https://www.crc.id.au 



RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-08 Thread James Smith
Use perlcritic (Perl::Critic); it will find most of the nasties that you have. The classic is:

my $var = {code} if {condition};

The my gets round use strict, but $var doesn’t get updated if {condition} 
isn’t met, so it can hold the value from the last time round. (perlsyn documents 
the behaviour of my combined with a statement modifier as undefined.)

Better is

my $var = '';
$var = {code} if {condition};

or

my $var = {condition} ? {code} : '';

From: Steven Haigh 
Sent: 08 February 2021 09:09
To: modperl@perl.apache.org
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

On Sun, Feb 7, 2021 at 15:17, Chris wrote:

Just remember to always write clean code that resets variables after doing 
tasks.

I'm a bit curious about this - whilst I'm still testing all this on a staging 
environment, how can I tell if things can leak between runs?

Is coding to normal 'use strict; use warnings;' standards good enough?

Are there other ways to confirm correct operations?

--
Steven Haigh  net...@crc.id.au  https://www.crc.id.au




Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-08 Thread Steven Haigh
On Sun, Feb 7, 2021 at 15:17, Chris wrote:

Just remember to always write clean code that resets variables after
doing tasks.


I'm a bit curious about this - whilst I'm still testing all this on a 
staging environment, how can I tell if things can leak between runs?


Is coding to normal 'use strict; use warnings;' standards good enough?

Are there other ways to confirm correct operations?

--
Steven Haigh

 net...@crc.id.au 
 https://www.crc.id.au 



RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-07 Thread James Smith
Agree on this - you need to always think of a mod_perl app as running in a 
loop where each request is an iteration, so if you don't initialise something at 
the start of the script, it can still have the value it had at the end of its last 
call. Use Perl::Critic; it is a good one to find your gotchas...

Try 

my $admin = 1 if $user_is_logged_in and $user eq 'admin';

It may seem innocent in a script, and it compiles even under use strict... but now 
put this in a loop:

1st request: user is logged in and is admin - this gets set to true.
2nd request: user isn't logged in - but this doesn't get reset to "false".
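A small sketch of the "requests as loop iterations" picture, using the safe form where the declaration and the assignment are both unconditional. The request data is made up for the example, and the loop only stands in for successive requests reaching the same interpreter.

```perl
use strict;
use warnings;

# Each loop iteration stands in for one request; the data is made up.
my @requests = (
    { user => 'admin', logged_in => 1 },
    { user => 'admin', logged_in => 0 },
    { user => 'guest', logged_in => 1 },
);

my @admin_flags;
for my $req (@requests) {
    # Declare and assign unconditionally, so every iteration starts clean.
    my $admin = ($req->{logged_in} && $req->{user} eq 'admin') ? 1 : 0;
    push @admin_flags, $admin;
}
print "@admin_flags\n";    # prints "1 0 0"
```

The ternary always assigns, so the second request cannot inherit the admin flag from the first.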

-----Original Message-----
From: Chris  
Sent: 07 February 2021 21:18
To: modperl@perl.apache.org
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

On Sun, Feb 07, 2021 at 09:21:41PM +0800, Wesley Peng wrote:
> If you can take time to rewrite all codes with modPerl handlers, that will 
> improve performance a lot.

I've never used Template, but I just eventually wrote all handlers.
I moved from Registry to all handlers, bit by bit.
You can mix Registry and handlers together without any problems.
Just remember to always write clean code that resets variables after doing 
tasks. The same code runs multiple times. Variables might retain old values. 
It's a good habit to keep for all perl code, even outside of mod_perl.
I considered changing all of my mod_perl code to something newer, but I decided 
to just keep it. No regrets.

Chris

> 
> On Sun, Feb 7, 2021, at 9:14 PM, Steven Haigh wrote:
> > In fact, I just realised that 'ab' test is rather restrictive... So here's 
> > a bit more of an extended test:
> > 
> > # ab -k -n 1000 -c 32
> > 
> > Apache + ExecCGI:
> > Requests per second:14.26 [#/sec] (mean)
> > Time per request:   2244.181 [ms] (mean)
> > Time per request:   70.131 [ms] (mean, across all concurrent requests)
> > 
> > Apache + mod_perl (ModPerl::PerlRegistry): 
> > Requests per second: 132.14 [#/sec] (mean)
> > Time per request:   242.175 [ms] (mean)
> > Time per request:   7.568 [ms] (mean, across all concurrent requests)
> > 
> > Interestingly, without Keepalives, the story is much the same:
> > 
> > # ab -n 1000 -c 32
> > 
> > Apache + ExecCGI:
> > Requests per second:14.15 [#/sec] (mean)
> > Time per request:   2260.875 [ms] (mean)
> > Time per request:   70.652 [ms] (mean, across all concurrent requests)
> > 
> > Apache + mod_perl (ModPerl::PerlRegistry): 
> > Requests per second:154.48 [#/sec] (mean)
> > Time per request:   207.140 [ms] (mean)
> > Time per request:   6.473 [ms] (mean, across all concurrent requests)
> > 
> > Running some benchmarks across various parts of my site made me realise I 
> > also had some RewriteRules in the apache config that still had H=cgi-script 
> > - changed those to H=perl-script and saw similar improvements:
> > 
> > ExecCGI - Requests per second:11.84 [#/sec] (mean)
> > mod_perl - Requests per second:130.97 [#/sec] (mean)
> > 
> > That's quite some gains for a days work.
> > 
> > --
> > Steven Haigh  net...@crc.id.au  https://www.crc.id.au
> > 
> > On Sun, Feb 7, 2021 at 23:58, Steven Haigh  wrote:
> >> Interestingly, I did get things working with ModPerl::PerlRegistry.
> >> 
> >> What I couldn't find *anywhere* is that the data I was loading in Template 
> >> Toolkit was included in the file in the __DATA__ area - which causes 
> >> mod_perl to fall over!
> >> 
> >> The only way I managed to find this was the following error in the 
> >> *system* /var/log/httpd/error_log (didn't show up in the vhost error_log!):
> >> readline() on unopened filehandle DATA at 
> >> /usr/lib64/perl5/vendor_perl/Template/Provider.pm line 638.
> >> 
> > >> Took me a LONG time to find a vague post that reading in lines from <DATA> 
> >> kills mod_perl. Not sure why - but I stripped all the templates out and 
> >> put them in a file instead and re-wrote that bit of code, and things 
> >> started working.
> >> 
> >> I had to fix a few lib path issues, but after getting my head around that, 
> >> most things seem to work as before - however I don't notice much of an 
> >> improvement in execution times, I do see this improvement using 'ab -n 100 
> >> -c32':
> >> 
> >> Apache + ExecC

Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-07 Thread Chris
On Sun, Feb 07, 2021 at 09:21:41PM +0800, Wesley Peng wrote:
> If you can take time to rewrite all codes with modPerl handlers, that will 
> improve performance a lot.

I've never used Template, but I just eventually wrote all handlers.
I moved from Registry to all handlers, bit by bit.
You can mix Registry and handlers together without any problems.
Just remember to always write clean code that resets variables after
doing tasks. The same code runs multiple times. Variables might retain
old values. It's a good habit to keep for all perl code, even outside of
mod_perl.
I considered changing all of my mod_perl code to something newer, but I
decided to just keep it. No regrets.

Chris

> 
> On Sun, Feb 7, 2021, at 9:14 PM, Steven Haigh wrote:
> > In fact, I just realised that 'ab' test is rather restrictive... So here's 
> > a bit more of an extended test:
> > 
> > # ab -k -n 1000 -c 32
> > 
> > Apache + ExecCGI:
> > Requests per second:14.26 [#/sec] (mean)
> > Time per request:   2244.181 [ms] (mean)
> > Time per request:   70.131 [ms] (mean, across all concurrent requests)
> > 
> > Apache + mod_perl (ModPerl::PerlRegistry): 
> > Requests per second: 132.14 [#/sec] (mean)
> > Time per request:   242.175 [ms] (mean)
> > Time per request:   7.568 [ms] (mean, across all concurrent requests)
> > 
> > Interestingly, without Keepalives, the story is much the same:
> > 
> > # ab -n 1000 -c 32
> > 
> > Apache + ExecCGI:
> > Requests per second:14.15 [#/sec] (mean)
> > Time per request:   2260.875 [ms] (mean)
> > Time per request:   70.652 [ms] (mean, across all concurrent requests)
> > 
> > Apache + mod_perl (ModPerl::PerlRegistry): 
> > Requests per second:154.48 [#/sec] (mean)
> > Time per request:   207.140 [ms] (mean)
> > Time per request:   6.473 [ms] (mean, across all concurrent requests)
> > 
> > Running some benchmarks across various parts of my site made me realise I 
> > also had some RewriteRules in the apache config that still had H=cgi-script 
> > - changed those to H=perl-script and saw similar improvements:
> > 
> > ExecCGI - Requests per second:11.84 [#/sec] (mean)
> > mod_perl - Requests per second:130.97 [#/sec] (mean)
> > 
> > That's quite some gains for a days work.
> > 
> > --
> > Steven Haigh  net...@crc.id.au  https://www.crc.id.au
> > 
> > On Sun, Feb 7, 2021 at 23:58, Steven Haigh  wrote:
> >> Interestingly, I did get things working with ModPerl::PerlRegistry.
> >> 
> >> What I couldn't find *anywhere* is that the data I was loading in Template 
> >> Toolkit was included in the file in the __DATA__ area - which causes 
> >> mod_perl to fall over!
> >> 
> >> The only way I managed to find this was the following error in the 
> >> *system* /var/log/httpd/error_log (didn't show up in the vhost error_log!):
> >> readline() on unopened filehandle DATA at 
> >> /usr/lib64/perl5/vendor_perl/Template/Provider.pm line 638.
> >> 
> > >> Took me a LONG time to find a vague post that reading in lines from <DATA> 
> >> kills mod_perl. Not sure why - but I stripped all the templates out and 
> >> put them in a file instead and re-wrote that bit of code, and things 
> >> started working.
> >> 
> >> I had to fix a few lib path issues, but after getting my head around that, 
> >> most things seem to work as before - however I don't notice much of an 
> >> improvement in execution times, I do see this improvement using 'ab -n 100 
> >> -c32':
> >> 
> >> Apache + ExecCGI: Requests per second:13.50 [#/sec] (mean)
> >> Apache + mod_perl: Requests per second:59.81 [#/sec] (mean)
> >> 
> >> This is obviously a good thing.
> >> 
> >> I haven't gotten into the preload or DBI sharing yet - as that'll end up 
> >> needing a bit of a rewrite of code to take advantage of. I'd be open to 
> >> suggestions here from those who have done it in the past to save me going 
> >> down some dead ends :D
> >> 
> >> --
> >> Steven Haigh  net...@crc.id.au  https://www.crc.id.au
> >> 
> >> On Sun, Feb 7, 2021 at 12:49, James Smith  wrote:
> > >>> As Wesley said – try Registry, that was the standard way of using 
> > >>> mod_perl to cache perl in the server – but your problem might be due to 
> > >>> the note in PerlRun…
> > >>> 
> > >>> https://perl.apache.org/docs/2.0/api/ModPerl/PerlRun.html#Description
> > >>> META: document that for now we don't chdir() into the script's dir, 
> > >>> because it affects the whole process under threads. 
> > >>> `ModPerl::PerlRunPrefork` should be used by those who run only under 
> > >>> prefork MPM.
> > >>> {tbh most people don’t use mod_perl under threads anyway as there isn’t 
> > >>> really a gain from using them}
> > >>> 
> > >>> It suggests you use ModPerl::PerlRunPrefork – as this does an additional 
> > >>> step to cd to the script directory – which might be your issue….
> 
> >>>  
> 
> >>> *From:* Steven Haigh  
> >>> *Sent:* 07 February 2021 01:00
> >>> *To:* modperl@perl.apache.org
> 

Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-07 Thread Wesley Peng
If you can take time to rewrite all codes with modPerl handlers, that will 
improve performance a lot.

On Sun, Feb 7, 2021, at 9:14 PM, Steven Haigh wrote:
> In fact, I just realised that 'ab' test is rather restrictive... So here's a 
> bit more of an extended test:
> 
> # ab -k -n 1000 -c 32
> 
> Apache + ExecCGI:
> Requests per second:14.26 [#/sec] (mean)
> Time per request:   2244.181 [ms] (mean)
> Time per request:   70.131 [ms] (mean, across all concurrent requests)
> 
> Apache + mod_perl (ModPerl::PerlRegistry): 
> Requests per second: 132.14 [#/sec] (mean)
> Time per request:   242.175 [ms] (mean)
> Time per request:   7.568 [ms] (mean, across all concurrent requests)
> 
> Interestingly, without Keepalives, the story is much the same:
> 
> # ab -n 1000 -c 32
> 
> Apache + ExecCGI:
> Requests per second:14.15 [#/sec] (mean)
> Time per request:   2260.875 [ms] (mean)
> Time per request:   70.652 [ms] (mean, across all concurrent requests)
> 
> Apache + mod_perl (ModPerl::PerlRegistry): 
> Requests per second:154.48 [#/sec] (mean)
> Time per request:   207.140 [ms] (mean)
> Time per request:   6.473 [ms] (mean, across all concurrent requests)
> 
> Running some benchmarks across various parts of my site made me realise I 
> also had some RewriteRules in the apache config that still had H=cgi-script - 
> changed those to H=perl-script and saw similar improvements:
> 
> ExecCGI - Requests per second:11.84 [#/sec] (mean)
> mod_perl - Requests per second:130.97 [#/sec] (mean)
> 
> That's quite some gains for a days work.
> 
> --
> Steven Haigh  net...@crc.id.au  https://www.crc.id.au
> 
> On Sun, Feb 7, 2021 at 23:58, Steven Haigh  wrote:
>> Interestingly, I did get things working with ModPerl::PerlRegistry.
>> 
>> What I couldn't find *anywhere* is that the data I was loading in Template 
>> Toolkit was included in the file in the __DATA__ area - which causes 
>> mod_perl to fall over!
>> 
>> The only way I managed to find this was the following error in the *system* 
>> /var/log/httpd/error_log (didn't show up in the vhost error_log!):
>> readline() on unopened filehandle DATA at 
>> /usr/lib64/perl5/vendor_perl/Template/Provider.pm line 638.
>> 
>> Took me a LONG time to find a vague post that reading in lines from <DATA> 
>> kills mod_perl. Not sure why - but I stripped all the templates out and put 
>> them in a file instead and re-wrote that bit of code, and things started 
>> working.
>> 
>> I had to fix a few lib path issues, but after getting my head around that, 
>> most things seem to work as before - however I don't notice much of an 
>> improvement in execution times, I do see this improvement using 'ab -n 100 
>> -c32':
>> 
>> Apache + ExecCGI: Requests per second:13.50 [#/sec] (mean)
>> Apache + mod_perl: Requests per second:59.81 [#/sec] (mean)
>> 
>> This is obviously a good thing.
>> 
>> I haven't gotten into the preload or DBI sharing yet - as that'll end up 
>> needing a bit of a rewrite of code to take advantage of. I'd be open to 
>> suggestions here from those who have done it in the past to save me going 
>> down some dead ends :D
>> 
>> --
>> Steven Haigh  net...@crc.id.au  https://www.crc.id.au
>> 
>> On Sun, Feb 7, 2021 at 12:49, James Smith  wrote:
>>> As Wesley said – try Registry, that was the standard way of using mod_perl 
>>> to cache perl in the server – but your problem might be due to the note in 
>>> PerlRun…
>>> 
>>> https://perl.apache.org/docs/2.0/api/ModPerl/PerlRun.html#Description
>>> META: document that for now we don't chdir() into the script's dir, because 
>>> it affects the whole process under threads. `ModPerl::PerlRunPrefork` should 
>>> be used by those who run only under prefork MPM.
>>> {tbh most people don’t use mod_perl under threads anyway as there isn’t 
>>> really a gain from using them}
>>> 
>>> It suggests you use ModPerl::PerlRunPrefork – as this does an additional 
>>> step to cd to the script directory – which might be your issue….

>>> 
>>> *From:* Steven Haigh
>>> *Sent:* 07 February 2021 01:00
>>> *To:* modperl@perl.apache.org
>>> *Subject:* Moving ExecCGI to mod_perl - performance and custom 'modules' 
>>> [EXT]
>>> 
>>> Hi all,
>>> 
>>> So for many years I've been slack and writing perl scripts to do various 
>>> things - but never needed more than the normal apache +ExecCGI and Template 
>>> Toolkit.
>>> 
>>> One of my sites has become a bit more popular, so I'd like to spend a bit 
>>> of time on performance. Currently, I'm seeing ~300-400ms of what I believe 
>>> to be execution time of the script loading, running, and then blatting its 
>>> output to STDOUT and the browser can go do its thing. 
>>> 
>>> I believe most of the delay would be to do with loading perl, its modules 
>>> etc etc
>>> 
>>> I know that the current trend would be to re-write the entire site in a 
>>> more modern,