stas 2002/09/01 23:34:51
Modified: src/docs/2.0/user config.cfg
src/docs/2.0/user/config config.pod
src/docs/2.0/user/install install.pod
Added: src/docs/2.0/user/handlers filters.pod http.pod intro.pod
protocols.pod server.pod
Removed: src/docs/2.0/user/handlers handlers.pod
Log:
split the handlers chapter into several chapters for each topic
Revision Changes Path
1.13 +10 -2 modperl-docs/src/docs/2.0/user/config.cfg
Index: config.cfg
===================================================================
RCS file: /home/cvs/modperl-docs/src/docs/2.0/user/config.cfg,v
retrieving revision 1.12
retrieving revision 1.13
diff -u -r1.12 -r1.13
--- config.cfg 13 Aug 2002 11:39:02 -0000 1.12
+++ config.cfg 2 Sep 2002 06:34:51 -0000 1.13
@@ -21,11 +21,19 @@
config/config.pod
)],
- group => 'Coding Techniques',
+ group => 'Coding',
chapters => [qw(
- handlers/handlers.pod
compat/compat.pod
coding/coding.pod
+ )],
+
+ group => 'mod_perl Handlers',
+ chapters => [qw(
+ handlers/intro.pod
+ handlers/server.pod
+ handlers/protocols.pod
+ handlers/http.pod
+ handlers/filters.pod
)],
group => 'Troubleshooting',
1.23 +7 -7 modperl-docs/src/docs/2.0/user/config/config.pod
Index: config.pod
===================================================================
RCS file: /home/cvs/modperl-docs/src/docs/2.0/user/config/config.pod,v
retrieving revision 1.22
retrieving revision 1.23
diff -u -r1.22 -r1.23
--- config.pod 2 Sep 2002 03:38:50 -0000 1.22
+++ config.pod 2 Sep 2002 06:34:51 -0000 1.23
@@ -673,7 +673,7 @@
PerlPreConnectionHandler ITERATE SRV
PerlProcessConnectionHandler ITERATE SRV
-
+
PerlPostReadRequestHandler ITERATE SRV
PerlTransHandler ITERATE SRV
PerlInitHandler ITERATE DIR
@@ -686,7 +686,7 @@
PerlResponseHandler ITERATE DIR
PerlLogHandler ITERATE DIR
PerlCleanupHandler ITERATE DIR
-
+
PerlInputFilterHandler ITERATE DIR
PerlOutputFilterHandler ITERATE DIR
@@ -756,9 +756,9 @@
=item DIR
C<E<lt>DirectoryE<gt>>, C<E<lt>LocationE<gt>>, C<E<lt>FilesE<gt>> and
-all their regular expression variants (mnemonic: I<DIRectory>). These directives
-can also appear in I<.htaccess> files. These directives are defined
-as C<OR_ALL> in the source code.
+all their regular expression variants (mnemonic: I<DIRectory>). These
+directives can also appear in I<.htaccess> files. These directives
+are defined as C<OR_ALL> in the source code.
These directives can also appear in the global server configuration
and C<E<lt>VirtualHostE<gt>>.
@@ -769,8 +769,8 @@
used by the core mod_perl directives and their definition can be found
in I<include/httpd_config.h> (hint: search for C<RSRC_CONF>).
-Also see L<Perl*Handler
-Types|docs::2.0::user::handlers::handlers/Perl_Handler_Types>.
+Also see L<Single Phase's Multiple Handlers
+Behavior|docs::2.0::user::handlers::intro/Single_Phase_s_Multiple_Handlers_Behavior>.
1.1 modperl-docs/src/docs/2.0/user/handlers/filters.pod
Index: filters.pod
===================================================================
=head1 NAME
Input and Output Filters
=head1 Description
This chapter discusses mod_perl's input and output filter handlers.
=head1 I/O Filtering
Apache 2.0 considers all incoming and outgoing data as chunks of
information, disregarding their kind and source or storage
methods. These data chunks are stored in I<buckets>, which form
I<bucket brigades>. Both input and output filters filter the data in
bucket brigades.
=head2 PerlInputFilterHandler
The C<PerlInputFilterHandler> handler registers a filter for input
filtering.
This handler is of type
C<L<VOID|docs::2.0::user::handlers::intro/item_VOID>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
The following sections include several examples that use the
C<PerlInputFilterHandler> handler.
=head2 PerlOutputFilterHandler
The C<PerlOutputFilterHandler> handler registers and configures output
filters.
This handler is of type
C<L<VOID|docs::2.0::user::handlers::intro/item_VOID>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
The following sections include several examples that use the
C<PerlOutputFilterHandler> handler.
=head2 Connection vs. HTTP Request Filters
Currently the mod_perl filters allow connection and request level
filtering. Apache supports several other types, which mod_perl 2.0
will probably support in the future. mod_perl filter handlers specify
the type of the filter using the method attributes.
Request filter handlers are declared using the C<FilterRequestHandler>
attribute. Consider the following request input and output filters
skeleton:
package MyApache::FilterRequestFoo;
use base qw(Apache::Filter);
sub input : FilterRequestHandler {
my($filter, $bb, $mode, $block, $readbytes) = @_;
#...
}
sub output : FilterRequestHandler {
my($filter, $bb) = @_;
#...
}
1;
If the attribute is not specified, the default C<FilterRequestHandler>
attribute is assumed. Filters specifying subroutine attributes must
subclass C<Apache::Filter>, others only need to:
use Apache::Filter ();
The request filters are usually configured in the
C<E<lt>LocationE<gt>> or equivalent sections:
PerlModule MyApache::FilterRequestFoo
PerlModule MyApache::NiceResponse
<Location /filter_foo>
SetHandler modperl
PerlResponseHandler MyApache::NiceResponse
PerlInputFilterHandler MyApache::FilterRequestFoo::input
PerlOutputFilterHandler MyApache::FilterRequestFoo::output
</Location>
Now we have the request input and output filters configured.
The connection filter handler uses the C<FilterConnectionHandler>
attribute. Here is a similar example for the connection input and
output filters.
package MyApache::FilterConnectionBar;
use base qw(Apache::Filter);
sub input : FilterConnectionHandler {
my($filter, $bb, $mode, $block, $readbytes) = @_;
#...
}
sub output : FilterConnectionHandler {
my($filter, $bb) = @_;
#...
}
1;
This time the configuration must be done outside the
C<E<lt>LocationE<gt>> or equivalent sections, usually within the
C<E<lt>VirtualHostE<gt>> or the global server configuration:
Listen 8005
<VirtualHost _default_:8005>
PerlModule MyApache::FilterConnectionBar
PerlModule MyApache::NiceResponse
PerlInputFilterHandler MyApache::FilterConnectionBar::input
PerlOutputFilterHandler MyApache::FilterConnectionBar::output
<Location />
SetHandler modperl
PerlResponseHandler MyApache::NiceResponse
</Location>
</VirtualHost>
This accomplishes the configuration of the connection input and output
filters.
Notice that for HTTP requests the only difference between connection
filters and request filters is that the former see everything: the
headers and the body, whereas the latter see only the body.
[META: This belongs to the Apache::Filter manpage and should be moved
there when this page is created.
Inside a connection filter the current connection object can be
retrieved with:
my $c = $filter->c;
Inside a request filter the current request object can be retrieved
with:
my $r = $filter->r;
]
mod_perl provides two interfaces to filtering: a direct bucket
brigades manipulation interface and a simpler, stream-oriented
interface (XXX: as of this writing the latter is available only for
the output filtering). The examples in the following sections will
help you to understand the difference between the two interfaces.
=head1 All-in-One Filter
Before we delve into the details of how to write filters that do
something with the data, let's first write a simple filter that does
nothing but snoop on the data that goes through it. We are going to
develop the C<MyApache::FilterSnoop> handler which can snoop on
request and connection filters, in input and output modes.
But first let's develop a simple response handler that simply dumps
the request's I<args> and I<content> as strings:
file:MyApache/Dump.pm
---------------------
package MyApache::Dump;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::RequestIO ();
use Apache::Const -compile => qw(OK M_POST);
sub handler {
my $r = shift;
$r->content_type('text/plain');
$r->print("args:\n", $r->args, "\n");
if ($r->method_number == Apache::M_POST) {
my $data = content($r);
$r->print("content:\n$data\n");
}
return Apache::OK;
}
sub content {
my $r = shift;
$r->setup_client_block;
return '' unless $r->should_client_block;
my $len = $r->headers_in->get('content-length');
my $buf;
$r->get_client_block($buf, $len);
return $buf;
}
1;
which is configured as:
PerlModule MyApache::Dump
<Location /dump>
SetHandler modperl
PerlResponseHandler MyApache::Dump
</Location>
If we issue the following request:
% echo "mod_perl rules" | POST 'http://localhost:8002/dump?foo=1&bar=2'
the response will be:
args:
foo=1&bar=2
content:
mod_perl rules
As you can see it simply dumped the query string and the posted data.
Now let's write the snooping filter:
file:MyApache/FilterSnoop.pm
----------------------------
package MyApache::FilterSnoop;
use strict;
use warnings;
use base qw(Apache::Filter);
use Apache::FilterRec ();
use APR::Brigade ();
use APR::Bucket ();
use Apache::Const -compile => qw(OK DECLINED);
use APR::Const -compile => ':common';
sub connection : FilterConnectionHandler { snoop("connection", @_) }
sub request : FilterRequestHandler { snoop("request", @_) }
sub snoop {
my $type = shift;
my($filter, $bb, $mode, $block, $readbytes) = @_; # filter args
# $mode, $block, $readbytes are passed only for input filters
my $stream = defined $mode ? "input" : "output";
# read the data and pass-through the bucket brigades unchanged
my $ra_data = '';
if (defined $mode) {
# input filter
my $rv = $filter->next->get_brigade($bb, $mode, $block, $readbytes);
return $rv unless $rv == APR::SUCCESS;
$ra_data = bb_sniff($bb);
}
else {
# output filter
$ra_data = bb_sniff($bb);
my $rv = $filter->next->pass_brigade($bb);
return $rv unless $rv == APR::SUCCESS;
}
# send the sniffed info to STDERR so as not to interfere with normal
# output
my $direction = $stream eq 'output' ? ">>>" : "<<<";
print STDERR "\n$direction $type $stream filter\n";
my $c = 1;
while (my($btype, $data) = splice @$ra_data, 0, 2) {
print STDERR " o bucket $c: $btype\n";
print STDERR "[$data]\n";
$c++;
}
return Apache::OK;
}
sub bb_sniff {
my $bb = shift;
my @data;
for (my $b = $bb->first; $b; $b = $bb->next($b)) {
$b->read(my $bdata);
$bdata = '' unless defined $bdata;
push @data, $b->type->name, $bdata;
}
return \@data;
}
1;
This package provides two filter handlers, one for connection and
another for request filtering:
sub connection : FilterConnectionHandler { snoop("connection", @_) }
sub request : FilterRequestHandler { snoop("request", @_) }
Both handlers forward their arguments to the C<snoop()> function that
does the real job. We needed to add these two subroutines in order to
assign the two different attributes. In addition, the functions pass
the filter type to C<snoop()> as the first argument, which gets shifted
off C<@_>; the rest of C<@_> holds the arguments that were originally
passed to the filter handler.
It's easy to know whether a filter handler is running in the input or
the output mode. The arguments C<$filter> and C<$bb> are always
passed, whereas the arguments C<$mode>, C<$block>, and C<$readbytes>
are passed only to input filter handlers.
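The direction check can be tried outside the server. What follows is a minimal standalone sketch, not mod_perl code: C<filter_direction()> is a hypothetical helper, and plain strings stand in for the real filter and brigade objects. It tests C<$mode> with C<defined> rather than for truth, so a mode value of 0 still counts as "passed".

```perl
use strict;
use warnings;

# Hypothetical helper: decides the direction the same way snoop() does,
# by checking whether the third argument ($mode) was passed at all.
sub filter_direction {
    my ($filter, $bb, $mode, $block, $readbytes) = @_;
    return defined $mode ? "input" : "output";
}

# Placeholder strings stand in for the real filter and brigade objects.
print filter_direction("filter", "brigade", 0, 1, 8192), "\n";  # five args
print filter_direction("filter", "brigade"), "\n";              # two args
```

Run as a plain script, this prints C<input> for the five-argument call and C<output> for the two-argument call.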
If we are in the input mode, we retrieve the bucket brigade and
immediately link it to C<$bb> which makes the brigade available to the
next filter. When this filter handler returns, the next filter on the
stack will get the brigade. If we forget to perform this linking our
filter will become a black hole in which data simply disappears. Next
we call C<bb_sniff()> which returns the type and the content of the
buckets in the brigade.
If we are in the output mode, C<$bb> already points to the current
bucket brigade. Therefore we can read the contents of the brigade
right away. After that we pass the brigade to the next filter.
Finally we dump to STDERR the filter type, the current mode, and the
contents of the bucket brigade.
Let's snoop on connection and request filter levels in both
directions by applying the following configuration:
Listen 8008
<VirtualHost _default_:8008>
PerlModule MyApache::FilterSnoop
PerlModule MyApache::Dump
# Connection filters
PerlInputFilterHandler MyApache::FilterSnoop::connection
PerlOutputFilterHandler MyApache::FilterSnoop::connection
<Location /dump>
SetHandler modperl
PerlResponseHandler MyApache::Dump
# Request filters
PerlInputFilterHandler MyApache::FilterSnoop::request
PerlOutputFilterHandler MyApache::FilterSnoop::request
</Location>
</VirtualHost>
Notice that we use a virtual host because we want to install
connection filters.
If we issue the following request:
% echo "mod_perl rules" | POST 'http://localhost:8008/dump?foo=1&bar=2'
We get the same response, because our snooping filter didn't change
anything. However, a lot of output was printed to I<error_log>. We
present it all here, since it helps a lot in understanding how filters
work.
First we can see the connection input filter at work, as it processes
the HTTP headers. We can see that for this request each header is put
into a separate brigade with a single bucket. The data is conveniently
enclosed by C<[]> so you can see the new line characters as well.
<<< connection input filter
o bucket 1: HEAP
[POST /dump?foo=1&bar=2 HTTP/1.1
]
<<< connection input filter
o bucket 1: HEAP
[TE: deflate,gzip;q=0.3
]
<<< connection input filter
o bucket 1: HEAP
[Connection: TE, close
]
<<< connection input filter
o bucket 1: HEAP
[Host: localhost:8008
]
<<< connection input filter
o bucket 1: HEAP
[User-Agent: lwp-request/2.01
]
<<< connection input filter
o bucket 1: HEAP
[Content-Length: 14
]
<<< connection input filter
o bucket 1: HEAP
[Content-Type: application/x-www-form-urlencoded
]
<<< connection input filter
o bucket 1: HEAP
[
]
Here the HTTP header has been terminated by a double new line. So far
all the buckets were of the I<HEAP> type, meaning that they were
allocated from the heap memory. Notice that the request input filters
will never see the bucket brigade with the HTTP headers, because it has
already been consumed by the last Apache core connection handler.
The following two entries are generated when
C<MyApache::Dump::handler> reads the POSTed content:
<<< connection input filter
o bucket 1: HEAP
[mod_perl rules]
<<< request input filter
o bucket 1: HEAP
[mod_perl rules]
o bucket 2: EOS
[]
As we saw earlier in the diagram, the connection input filter is run
before the request input filter. Since our connection input filter was
passing the data through unmodified and no other connection input
filter was configured, the request input filter sees the same
data. The last bucket in the brigade received by the request input
filter is of type I<EOS>, meaning that all the input data from the
current request has been received.
Next we can see that C<MyApache::Dump::handler> has generated its
response. However only the request output filter is filtering it at
this point:
>>> request output filter
o bucket 1: TRANSIENT
[args:
foo=1&bar=2
content:
mod_perl rules
]
This happens because Apache hasn't yet sent the response HTTP headers
to the client. Apache postpones sending the headers so it can calculate
and set the C<Content-Length> header. This time the brigade consists
of a single bucket of type I<TRANSIENT>, which is allocated from the
stack memory and will eventually be converted to the I<HEAP> type
before the body of the response is sent to the client.
When the content handler returns Apache sends the HTTP headers through
connection output filters (notice that the request output filters
don't see it):
>>> connection output filter
o bucket 1: HEAP
[HTTP/1.1 200 OK
Date: Wed, 14 Aug 2002 07:31:53 GMT
Server: Apache/2.0.41-dev (Unix) mod_perl/1.99_05-dev
Perl/v5.8.0 mod_ssl/2.0.41-dev OpenSSL/0.9.6d DAV/2
Content-Length: 42
Connection: close
Content-Type: text/plain; charset=ISO-8859-1
]
Now the response body in the bucket of type I<HEAP> is passed through
the connection output filter, followed by the I<EOS> bucket to mark
the end of the request:
>>> connection output filter
o bucket 1: HEAP
[args:
foo=1&bar=2
content:
mod_perl rules
]
o bucket 2: EOS
[]
Finally the output is flushed, to make sure that any buffered output
is sent to the client:
>>> connection output filter
o bucket 1: FLUSH
[]
This module helps to understand that each filter handler can be called
many times during each request and connection. It's called for each
bucket brigade.
Also it's important to notice that the request input filter is called
only if there is some POSTed data to read. If you run the same request
without POSTing any data, or simply run a GET request, the request
input filter won't be called.
=head1 Input Filters
mod_perl supports L<Connection|/Connection_Input_Filters> and L<HTTP
Request|/HTTP_Request_Input_Filters> input filters:
=head2 Connection Input Filters
Let's say that we want to test how our handlers behave when they are
requested as C<HEAD> requests, rather than C<GET>. We can alter the
request headers at the incoming connection level transparently to all
handlers. So here is the input filter handler that does that by
directly manipulating the bucket brigades:
file:MyApache/InputFilterGET2HEAD.pm
-----------------------------------
package MyApache::InputFilterGET2HEAD;
use strict;
use warnings;
use base qw(Apache::Filter);
use Apache::RequestRec ();
use Apache::RequestIO ();
use APR::Brigade ();
use APR::Bucket ();
use Apache::Const -compile => 'OK';
use APR::Const -compile => ':common';
sub handler : FilterConnectionHandler {
my($filter, $bb, $mode, $block, $readbytes) = @_;
my $c = $filter->c;
my $ctx_bb = APR::Brigade->new($c->pool, $c->bucket_alloc);
my $rv = $filter->next->get_brigade($ctx_bb, $mode, $block, $readbytes);
return $rv unless $rv == APR::SUCCESS;
while (!$ctx_bb->empty) {
my $bucket = $ctx_bb->first;
$bucket->remove;
if ($bucket->is_eos) {
$bb->insert_tail($bucket);
last;
}
my $data;
my $status = $bucket->read($data);
return $status unless $status == APR::SUCCESS;
if ($data and $data =~ s|^GET|HEAD|) {
$bucket = APR::Bucket->new($data);
}
$bb->insert_tail($bucket);
}
Apache::OK;
}
1;
The filter handler is called for each bucket brigade, which in turn
includes buckets with data. The gist of any filter handler is to
retrieve the bucket brigade sent from the previous filter, prepare a
new empty brigade, and move buckets from the former brigade to the
latter optionally modifying the buckets on the way, which may include
removing or adding new buckets. Of course if the filter doesn't want
to modify any of the buckets it may decide to pass through the
original brigade without doing any work.
In our example the handler first removes the bucket at the top of the
brigade and looks at its type. If it sees an end of stream, that
removed bucket is linked to the tail of the bucket brigade that will
go to the next filter and it doesn't attempt to read any more
buckets. If this event doesn't happen the handler reads the data from
that bucket and if it finds that the data is of interest to us, it
modifies the data, creates a new bucket using the modified data and
links it to the tail of the outgoing brigade, while discarding the
original bucket. In our case the interesting data is anything that
matches the regular expression C</^GET/>. If the data is not
interesting to the handler, it simply links the unmodified bucket to
the outgoing brigade.
The handler looks for data like:
GET /perl/test.pl HTTP/1.1
and turns it into:
HEAD /perl/test.pl HTTP/1.1
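The bucket-moving loop described above can be modelled in plain Perl. This is only a sketch of the data flow, under the assumption that an array of hash references is an acceptable stand-in for a brigade of buckets; it does not use the C<APR::Brigade> or C<APR::Bucket> API, and C<get2head()> is a hypothetical name.

```perl
use strict;
use warnings;

# Each "bucket" is a hash ref; an array of them models a brigade.
# This models only the data flow, not the real APR API.
sub get2head {
    my ($incoming) = @_;                 # array ref of buckets
    my @outgoing;
    while (@$incoming) {
        my $bucket = shift @$incoming;   # remove the bucket at the head
        if ($bucket->{type} eq 'EOS') {
            push @outgoing, $bucket;     # pass end-of-stream on and stop
            last;
        }
        my $data = $bucket->{data};
        if (defined $data and $data =~ s/^GET/HEAD/) {
            # replace the original bucket with one holding the new data
            $bucket = { type => 'HEAP', data => $data };
        }
        push @outgoing, $bucket;         # link to the outgoing brigade
    }
    return \@outgoing;
}

my $out = get2head([
    { type => 'HEAP', data => "GET /perl/test.pl HTTP/1.1\n" },
    { type => 'HEAP', data => "Host: localhost:8005\n" },
]);
print $_->{data} for @$out;
```

Only the bucket whose data starts with C<GET> is replaced; the C<Host> bucket is passed through unchanged.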
For example, consider the following response handler:
file:MyApache/RequestType.pm
---------------------------
package MyApache::RequestType;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::RequestIO ();
use Apache::Const -compile => 'OK';
sub handler {
my $r = shift;
$r->content_type('text/plain');
$r->print("the request type was " . $r->method);
Apache::OK;
}
1;
which returns to the client the type of the request it has issued. In
the case of the C<HEAD> request Apache will discard the response body,
but it will still set the correct C<Content-Length> header, which will
be 24 in case of the C<GET> request and 25 for C<HEAD>. Therefore if
this response handler is configured as:
Listen 8005
<VirtualHost _default_:8005>
<Location />
SetHandler modperl
PerlResponseHandler +MyApache::RequestType
</Location>
</VirtualHost>
and a C<GET> request is issued to I</>:
panic% perl -MLWP::UserAgent -le \
'$r = LWP::UserAgent->new()->get("http://localhost:8005/"); \
print $r->headers->content_length . ": ". $r->content'
24: the request type was GET
where the response's body is:
the request type was GET
And the C<Content-Length> header is set to 24.
However if we enable the C<MyApache::InputFilterGET2HEAD> input
connection filter:
Listen 8005
<VirtualHost _default_:8005>
PerlInputFilterHandler +MyApache::InputFilterGET2HEAD
<Location />
SetHandler modperl
PerlResponseHandler +MyApache::RequestType
</Location>
</VirtualHost>
And issue the same C<GET> request, we get only:
25:
which means that the body was discarded by Apache, because our filter
turned the C<GET> request into a C<HEAD> request. If Apache had not
discarded the body on C<HEAD>, the response would have been:
the request type was HEAD
that's why the content length is reported as 25 and not 24 as in the
real GET request.
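The two lengths are easy to check by hand. The following standalone sketch rebuilds the body string the same way C<MyApache::RequestType> does and measures it; C<body_for()> is a hypothetical helper introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the body MyApache::RequestType would produce
# for a given request method name.
sub body_for {
    my ($method) = @_;
    return "the request type was " . $method;
}

for my $method (qw(GET HEAD)) {
    printf "%-4s => %d bytes\n", $method, length body_for($method);
}
```

The fixed prefix is 21 characters long, so C<GET> gives 24 and C<HEAD> gives 25.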
=head2 HTTP Request Input Filters
Request filters are not really different from connection filters,
other than that they work on request and response bodies and have
access to a request object. The filter implementation is pretty much
identical. Let's look at C<MyApache::InputRequestFilterLC>, a request
input filter that lowercases the request's body:
file:MyApache/InputRequestFilterLC.pm
-------------------------------------
package MyApache::InputRequestFilterLC;
use strict;
use warnings;
use base qw(Apache::Filter);
use APR::Brigade ();
use APR::Bucket ();
use Apache::Const -compile => 'OK';
use APR::Const -compile => ':common';
sub handler : FilterRequestHandler {
my($filter, $bb, $mode, $block, $readbytes) = @_;
my $c = $filter->c;
my $bb_ctx = APR::Brigade->new($c->pool, $c->bucket_alloc);
my $rv = $filter->next->get_brigade($bb_ctx, $mode, $block, $readbytes);
return $rv unless $rv == APR::SUCCESS;
while (!$bb_ctx->empty) {
my $b = $bb_ctx->first;
$b->remove;
if ($b->is_eos) {
$bb->insert_tail($b);
last;
}
my $data;
my $status = $b->read($data);
return $status unless $status == APR::SUCCESS;
$b = APR::Bucket->new(lc $data) if $data;
$bb->insert_tail($b);
}
Apache::OK;
}
1;
Now if we use the C<MyApache::Dump> response handler that we
developed earlier in this chapter, which dumps the query string and the
content body as a response, and configure the server as follows:
<Location /lc_input>
SetHandler modperl
PerlResponseHandler +MyApache::Dump
PerlInputFilterHandler +MyApache::InputRequestFilterLC
</Location>
When issuing a POST request:
% echo "mOd_pErl RuLeS" | POST 'http://localhost:8002/lc_input?FoO=1&BAR=2'
we get a response:
args:
FoO=1&BAR=2
content:
mod_perl rules
Indeed, we can see that our filter has lowercased the POSTed body
before the content handler received it, while the query string wasn't
changed.
=head1 Output Filters
mod_perl supports L<Connection|/Connection_Output_Filters> and L<HTTP
Request|/HTTP_Request_Output_Filters> output filters:
=head2 Connection Output Filters
Connection filters filter B<all> the data that is going through the
server. Therefore if the connection is of HTTP request type,
connection output filters see the headers and the body of the
response, whereas request output filters see only the response body.
META: for now see the request output filter explanations and examples,
connection output filter examples will be added soon. Interesting
ideas for such filters are welcome (mainly for munging output headers
I suppose).
=head2 HTTP Request Output Filters
As mentioned earlier output filters can be written using the bucket
brigades manipulation or the simplified stream-oriented interface.
First let's develop a response handler that sends two lines of output:
numerals 0-9 and the English alphabet in a single string:
file:MyApache/SendAlphaNum.pm
-------------------------------
package MyApache::SendAlphaNum;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::RequestIO ();
use Apache::Const -compile => qw(OK);
sub handler {
my $r = shift;
$r->content_type('text/plain');
$r->print(1..9, "0\n");
$r->print('a'..'z', "\n");
Apache::OK;
}
1;
The purpose of our request output filter is to reverse every line of
the response, preserving the new line characters in their
places. Since we want to reverse characters only in the response body
we will use the request output filters.
=head3 Stream-oriented Output Filter
The first filter implementation is using the stream-oriented filtering
API:
file:MyApache/FilterReverse1.pm
----------------------------
package MyApache::FilterReverse1;
use strict;
use warnings;
use base qw(Apache::Filter);
use Apache::Const -compile => qw(OK);
use constant BUFF_LEN => 1024;
sub handler : FilterRequestHandler {
my $filter = shift;
while ($filter->read(my $buffer, BUFF_LEN)) {
for (split "\n", $buffer) {
$filter->print(scalar reverse $_);
$filter->print("\n");
}
}
Apache::OK;
}
1;
Next, we add the following configuration to I<httpd.conf>:
PerlModule MyApache::FilterReverse1
PerlModule MyApache::SendAlphaNum
<Location /reverse1>
SetHandler modperl
PerlResponseHandler MyApache::SendAlphaNum
PerlOutputFilterHandler MyApache::FilterReverse1
</Location>
Now when a request to I</reverse1> is made, the response handler
C<MyApache::SendAlphaNum::handler()> sends:
1234567890
abcdefghijklmnopqrstuvwxyz
as a response and the output filter handler
C<MyApache::FilterReverse1::handler> reverses the lines, so the client
gets:
0987654321
zyxwvutsrqponmlkjihgfedcba
The C<Apache::Filter> module loads the C<read()> and C<print()>
methods which encapsulate the stream-oriented filtering interface.
The reversing filter is quite simple: in the loop it reads the data in
the I<readline()> mode in chunks up to the buffer length (1024 in our
example), and then prints each line reversed while preserving the new
line control characters at the end of each line. Behind the scenes
C<$filter-E<gt>read()> retrieves the incoming brigade and gets the
data from it, whereas C<$filter-E<gt>print()> appends to the new
brigade which is then sent to the next filter in the stack. C<read()>
breaks the while loop, when the brigade is emptied or the end of
stream is received.
In order not to distract the reader from the purpose of the example,
the code is oversimplified: it won't correctly handle input lines
which are longer than 1024 characters or which use a different line
termination pattern. So here is an example of a more complete handler,
which does take care of these issues:
sub handler {
my $filter = shift;
my $left_over = '';
while ($filter->read(my $buffer, BUFF_LEN)) {
$buffer = $left_over . $buffer;
$left_over = '';
while ($buffer =~ /([^\r\n]*)([\r\n]*)/g) {
$left_over = $1, last unless $2;
$filter->print(scalar(reverse $1), $2);
}
}
$filter->print(scalar reverse $left_over) if length $left_over;
Apache::OK;
}
In this handler the lines longer than the buffer's length are buffered
up in C<$left_over> and processed only when the whole line is read in,
or if there is no more input the buffered up text is flushed before
the end of the handler.
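The left-over logic can be exercised without a server. In this sketch the list of chunks passed to the hypothetical C<reverse_lines()> function stands in for what successive C<$filter-E<gt>read()> calls would return, and the return value stands in for everything passed to C<$filter-E<gt>print()>; the loop body itself follows the handler above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the filter: each element of @chunks plays
# the role of one $filter->read() result; the returned string is what
# the filter would have printed.
sub reverse_lines {
    my @chunks    = @_;
    my $out       = '';
    my $left_over = '';
    for my $chunk (@chunks) {
        my $buffer = $left_over . $chunk;
        $left_over = '';
        while ($buffer =~ /([^\r\n]*)([\r\n]*)/g) {
            # no line terminator seen yet: keep the text for the next chunk
            $left_over = $1, last unless $2;
            $out .= scalar(reverse $1) . $2;
        }
    }
    # flush whatever is still buffered when the input ends
    $out .= scalar reverse $left_over if length $left_over;
    return $out;
}

# The first line is split across two chunks, the second across two more.
print reverse_lines("12345", "67890\nabc", "def\n");
```

Even though neither line arrives in one piece, the output is C<0987654321> followed by C<fedcba>, each with its original line terminator.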
=head3 Bucket Brigade-based Output Filters
The second filter implementation is using the bucket brigades API to
accomplish exactly the same task as the first filter.
package MyApache::FilterReverse2;
use strict;
use warnings;
use base qw(Apache::Filter);
use APR::Brigade ();
use APR::Bucket ();
use Apache::Const -compile => 'OK';
use APR::Const -compile => ':common';
sub handler : FilterRequestHandler {
my($filter, $bb) = @_;
my $c = $filter->c;
my $bb_ctx = APR::Brigade->new($c->pool, $c->bucket_alloc);
while (!$bb->empty) {
my $bucket = $bb->first;
$bucket->remove;
if ($bucket->is_eos) {
$bb_ctx->insert_tail($bucket);
last;
}
my $data;
my $status = $bucket->read($data);
return $status unless $status == APR::SUCCESS;
if ($data) {
$data = join "",
map {scalar(reverse $_), "\n"} split "\n", $data;
$bucket = APR::Bucket->new($data);
}
$bb_ctx->insert_tail($bucket);
}
my $rv = $filter->next->pass_brigade($bb_ctx);
return $rv unless $rv == APR::SUCCESS;
Apache::OK;
}
1;
and the corresponding configuration:
PerlModule MyApache::FilterReverse2
PerlModule MyApache::SendAlphaNum
<Location /reverse2>
SetHandler modperl
PerlResponseHandler MyApache::SendAlphaNum
PerlOutputFilterHandler MyApache::FilterReverse2
</Location>
Now when a request to I</reverse2> is made, the client gets:
0987654321
zyxwvutsrqponmlkjihgfedcba
as expected.
The bucket brigades output filter version is just a bit more
complicated than the stream-oriented one. The handler receives the
incoming bucket brigade C<$bb> as its second argument. Because the
handler must pass a brigade to the next filter in the stack when it
completes, we create a new bucket brigade, put the modified buckets
into it, and eventually pass it to the next filter.
The core of the handler is in removing buckets from the head of the
bucket brigade C<$bb> while there are some, reading the data from the
buckets, reversing and putting it into a newly created bucket which is
inserted to the end of the new bucket brigade. If we see a bucket
which designates the end of stream, we insert that bucket to the tail
of the new bucket brigade and break the loop. Finally we pass the
created brigade with modified data to the next filter and return.
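The per-bucket transformation can be tried on plain strings. This sketch applies the same C<split>/C<reverse>/C<join> idea to a scalar; C<reverse_bucket_data()> is a hypothetical name, and the string argument stands in for the data read from one bucket.

```perl
use strict;
use warnings;

# Hypothetical helper: the string argument plays the role of the data
# read from a single bucket.
sub reverse_bucket_data {
    my ($data) = @_;
    # split drops the newlines, so one is re-appended after each piece
    return join "", map { scalar(reverse $_) . "\n" } split "\n", $data;
}

print reverse_bucket_data("1234567890\n");
print reverse_bucket_data("abcdefghijklmnopqrstuvwxyz\n");

# A bucket that ends in the middle of a line still gets a newline added.
print reverse_bucket_data("abc\ndef");
```

The last call shows a limitation of this approach: because a newline is appended after every piece, data that ends mid-line comes out with an extra line terminator, so the simplification is only safe when each bucket holds whole lines.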
=head1 Filter Tips and Tricks
Various tips to use in filters.
=head2 Altering the Content-Type Response Header
Let's say that you want to modify the C<Content-Type> header in the
request output filter:
sub handler : FilterRequestHandler {
my $filter = shift;
...
$filter->r->content_type("text/html; charset=$charset");
...
Request filters have access to the request object, so we simply
modify it.
=head1 Maintainers
Maintainer is the person(s) you should contact with updates,
corrections and patches.
=over
=item *
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=back
=head1 Authors
=over
=item *
=back
Only the major authors are listed above. For contributors see the
Changes file.
=cut
1.1 modperl-docs/src/docs/2.0/user/handlers/http.pod
Index: http.pod
===================================================================
=head1 NAME
HTTP Handlers
=head1 Description
This chapter explains how to implement the HTTP protocol handlers in
mod_perl.
=head1 HTTP Request Cycle Phases
Those familiar with mod_perl 1.0 will find the HTTP request cycle in
mod_perl 2.0 to be almost identical to the mod_perl 1.0's model. The
only difference is in the I<response> phase which now includes
filtering. Also the C<PerlHandler> directive has been renamed to
C<PerlResponseHandler> to better match the corresponding Apache phase
name (I<response>).
The following diagram depicts the HTTP request life cycle and
highlights which handlers are available to mod_perl 2.0:
=for html
<img src="http_cycle.gif" width="600" height="560"
align="center" valign="middle" alt="HTTP cycle"><br><br>
From the diagram it can be seen that an HTTP request is processed in
11 phases, executed in the following order:
=over
=item 1 PerlPostReadRequestHandler (PerlInitHandler)
=item 2 PerlTransHandler
=item 3 PerlHeaderParserHandler (PerlInitHandler)
=item 4 PerlAccessHandler
=item 5 PerlAuthenHandler
=item 6 PerlAuthzHandler
=item 7 PerlTypeHandler
=item 8 PerlFixupHandler
=item 9 PerlResponseHandler
=item 10 PerlLogHandler
=item 11 PerlCleanupHandler
=back
It's possible that the cycle will not be completed if any of the
phases terminates it, usually when an error happens.
Notice that when the response handler reads the input data, the data
can be filtered through request input filters, which are preceded by
connection input filters, if any. Similarly, the generated response is
first run through request output filters and eventually through
connection output filters before it's sent to the client. Filters are
discussed in detail in the filters chapter.
Now let's discuss each of the mentioned handlers in detail.
=head2 PerlPostReadRequestHandler
The I<post_read_request> phase is the first request phase and happens
immediately after the request has been read and the HTTP headers have
been parsed.
This phase is usually used to do processing that must happen once per
request. For example C<Apache::Reload> is usually invoked at this
phase to reload modified Perl modules.
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The handler's configuration scope is
C<L<SRV|docs::2.0::user::config::config/item_SRV>>, because at this
phase the request has not yet been associated with a particular
filename or directory.
Now, let's look at an example. Consider the following registry script:
touch.pl
--------
use strict;
use warnings;
use Apache::ServerUtil ();
use File::Spec::Functions qw(catfile);
my $r = shift;
$r->content_type('text/plain');
my $conf_file = catfile Apache::server_root_relative($r->pool, 'conf'),
"httpd.conf";
# pass the path as an argument so a '%' in it can't corrupt the format
printf "%s is %0.2f minutes old", $conf_file, 60*24*(-M $conf_file);
This registry script is supposed to print how long ago I<httpd.conf>
was last modified, relative to the time the request started. If you
run this script several times you might be surprised that it reports
the same value every time, unless the request happens to be served by
a recently started child process, which will report a different
value. Either way, most of the time the reported value is wrong.
This happens because the C<-M> operator reports the difference between
a file's modification time and the value of the special Perl variable
C<$^T>. When we run scripts from the command line, this variable is
always set to the time when the script gets invoked. Under mod_perl
this variable is preset once, when the child process starts, and
doesn't change after that, so all requests served by that child see
the same time when operators like C<-M>, C<-C> and C<-A> are used.
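The dependence of C<-M> on C<$^T> can be observed outside of mod_perl.
The following standalone sketch is not part of the chapter's handlers:
it creates a scratch file, backdates it by one hour, and then evaluates
C<-M> under two different values of C<$^T>. The helper name
C<age_in_minutes()> is made up for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file and backdate its modification time by one hour.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
my $now = time;
utime $now - 3600, $now - 3600, $path or die "utime failed: $!";

# -M yields ($^T - mtime) in days; convert to minutes as touch.pl does.
sub age_in_minutes { return 60 * 24 * (-M $path) }

$^T = $now;            # a fresh $^T, as if reset at the start of a request
my $fresh = age_in_minutes();

$^T = $now - 1800;     # a stale $^T, as in a child started 30 minutes ago
my $stale = age_in_minutes();

printf "fresh \$^T: %.2f minutes, stale \$^T: %.2f minutes\n", $fresh, $stale;
```

With a fresh C<$^T> the file is reported as roughly 60 minutes old, with
the stale one as roughly 30 minutes old, although the file itself has
not changed.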
Armed with this knowledge, in order to make our code behave like a
command line program we need to reset C<$^T> to the request's start
time before C<-M> is used. We can change the script itself, but what
if we need to make the same change in several other scripts and
handlers? A simple C<PerlPostReadRequestHandler> handler, which will
be executed at the very start of each request, comes in handy
here:
file:MyApache/TimeReset.pm
--------------------------
package MyApache::TimeReset;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::Const -compile => 'OK';
sub handler {
my $r = shift;
$^T = $r->request_time;
return Apache::OK;
}
1;
We could do:
$^T = time();
But to make things more efficient we use C<$r-E<gt>request_time> since
the request object C<$r> already stores the request's start time, so
we get it without performing an additional system call.
To enable it just add to I<httpd.conf>:
PerlPostReadRequestHandler MyApache::TimeReset
either to the global section, or to the C<E<lt>VirtualHostE<gt>>
section if you want this handler to be run only for a specific virtual
host.
=head2 PerlTransHandler
The I<translate> phase is used to perform the translation of a
request's URI into a corresponding filename. If no custom handler is
provided, the server's standard translation rules (e.g., C<Alias>
directives, mod_rewrite, etc.) will continue to be used. A
C<PerlTransHandler> handler can alter the default translation
mechanism or completely override it.
In addition to doing the translation, this stage can be used to modify
the URI itself and the request method. This is also a good place to
register new handlers for the following phases based on the URI.
This phase is of type
C<L<RUN_FIRST|docs::2.0::user::handlers::intro/item_RUN_FIRST>>.
The handler's configuration scope is
C<L<SRV|docs::2.0::user::config::config/item_SRV>>, because at this
phase the request has not yet been associated with a particular
filename or directory.
There are many useful things that can be performed at this
stage. Let's look at an example handler that rewrites request URIs,
similar to what mod_rewrite does. For example, if your web site was
originally made of static pages and you have now moved to dynamic
page generation, chances are that you don't want to change the old
URIs, because you don't want to break links for those who link to your
site. If the URI:
http://example.com/news/20021031/09/index.html
is now handled by:
http://example.com/perl/news.pl?date=20021031&id=09&page=index.html
the following handler can do the rewriting work transparently to
I<news.pl>, so you can still use the former URI mapping:
file:MyApache/RewriteURI.pm
---------------------------
package MyApache::RewriteURI;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::Const -compile => qw(DECLINED);
sub handler {
my $r = shift;
# leave URIs that don't have the old news layout untouched
my ($date, $id, $page) = $r->uri =~ m|^/news/(\d+)/(\d+)/(.*)|
or return Apache::DECLINED;
$r->uri("/perl/news.pl");
$r->args("date=$date&id=$id&page=$page");
return Apache::DECLINED;
}
1;
The handler matches the URI and, if it has the old I</news/> layout,
assigns a new URI via C<$r-E<gt>uri()> and the query string via
C<$r-E<gt>args()>. It then returns C<Apache::DECLINED>, so the next
translation handler will get invoked, if more rewrites and
translations are needed. URIs that don't match are passed on
unchanged.
Of course if you need to do a more complicated rewriting, this handler
can be easily adjusted to do so.
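The mapping itself can be tried without a running server. The sketch
below isolates the regular expression used by the handler in a plain
function; C<rewrite_news_uri()> is a hypothetical helper introduced
only for this illustration, and it returns an empty list for URIs that
don't have the old I</news/> layout.

```perl
use strict;
use warnings;

# Hypothetical helper: map an old static news URI to the new script URI
# and query string, or return an empty list if the URI doesn't match.
sub rewrite_news_uri {
    my $uri = shift;
    my ($date, $id, $page) = $uri =~ m|^/news/(\d+)/(\d+)/(.*)|
        or return;
    return ('/perl/news.pl', "date=$date&id=$id&page=$page");
}

for my $uri ('/news/20021031/09/index.html', '/images/logo.png') {
    my ($new_uri, $args) = rewrite_news_uri($uri);
    print defined $new_uri
        ? "$uri => $new_uri?$args\n"
        : "$uri => unchanged\n";
}
```

Keeping the pattern in one small function like this makes it easy to
check a more complicated rewriting rule before wiring it into a
C<PerlTransHandler>.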
To configure this module simply add to I<httpd.conf>:
PerlTransHandler +MyApache::RewriteURI
=head2 PerlHeaderParserHandler
The I<header_parser> phase is the first phase to happen after the
request has been mapped to its C<E<lt>LocationE<gt>> (or an equivalent
container). At this phase the handler can examine the request headers
and take a special action based on them. For example, this phase
can be used to block evil clients targeting certain resources before
many resources have been spent on the request.
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
This phase is very similar to
C<L<PerlPostReadRequestHandler|/PerlPostReadRequestHandler>>, with the
only difference that it's run after the request has been mapped to the
resource. Both phases are useful for doing something once per request,
as early as possible. And usually you can take any
C<L<PerlPostReadRequestHandler|/PerlPostReadRequestHandler>> and turn
it into C<L<PerlHeaderParserHandler|/PerlHeaderParserHandler>> by
simply changing the directive name in I<httpd.conf> and moving it
inside the container where it should be executed. Moreover, because
of this similarity mod_perl provides a special directive
C<L<PerlInitHandler|/PerlInitHandler>> which if found outside resource
containers behaves as
C<L<PerlPostReadRequestHandler|/PerlPostReadRequestHandler>>,
otherwise as C<L<PerlHeaderParserHandler|/PerlHeaderParserHandler>>.
You already know that Apache handles the C<HEAD>, C<GET>, C<POST> and
several other HTTP methods. But did you know that you can invent your
own HTTP method, as long as there is a client that supports it? If you
think of emails, they are very similar to HTTP messages: they have a
set of headers and a body, sometimes a multi-part body. Therefore we
can develop a handler that extends HTTP by adding support for the
C<EMAIL> method. We can enable this protocol extension and push the
real content handler during the
C<L<PerlHeaderParserHandler|/PerlHeaderParserHandler>> phase:
<Location /email>
PerlHeaderParserHandler MyApache::SendEmail
</Location>
and here is the C<MyApache::SendEmail> handler:
file:MyApache/SendEmail.pm
--------------------------
package MyApache::SendEmail;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::RequestIO ();
use Apache::RequestUtil ();
use Apache::Const -compile => qw(DECLINED OK);
use constant METHOD => 'EMAIL';
use constant SMTP_HOSTNAME => "localhost";
sub handler {
my $r = shift;
return Apache::DECLINED unless $r->method eq METHOD;
Apache::method_register($r->pool, METHOD);
$r->handler("perl-script");
$r->push_handlers(PerlResponseHandler => \&send_email_handler);
return Apache::OK;
}
sub send_email_handler {
my $r = shift;
my %headers = map {$_ => $r->headers_in->get($_)} qw(To From Subject);
# content() is the local helper defined below, not a method of $r
my $content = content($r);
my $status = send_email(\%headers, \$content);
$r->content_type('text/plain');
$r->print($status ? "ACK" : "NACK");
return Apache::OK;
}
sub content {
my $r = shift;
$r->setup_client_block;
return '' unless $r->should_client_block;
my $len = $r->headers_in->get('content-length');
my $buf;
$r->get_client_block($buf, $len);
return $buf;
}
sub send_email {
my($rh_headers, $r_body) = @_;
require MIME::Lite;
MIME::Lite->send("smtp", SMTP_HOSTNAME, Timeout => 60);
my $msg = MIME::Lite->new(%$rh_headers, Data => $$r_body);
#warn $msg->as_string;
$msg->send;
}
1;
Let's get the less interesting code out of the way. The function
C<content()> grabs the request body. The function C<send_email()>
sends the email over SMTP. You should adjust the constant
C<SMTP_HOSTNAME> to point to your outgoing SMTP server. You can
replace this function with your own if you prefer to use a different
method to send email.
Now to the more interesting functions. The function C<handler()>
returns immediately and passes the control to the next handler if the
request method is not equal to C<EMAIL> (set in the C<METHOD>
constant):
return Apache::DECLINED unless $r->method eq METHOD;
Next it tells Apache that this new method is a valid one and that the
C<perl-script> handler will do the processing. Finally it pushes the
function C<send_email_handler()> to the C<PerlResponseHandler> list of
handlers:
Apache::method_register($r->pool, METHOD);
$r->handler("perl-script");
$r->push_handlers(PerlResponseHandler => \&send_email_handler);
The function terminates the header_parser phase by:
return Apache::OK;
All other phases run as usual, so you can reuse any HTTP protocol
hooks, such as authentication and fixup phases.
When the response phase starts C<send_email_handler()> is invoked,
assuming that no other response handlers were inserted before it. The
response handler consists of three parts. First it retrieves the email
headers C<To>, C<From> and C<Subject>, and the body of the message:
my %headers = map {$_ => $r->headers_in->get($_)} qw(To From Subject);
my $content = content($r);
Then send the email:
my $status = send_email(\%headers, \$content);
Finally return to the client a simple response acknowledging that
email has been sent and finish the response phase by returning
C<Apache::OK>:
$r->content_type('text/plain');
$r->print($status ? "ACK" : "NACK");
return Apache::OK;
Of course you will want to add extra validations if you want to use
this code in production. This is just a proof of concept
implementation.
As already mentioned, when you extend the HTTP protocol you need to
have a client that knows how to use the extension. So here is a simple
client that uses C<LWP::UserAgent> to issue an C<EMAIL> method request
over HTTP:
file:send_http_email.pl
-----------------------
#!/usr/bin/perl
use strict;
use warnings;
require LWP::UserAgent;
my $url = "http://localhost:8000/email/";
my %headers = (
From => '[EMAIL PROTECTED]',
To => '[EMAIL PROTECTED]',
Subject => '3 weeks in Tibet',
);
my $content = <<EOI;
I didn't have an email software,
but could use HTTP so I'm sending it over HTTP
EOI
my $headers = HTTP::Headers->new(%headers);
my $req = HTTP::Request->new("EMAIL", $url, $headers, $content);
my $res = LWP::UserAgent->new->request($req);
print $res->is_success ? $res->content : "failed";
Most of the code is just custom data. The code that does something
consists of four lines at the very end: create the C<HTTP::Headers>
and C<HTTP::Request> objects, issue the request and get the
response, and finally print the response's content if it was
successful or just I<"failed"> if not.
Now save the client code in the file I<send_http_email.pl>, adjust the
I<To> field, make the file executable and execute it, after you have
restarted the server. You should receive an email shortly to the
address set in the I<To> field.
=head2 PerlInitHandler
When configured inside any container directive, except
C<E<lt>VirtualHostE<gt>>, this handler is an alias for
C<L<PerlHeaderParserHandler|/PerlHeaderParserHandler>>. Otherwise it
acts as an alias for
C<L<PerlPostReadRequestHandler|/PerlPostReadRequestHandler>>. Both
are described earlier in this chapter.
It is the first handler to be invoked when serving a request.
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The best example here is
C<L<Apache::Reload|docs::2.0::api::mod_perl-2.0::Apache::Reload>>,
which takes advantage of this directive. Usually
C<L<Apache::Reload|docs::2.0::api::mod_perl-2.0::Apache::Reload>> is
configured as:
PerlInitHandler Apache::Reload
PerlSetVar ReloadAll Off
PerlSetVar ReloadModules "MyApache::*"
which will monitor and reload all C<MyApache::*> modules that have
been modified since the last request. However if we move the global
configuration into a C<E<lt>LocationE<gt>> container:
<Location /devel>
PerlInitHandler Apache::Reload
PerlSetVar ReloadAll Off
PerlSetVar ReloadModules "MyApache::*"
SetHandler perl-script
PerlResponseHandler ModPerl::Registry
Options +ExecCGI
</Location>
C<L<Apache::Reload|docs::2.0::api::mod_perl-2.0::Apache::Reload>> will
reload the modified modules, only when a request to the I</devel>
namespace is issued, because C<L<PerlInitHandler|/PerlInitHandler>>
plays the role of
C<L<PerlHeaderParserHandler|/PerlHeaderParserHandler>> here.
=head2 PerlAccessHandler
The I<access_checker> phase is the first of three phases that are
involved in what's known as AAA: Authentication, Authorization, and
Access control.
This phase can be used to restrict access from a certain IP address,
time of the day or any other rule not connected to the user's
identity.
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
The concept behind an access checker handler is very simple: return
C<Apache::FORBIDDEN> if the access is not allowed, otherwise return
C<Apache::OK>.
The following example handler blocks requests made from IPs on the
blacklist.
file:MyApache/BlockByIP.pm
--------------------------
package MyApache::BlockByIP;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::Connection ();
use Apache::Const -compile => qw(FORBIDDEN OK);
my %bad_ips = map {$_ => 1} qw(127.0.0.1 10.0.0.4);
sub handler {
my $r = shift;
return exists $bad_ips{$r->connection->remote_ip}
? Apache::FORBIDDEN
: Apache::OK;
}
1;
The handler retrieves the connection's IP address, looks it up in the
hash of blacklisted IPs and forbids the access if found. If the IP is
not blacklisted, the handler returns control to the next access
checker handler, which may still block the access based on a different
rule.
To enable the handler simply add it to the container that needs to be
protected. For example, to protect access to the registry scripts
executed from the base location I</perl>, add:
<Location /perl/>
SetHandler perl-script
PerlResponseHandler ModPerl::Registry
PerlAccessHandler MyApache::BlockByIP
Options +ExecCGI
</Location>
=head2 PerlAuthenHandler
The I<check_user_id> (I<authen>) phase is called whenever the
requested file or directory is password protected. This, in turn,
requires that the directory be associated with C<AuthName>,
C<AuthType> and at least one C<require> directive.
This phase is usually used to verify a user's identification
credentials. If the credentials are verified to be correct, the
handler should return C<OK>. Otherwise the handler returns
C<AUTH_REQUIRED> to indicate that the user has not authenticated
successfully. When Apache sends the HTTP header with this code, the
browser will normally pop up a dialog box that prompts the user for
login information.
This phase is of type
C<L<RUN_FIRST|docs::2.0::user::handlers::intro/item_RUN_FIRST>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
The following handler authenticates users by asking for a username and
a password and lets them in only if the length of the string made from
the supplied username, a single space and the password equals the
secret length, specified by the constant C<SECRET_LENGTH>.
file:MyApache/SecretLengthAuth.pm
---------------------------------
package MyApache::SecretLengthAuth;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::Access ();
use Apache::Const -compile => qw(OK DECLINED AUTH_REQUIRED);
use constant SECRET_LENGTH => 14;
sub handler {
my $r = shift;
my ($status, $password) = $r->get_basic_auth_pw;
return $status unless $status == Apache::OK;
return Apache::OK
if SECRET_LENGTH == length join " ", $r->user, $password;
$r->note_basic_auth_failure;
return Apache::AUTH_REQUIRED;
}
1;
First the handler retrieves the status of the authentication and the
password in plain text. The status will be set to C<Apache::OK> only
when the user has supplied the username and the password
credentials. If the status is different, we just let Apache handle
this situation for us, which will usually challenge the client so
it'll supply the credentials.
Once we know that we have the username and the password supplied by
the client, we can proceed with the authentication. Our authentication
algorithm is unusual. Instead of validating the username/password pair
against a password file, we simply check that the string built from
these two items plus a single space is C<SECRET_LENGTH> long (14 in
our example). So for example the pair I<mod_perl/rules> authenticates
correctly, whereas I<secret/password> does not, because the latter
pair will make a string of 15 characters. Of course this is not a
strong authentication scheme and you shouldn't use it for serious
things, but it's fun to play with. Most authentication handlers
simply verify the username/password pair against a database of valid
pairs. Usually this requires the password to be encrypted first, since
storing passwords in the clear is a bad idea.
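The arithmetic behind the two example pairs can be checked with a few
lines of plain Perl. This sketch pulls the length test out of the
handler into a function named C<credentials_ok()>, a name chosen for
this illustration only; no Apache API is involved.

```perl
use strict;
use warnings;

use constant SECRET_LENGTH => 14;

# The length test from MyApache::SecretLengthAuth, isolated from Apache.
sub credentials_ok {
    my ($user, $password) = @_;
    return SECRET_LENGTH == length join " ", $user, $password;
}

for my $pair (['mod_perl', 'rules'], ['secret', 'password']) {
    my $joined = join " ", @$pair;
    printf "%-18s length %2d: %s\n", "'$joined'", length($joined),
        credentials_ok(@$pair) ? "accepted" : "rejected";
}
```

The string C<mod_perl rules> is 8 + 1 + 5 = 14 characters long and is
accepted, while C<secret password> is 6 + 1 + 8 = 15 characters long
and is rejected.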
Finally if our authentication fails the handler calls
C<note_basic_auth_failure()> and returns C<Apache::AUTH_REQUIRED>,
which sets the proper HTTP response headers that tell the client that
the authentication has failed and that the credentials should be
supplied again.
It's not enough to enable this handler for the authentication to
work. You have to tell Apache what authentication scheme to use
(C<Basic> or C<Digest>), which is specified by the C<AuthType>
directive, and you should also supply the C<AuthName> -- the
authentication realm, which is really just a string that the client
usually uses as a title in the pop-up box, where the username and the
password are inserted. Finally the C<Require> directive is needed to
specify which usernames are allowed to authenticate. If you set it to
C<valid-user> any username will do.
Here is the whole configuration section that requires users to
authenticate before they are allowed to run the registry scripts from
I</perl/>:
<Location /perl/>
SetHandler perl-script
PerlResponseHandler ModPerl::Registry
PerlAuthenHandler MyApache::SecretLengthAuth
Options +ExecCGI
AuthType Basic
AuthName "The Gate"
Require valid-user
</Location>
=head2 PerlAuthzHandler
The I<auth_checker> (I<authz>) phase is used for authorization
control. This phase requires a successful authentication from the
previous phase, because a username is needed in order to decide
whether a user is authorized to access the requested resource.
As this phase is tightly connected to the authentication phase, the
handlers registered for this phase are only called when the requested
resource is password protected, similar to the auth phase. The handler
is expected to return C<Apache::DECLINED> to defer the decision,
C<Apache::OK> to indicate its acceptance of the user's authorization,
or C<Apache::AUTH_REQUIRED> to indicate that the user is not
authorized to access the requested document.
This phase is of type
C<L<RUN_FIRST|docs::2.0::user::handlers::intro/item_RUN_FIRST>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
Here is the C<MyApache::SecretResourceAuthz> handler which allows an
access to certain resources only to certain users who have already
properly authenticated:
file:MyApache/SecretResourceAuthz.pm
------------------------------------
package MyApache::SecretResourceAuthz;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::Access ();
use Apache::Const -compile => qw(OK AUTH_REQUIRED);
my %protected = (
'admin' => ['stas'],
'report' => [qw(stas boss)],
);
sub handler {
my $r = shift;
my $user = $r->user;
if ($user) {
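my($section) = $r->uri =~ m|^/company/(\w+)/|;
# $section is undefined when the URI has no section component
if (defined $section && exists $protected{$section}) {
my $users = $protected{$section};
return Apache::OK if grep { $_ eq $user } @$users;
}
else {
return Apache::OK;
}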
}
$r->note_basic_auth_failure;
return Apache::AUTH_REQUIRED;
}
1;
This authorization handler is very similar to the authentication
handler L<from the previous section|/PerlAuthenHandler>. Here we rely
on the previous phase to get users authenticated, and now that we have
the username we can decide whether to let the user access the
resource it has asked for. In our example we have a simple hash
which maps resources to the users who are allowed to access them. So
for example anything under I</company/admin/> can be accessed only by
the user I<stas>, I</company/report/> can be accessed by users I<stas>
and I<boss>, whereas any other resources under I</company/> can be
accessed by everybody who has got this far. If for some reason we
don't get the username, or the user is not authorized to access the
resource, the handler does the same thing as it does when the
authentication fails, i.e., calls:
$r->note_basic_auth_failure;
return Apache::AUTH_REQUIRED;
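The access rules described above can be exercised in isolation. The
following sketch reproduces the handler's decision as a plain function
returning 1 or 0; C<is_authorized()> is a hypothetical name used only
here, and the C<%protected> hash is copied from the handler.

```perl
use strict;
use warnings;

my %protected = (
    admin  => ['stas'],
    report => [qw(stas boss)],
);

# Hypothetical helper mirroring the decision made by the authz handler.
sub is_authorized {
    my ($user, $uri) = @_;
    return 0 unless $user;                      # no authenticated user
    my ($section) = $uri =~ m|^/company/(\w+)/|;
    # sections without an entry in %protected are open to any valid user
    return 1 unless defined $section && exists $protected{$section};
    return (grep { $_ eq $user } @{ $protected{$section} }) ? 1 : 0;
}

for my $case (['stas', '/company/admin/x.pl'],
              ['boss', '/company/admin/x.pl'],
              ['boss', '/company/report/q3.pl'],
              ['eric', '/company/lunch/menu.pl']) {
    printf "%-5s %-24s %s\n", @$case,
        is_authorized(@$case) ? "allowed" : "denied";
}
```

In the real handler the "denied" outcome corresponds to calling
C<note_basic_auth_failure()> and returning C<Apache::AUTH_REQUIRED>.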
The configuration is similar to the one in L<the previous
section|/PerlAuthenHandler>. This time we just add the
C<PerlAuthzHandler> setting. The rest doesn't change.
Alias /company/ /home/httpd/httpd-2.0/perl/
<Location /company/>
SetHandler perl-script
PerlResponseHandler ModPerl::Registry
PerlAuthenHandler MyApache::SecretLengthAuth
PerlAuthzHandler MyApache::SecretResourceAuthz
Options +ExecCGI
AuthType Basic
AuthName "The Secret Gate"
Require valid-user
</Location>
=head2 PerlTypeHandler
The I<type_checker> phase is used to set the response MIME type
(C<Content-type>) and sometimes other bits of document type
information like the document language.
For example C<mod_autoindex>, which performs automatic directory
indexing, uses this phase to map the filename extensions to the
corresponding icons which will be later used in the listing of files.
Of course later phases may override the mime type set in this phase.
This phase is of type
C<L<RUN_FIRST|docs::2.0::user::handlers::intro/item_RUN_FIRST>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
The most important thing to remember when overriding the default
I<type_checker> handler, which is usually the mod_mime handler, is
that you have to set the handler that will take care of the response
phase and the response callback function or the code won't
work. mod_mime does that based on C<SetHandler> and C<AddHandler>
directives, and file extensions. So if you want the content handler to
be run by mod_perl, set either:
$r->handler('perl-script');
$r->set_handlers(PerlResponseHandler => \&handler);
or:
$r->handler('modperl');
$r->set_handlers(PerlResponseHandler => \&handler);
depending on which type of response handler is wanted.
Writing a C<PerlTypeHandler> handler which sets the content-type value
and returns C<Apache::DECLINED> so that the default handler will do
the rest of the work, is not a good idea, because mod_mime will
probably override this and other settings.
Therefore it's easiest to leave this stage alone and do any
desired settings in the I<fixups> phase.
=head2 PerlFixupHandler
The I<fixups> phase happens just before the content handling
phase. It gives the last chance to do things before the response is
generated. For example in this phase C<mod_env> populates the
environment with variables configured with I<SetEnv> and I<PassEnv>
directives.
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
The following fixup handler example tells Apache at run time which
handler and callback should be used to process the request based on
the file extension of the request's URI.
file:MyApache/FileExtDispatch.pm
--------------------------------
package MyApache::FileExtDispatch;
use strict;
use warnings;
use Apache::Const -compile => 'OK';
use constant HANDLER => 0;
use constant CALLBACK => 1;
my %exts = (
cgi => ['perl-script', \&cgi_handler],
pl => ['modperl', \&pl_handler ],
tt => ['perl-script', \&tt_handler ],
txt => ['default-handler', undef ],
);
sub handler {
my $r = shift;
my ($ext) = $r->uri =~ /\.(\w+)$/;
$ext = 'txt' unless defined $ext and exists $exts{$ext};
$r->handler($exts{$ext}->[HANDLER]);
if (defined $exts{$ext}->[CALLBACK]) {
$r->set_handlers(PerlResponseHandler => $exts{$ext}->[CALLBACK]);
}
return Apache::OK;
}
sub cgi_handler { content_handler($_[0], 'cgi') }
sub pl_handler { content_handler($_[0], 'pl') }
sub tt_handler { content_handler($_[0], 'tt') }
sub content_handler {
my($r, $type) = @_;
$r->content_type('text/plain');
$r->print("A handler of type '$type' was called");
return Apache::OK;
}
1;
In the example we have used the following mapping:
my %exts = (
cgi => ['perl-script', \&cgi_handler],
pl => ['modperl', \&pl_handler ],
tt => ['perl-script', \&tt_handler ],
txt => ['default-handler', undef ],
);
With this mapping I<.cgi> requests are handled by the C<perl-script>
handler and the C<cgi_handler()> callback, I<.pl> requests by
C<modperl> and C<pl_handler()>, I<.tt> (template toolkit) requests by
C<perl-script> and C<tt_handler()>, and finally I<.txt> requests by
the C<default-handler> handler, which requires no callback.
Moreover, if the request's URI has no file extension, or has one that
is not in the mapping, the handler falls back to the
C<default-handler>, as if the I<txt> extension was used.
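The extension lookup and its fallback can be tested separately from
Apache. The sketch below keeps only the handler names from the mapping
and drops the callbacks; C<handler_name_for()> and the sample URIs are
made up for this illustration.

```perl
use strict;
use warnings;

my %handler_for = (
    cgi => 'perl-script',
    pl  => 'modperl',
    tt  => 'perl-script',
    txt => 'default-handler',
);

# Hypothetical helper: pick the handler name for a URI, falling back to
# the 'txt' entry when the extension is missing or unknown.
sub handler_name_for {
    my $uri = shift;
    my ($ext) = $uri =~ /\.(\w+)$/;
    $ext = 'txt' unless defined $ext and exists $handler_for{$ext};
    return $handler_for{$ext};
}

printf "%-20s => %s\n", $_, handler_name_for($_)
    for qw(/dispatch/a.cgi /dispatch/b.pl /dispatch/c.tt
           /dispatch/readme /dispatch/d.jpg);
```

The last two URIs have no extension and an unmapped extension
respectively, so both end up with C<default-handler>.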
After doing the mapping, the handler assigns the handler:
$r->handler($exts{$ext}->[HANDLER]);
and the callback if needed:
if (defined $exts{$ext}->[CALLBACK]) {
$r->set_handlers(PerlResponseHandler => $exts{$ext}->[CALLBACK]);
}
In this simple example the callback functions don't do much: they all
call the same content handler, which simply prints the name of the
extension if the request is handled by mod_perl; otherwise Apache
serves the file using the default handler. In the real world you will
use callbacks to real content handlers that do real things.
Here is how this handler is configured:
Alias /dispatch/ /home/httpd/dispatch/
<Location /dispatch/>
PerlFixupHandler MyApache::FileExtDispatch
</Location>
Notice that there is no need to specify anything but the fixup
handler. It applies the rest of the settings dynamically at run time.
=head2 PerlResponseHandler
The I<handler> (I<response>) phase is used for generating the
response. This is probably the most important phase and most of the
existing Apache modules do most of their work at this phase.
This is the only phase that requires two directives under
mod_perl. For example:
<Location /perl>
SetHandler perl-script
PerlResponseHandler ModPerl::Registry
</Location>
C<SetHandler> set to
L<C<perl-script>|docs::2.0::user::config::config/perl_script> or
L<C<modperl>|docs::2.0::user::config::config/modperl> tells Apache
that mod_perl is going to handle the response
generation. C<PerlResponseHandler> tells mod_perl which callback is
going to do the job.
This phase is of type
C<L<RUN_FIRST|docs::2.0::user::handlers::intro/item_RUN_FIRST>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
Most of the C<Apache::> modules on CPAN are dealing with this
phase. In fact most of the developers spend the majority of their time
working on handlers that generate response content.
Let's write a simple response handler, that just generates some
content. This time let's do something more interesting than printing
I<"Hello world">. Let's write a handler that prints itself:
file:MyApache/Deparse.pm
------------------------
package MyApache::Deparse;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::RequestIO ();
use B::Deparse ();
use Apache::Const -compile => 'OK';
sub handler {
my $r = shift;
$r->content_type('text/plain');
$r->print('sub handler ', B::Deparse->new->coderef2text(\&handler));
return Apache::OK;
}
1;
To enable this handler add to I<httpd.conf>:
<Location /deparse>
SetHandler modperl
PerlResponseHandler MyApache::Deparse
</Location>
Now when the server is restarted and we issue a request to
I<http://localhost/deparse> we get the following response:
sub handler {
package MyApache::Deparse;
my $r = shift @_;
$r->content_type('text/plain');
$r->print('sub handler ', 'B::Deparse'->new->coderef2text(\&handler));
return 0;
}
If you compare it to the source code, it's pretty much the
same. C<B::Deparse> is fun to play with!
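One visible difference is that C<return Apache::OK> comes back as
C<return 0>, because Perl inlines constants at compile time and
C<B::Deparse> sees only the resulting value. The following standalone
sketch shows the same effect without mod_perl. The local constant
C<OK> is a stand-in for C<Apache::OK> and is assumed to be 0, as in
the output above.

```perl
use strict;
use warnings;
use B::Deparse ();

use constant OK => 0;    # stand-in for Apache::OK, assumed to be 0

sub handler { return OK }

# Deparse the compiled subroutine back into Perl source text.
my $text = B::Deparse->new->coderef2text(\&handler);
print "sub handler $text\n";
```

The exact layout of the deparsed text, including any pragma lines it
shows, depends on the Perl version in use.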
=head2 PerlLogHandler
The I<log_transaction> phase happens no matter how the previous phases
have ended up. If one of the earlier phases has aborted a request,
e.g., failed authentication or 404 (file not found) errors, the rest of
the phases up to and including the response phase are skipped. But
this phase is always executed.
By this phase all the information about the request and the response
is known, therefore the logging handlers usually record this
information in various ways (e.g., logging to a flat file or a
database).
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The handler's configuration scope is
C<L<DIR|docs::2.0::user::config::config/item_DIR>>.
Imagine a situation where you have to log requests into individual
files, one per user. Assume that all requests start with
I</users/username/>, so it's easy to categorize requests by the second
URI path component. Here is the log handler that does that:
file:MyApache/LogPerUser.pm
---------------------------
package MyApache::LogPerUser;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::Connection ();
use Apache::ServerUtil ();
use Fcntl qw(:flock);
use Apache::Const -compile => qw(OK DECLINED);
sub handler {
my $r = shift;
my($username) = $r->uri =~ m|^/users/([^/]+)|;
return Apache::DECLINED unless defined $username;
my $entry = sprintf qq(%s [%s] "%s" %d %d\n),
$r->connection->remote_ip, scalar(localtime),
$r->uri, $r->status, $r->bytes_sent;
my $log_path = Apache::server_root_relative($r->pool,
"logs/$username.log");
open my $fh, ">>", $log_path or die "can't open $log_path: $!";
flock $fh, LOCK_EX;
print $fh $entry;
close $fh;
return Apache::OK;
}
1;
First the handler tries to figure out which username the request is
issued for. If it fails to match the URI, it simply returns
C<Apache::DECLINED>, letting other log handlers do the
logging. It could just as well return C<Apache::OK>, since all other
log handlers will be run anyway.
Next it builds the log entry, similar to the default I<access_log>
entry. It consists of the remote IP, the current time, the URI, the
return status and the number of bytes sent to the client as the
response body.
Finally the handler appends this entry to the log file for the user
the request was issued for. Usually it's safe to append short strings
to a file without being afraid of messing it up when two
processes attempt to write at the same time, but just to be on the
safe side the handler exclusively locks the file before performing the
writing.
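The append-under-lock step can be tried on its own. The sketch below
writes two entries to a per-user log in a temporary directory;
C<append_entry()> and the shortened entry format are assumptions made
for this illustration and are not part of the handler.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical helper: append one entry to "$dir/$username.log" while
# holding an exclusive lock, and return the path of the log file.
sub append_entry {
    my ($dir, $username, $entry) = @_;
    my $log_path = "$dir/$username.log";
    open my $fh, '>>', $log_path or die "can't open $log_path: $!";
    flock($fh, LOCK_EX)          or die "can't lock $log_path: $!";
    print {$fh} $entry;
    close $fh                    or die "can't close $log_path: $!";
    return $log_path;
}

my $dir = tempdir(CLEANUP => 1);
append_entry($dir, 'stas', qq(127.0.0.1 "/users/stas/test.pl" 200 8\n));
my $log = append_entry($dir, 'stas', qq(127.0.0.1 "/users/stas/date.pl" 200 44\n));
print "wrote two entries to $log\n";
```

Closing the filehandle releases the lock, so another process waiting
in C<flock()> can proceed as soon as the entry has been written.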
To configure the handler simply enable the module with the
C<PerlLogHandler> directive, inside the wanted section, which was
I</users/> in our example:
<Location /users/>
SetHandler perl-script
PerlResponseHandler ModPerl::Registry
PerlLogHandler MyApache::LogPerUser
Options +ExecCGI
</Location>
After restarting the server and issuing requests to the following
URIs:
http://localhost/users/stas/test.pl
http://localhost/users/eric/test.pl
http://localhost/users/stas/date.pl
The C<MyApache::LogPerUser> handler will append to I<logs/stas.log>:
127.0.0.1 [Sat Aug 31 01:50:38 2002] "/users/stas/test.pl" 200 8
127.0.0.1 [Sat Aug 31 01:50:40 2002] "/users/stas/date.pl" 200 44
and to I<logs/eric.log>:
127.0.0.1 [Sat Aug 31 01:50:39 2002] "/users/eric/test.pl" 200 8
=head2 PerlCleanupHandler
META: not implemented yet
This phase is of type C<XXX>.
The handler's configuration scope is C<XXX>.
=head1 Maintainers
Maintainer is the person(s) you should contact with updates,
corrections and patches.
=over
=item *
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=back
=head1 Authors
=over
=item *
=back
Only the major authors are listed above. For contributors see the
Changes file.
=cut
1.1 modperl-docs/src/docs/2.0/user/handlers/intro.pod
Index: intro.pod
===================================================================
=head1 NAME
Introducing mod_perl Handlers
=head1 Description
This chapter provides an introduction to mod_perl handlers.
=head1 What are Handlers?
Apache distinguishes between numerous phases for which it provides
hooks (because the C functions are called
I<ap_hook_E<lt>phase_nameE<gt>>) where modules can plug various
callbacks to extend and alter the default behavior of the webserver.
mod_perl provides a Perl interface for most of the available hooks, so
mod_perl module writers can change the Apache behavior in Perl. These
callbacks are usually referred to as I<handlers> and therefore the
configuration directives for the mod_perl handlers look like:
C<PerlFooHandler>, where C<Foo> is one of the handler names. For
example C<PerlResponseHandler> configures the response callback.
A typical handler is simply a perl package with a I<handler>
subroutine. For example:
file:MyApache/CurrentTime.pm
----------------------------
package MyApache::CurrentTime;
use strict;
use warnings;
use Apache::RequestRec ();
use Apache::RequestIO ();
use Apache::Const -compile => qw(OK);
sub handler {
my $r = shift;
$r->content_type('text/plain');
$r->print("Now is: " . scalar(localtime) . "\n");
return Apache::OK;
}
1;
This handler simply returns the current date and time as a
response.
Since this is a response handler, we configure it as such in
I<httpd.conf>:
PerlResponseHandler MyApache::CurrentTime
Since the response handler should be configured for a specific
location, let's write a complete configuration section:
PerlModule MyApache::CurrentTime
<Location /time>
SetHandler modperl
PerlResponseHandler MyApache::CurrentTime
</Location>
Now when a request is issued to I<http://localhost/time> this response
handler is executed and a response that includes the current time is
returned to the client.
=head1 mod_perl Handlers Categories
The mod_perl handlers can be divided by their application scope into
several categories:
=over
=item * L<Server life cycle|docs::2.0::user::handlers::server/>
=over
=item *
C<L<PerlOpenLogsHandler|docs::2.0::user::handlers::server/PerlOpenLogsHandler>>
=item *
C<L<PerlPostConfigHandler|docs::2.0::user::handlers::server/PerlPostConfigHandler>>
=item *
C<L<PerlChildInitHandler|docs::2.0::user::handlers::server/PerlChildInitHandler>>
=item *
C<L<PerlChildExitHandler|docs::2.0::user::handlers::server/PerlChildExitHandler>>
=back
=item * L<Protocols|docs::2.0::user::handlers::protocols/>
=over
=item *
C<L<PerlPreConnectionHandler|docs::2.0::user::handlers::protocols/PerlPreConnectionHandler>>
=item *
C<L<PerlProcessConnectionHandler|docs::2.0::user::handlers::protocols/PerlProcessConnectionHandler>>
=back
=item * L<Filters|docs::2.0::user::handlers::filters/>
=over
=item *
C<L<PerlInputFilterHandler|docs::2.0::user::handlers::filters/PerlInputFilterHandler>>
=item *
C<L<PerlOutputFilterHandler|docs::2.0::user::handlers::filters/PerlOutputFilterHandler>>
=back
=item * L<HTTP Protocol|docs::2.0::user::handlers::http/>
=over
=item *
C<L<PerlPostReadRequestHandler|docs::2.0::user::handlers::http/PerlPostReadRequestHandler>>
=item *
C<L<PerlTransHandler|docs::2.0::user::handlers::http/PerlTransHandler>>
=item * C<L<PerlInitHandler|docs::2.0::user::handlers::http/PerlInitHandler>>
=item *
C<L<PerlHeaderParserHandler|docs::2.0::user::handlers::http/PerlHeaderParserHandler>>
=item *
C<L<PerlAccessHandler|docs::2.0::user::handlers::http/PerlAccessHandler>>
=item *
C<L<PerlAuthenHandler|docs::2.0::user::handlers::http/PerlAuthenHandler>>
=item *
C<L<PerlAuthzHandler|docs::2.0::user::handlers::http/PerlAuthzHandler>>
=item * C<L<PerlTypeHandler|docs::2.0::user::handlers::http/PerlTypeHandler>>
=item *
C<L<PerlFixupHandler|docs::2.0::user::handlers::http/PerlFixupHandler>>
=item *
C<L<PerlResponseHandler|docs::2.0::user::handlers::http/PerlResponseHandler>>
=item * C<L<PerlLogHandler|docs::2.0::user::handlers::http/PerlLogHandler>>
=item *
C<L<PerlCleanupHandler|docs::2.0::user::handlers::http/PerlCleanupHandler>>
=back
=back
=head1 Bucket Brigades
Apache 2.0 allows multiple modules to filter both the request and the
response. Now one module can pipe its output as an input to another
module, as if that other module were receiving the data directly from the
TCP stream. The same mechanism works with the generated response.
With I/O filtering in place, simple filters, like data compression and
decompression, can be easily implemented and complex filters, like
SSL, are now possible without needing to modify the server code,
which was the case with Apache 1.3.
In order to make the filtering mechanism efficient and avoid
unnecessary copying, the I<Bucket Brigades> technology was introduced.
A bucket represents a chunk of data. Buckets linked together comprise
a brigade. Each bucket in a brigade can be modified, removed and
replaced with another bucket. The goal is to minimize the data copying
where possible. Buckets come in different types, such as files, data
blocks, end of stream indicators, pools, etc. To manipulate a bucket
one doesn't need to know its internal representation.
The stream of data is represented by bucket brigades. When a filter
is called it gets passed the brigade that was the output of the
previous filter. This brigade is then manipulated by the filter (e.g.,
by modifying some buckets) and passed to the next filter in the stack.
The following figure depicts an imaginary bucket brigade:
=for html
<img src="bucket_brigades.gif" width="590" height="400"
align="center" valign="middle" alt="bucket brigades"><br><br>
The figure tries to show that after the presented bucket brigade has
passed through several filters some buckets were removed, some
modified and some added. Of course the handler that gets the brigade
cannot tell the history of the brigade, it can only see the existing
buckets in the brigade.
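To make the idea concrete, here is a toy model in plain Perl. It is only a sketch and does not use the real APR API: a brigade is modelled as an array reference, a bucket as a small hash, and all of the helper and filter names are invented for the illustration.

```perl
use strict;
use warnings;

# Toy model: a brigade is an array ref of buckets, and a bucket is a
# hash ref with a type and, for data buckets, a payload.
sub data_bucket { return { type => 'data', data => $_[0] } }
sub eos_bucket  { return { type => 'eos' } }

# Filter 1: replace each data bucket with an uppercased copy.
sub upcase_filter {
    my ($bb) = @_;
    return [ map { $_->{type} eq 'data' ? data_bucket(uc $_->{data}) : $_ } @$bb ];
}

# Filter 2: remove data buckets that carry no data.
sub drop_empty_filter {
    my ($bb) = @_;
    return [ grep { $_->{type} ne 'data' || length($_->{data}) } @$bb ];
}

# Filter 3: add a new bucket just before the end-of-stream marker.
sub add_footer_filter {
    my ($bb) = @_;
    return [ map { $_->{type} eq 'eos' ? (data_bucket('-- end --'), $_) : $_ } @$bb ];
}

# Pass a brigade through a stack of filters; each filter sees only the
# brigade produced by the previous one.
sub run_filters {
    my ($bb, @filters) = @_;
    $bb = $_->($bb) for @filters;
    return $bb;
}

my $out = run_filters(
    [ data_bucket('hello'), data_bucket(''), data_bucket('world'), eos_bucket() ],
    \&upcase_filter, \&drop_empty_filter, \&add_footer_filter,
);
print join(' | ', map { $_->{type} eq 'data' ? $_->{data} : '<EOS>' } @$out), "\n";
```

The last filter receives only the final list of buckets; nothing in it records that one bucket was dropped and two were rewritten along the way.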
Bucket brigades are discussed in detail in the L<connection
protocols|docs::2.0::user::handlers::protocols> and L<I/O
filtering|docs::2.0::user::handlers::filters> chapters.
=head1 Single Phase's Multiple Handlers Behavior
For each phase there can be more than one handler assigned (also known
as I<hooks>, because the C functions are called
I<ap_hook_E<lt>phase_nameE<gt>>). Phases' behavior varies when there
is more than one handler registered to run for the same phase. The
following table specifies each handler's behavior in this situation:
Directive Type
--------------------------------------
PerlOpenLogsHandler RUN_ALL
PerlPostConfigHandler RUN_ALL
PerlChildInitHandler VOID
PerlChildExitHandler XXX
PerlPreConnectionHandler RUN_ALL
PerlProcessConnectionHandler RUN_FIRST
PerlPostReadRequestHandler RUN_ALL
PerlTransHandler RUN_FIRST
PerlInitHandler RUN_ALL
PerlHeaderParserHandler RUN_ALL
PerlAccessHandler RUN_ALL
PerlAuthenHandler RUN_FIRST
PerlAuthzHandler RUN_FIRST
PerlTypeHandler RUN_FIRST
PerlFixupHandler RUN_ALL
PerlResponseHandler RUN_FIRST
PerlLogHandler RUN_ALL
PerlCleanupHandler XXX
PerlInputFilterHandler VOID
PerlOutputFilterHandler VOID
And here is the description of the possible types:
=over
=item * VOID
Handlers of the type C<VOID> will I<all> be executed in the order they
have been registered, regardless of their return values. In
mod_perl they are nevertheless expected to return C<Apache::OK>.
=item * RUN_FIRST
Handlers of the type C<RUN_FIRST> will be executed in the order they
have been registered until the first handler that returns something
other than C<Apache::DECLINED>. If the return value is
C<Apache::DECLINED>, the next handler in the chain will be run. If the
return value is C<Apache::OK> the next phase will start. In all other
cases the execution will be aborted.
=item * RUN_ALL
Handlers of the type C<RUN_ALL> will be executed in the order they
have been registered until the first handler that returns something
other than C<Apache::OK> or C<Apache::DECLINED>.
=back
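The three types can be modelled with a few lines of plain Perl. The following is a sketch of the semantics described above, not of Apache's implementation: the status codes are local stand-ins defined only for this example, and C<run_void()>, C<run_first()> and C<run_all()> are hypothetical names.

```perl
use strict;
use warnings;

# Stand-in status codes for this sketch only; real handlers use the
# Apache::OK and Apache::DECLINED constants.
use constant { OK => 0, DECLINED => -1, SERVER_ERROR => 500 };

# VOID: run every handler and ignore the return values.
sub run_void {
    my (@handlers) = @_;
    $_->() for @handlers;
    return OK;
}

# RUN_FIRST: stop at the first handler that returns anything other
# than DECLINED, and report that value.
sub run_first {
    my (@handlers) = @_;
    for my $h (@handlers) {
        my $rc = $h->();
        return $rc if $rc != DECLINED;
    }
    return DECLINED;
}

# RUN_ALL: keep going while handlers return OK or DECLINED; stop at
# the first other value and report it.
sub run_all {
    my (@handlers) = @_;
    for my $h (@handlers) {
        my $rc = $h->();
        return $rc if $rc != OK and $rc != DECLINED;
    }
    return OK;
}

my @trace;
my @chain = (
    sub { push @trace, 'a'; return DECLINED },
    sub { push @trace, 'b'; return OK },
    sub { push @trace, 'c'; return OK },
);

@trace = (); run_first(@chain); print "RUN_FIRST ran: @trace\n";
@trace = (); run_all(@chain);   print "RUN_ALL ran:   @trace\n";
```

With this chain, the C<RUN_FIRST> dispatcher stops after the second handler, while the C<RUN_ALL> dispatcher runs all three.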
For C API declarations see I<include/ap_config.h>, which includes
other types which aren't exposed by mod_perl handlers.
Also see L<mod_perl Directives Argument Types and Allowed
Location|docs::2.0::user::config::config/mod_perl_Directives_Argument_Types_and_Allowed_Location>
=head1 Hook Ordering (Position)
The following constants specify how the new hooks (handlers) are
inserted into the list of hooks when there is at least one hook
already registered for the same phase.
META: need to verify the following:
=over
=item * C<APR::HOOK_REALLY_FIRST>
run this hook first, before ANYTHING.
=item * C<APR::HOOK_FIRST>
run this hook first.
=item * C<APR::HOOK_MIDDLE>
run this hook somewhere.
=item * C<APR::HOOK_LAST>
run this hook after every other hook which is defined.
=item * C<APR::HOOK_REALLY_LAST>
run this hook last, after EVERYTHING.
=back
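One way to picture these positions is as numeric sort keys. The sketch below is an assumption-laden illustration, not the real registration API: the constants are defined locally, their values are believed to mirror the C<APR_HOOK_*> macros in the APR headers but only their relative order matters here, and C<order_hooks()> and the hook names are made up.

```perl
use strict;
use warnings;

# Stand-in values for this sketch, defined locally.
use constant {
    HOOK_REALLY_FIRST => -10,
    HOOK_FIRST        => 0,
    HOOK_MIDDLE       => 10,
    HOOK_LAST         => 20,
    HOOK_REALLY_LAST  => 30,
};

# Order hooks by position; hooks that share a position keep their
# registration order.
sub order_hooks {
    my (@hooks) = @_;                 # each hook is [ name, position ]
    my @indexed = map { [ $_, $hooks[$_] ] } 0 .. $#hooks;
    return map  { $_->[1][0] }
           sort { $a->[1][1] <=> $b->[1][1] or $a->[0] <=> $b->[0] }
           @indexed;
}

my @order = order_hooks(
    [ logger  => HOOK_LAST ],
    [ auth    => HOOK_FIRST ],
    [ rewrite => HOOK_MIDDLE ],
    [ setup   => HOOK_REALLY_FIRST ],
    [ audit   => HOOK_LAST ],
);
print "@order\n";
```

Sorting on the pair of position and registration index keeps the result deterministic when two hooks ask for the same position.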
META: more information in mod_example.c talking about
position/predecessors, etc.
=head1 Maintainers
Maintainer is the person(s) you should contact with updates,
corrections and patches.
=over
=item *
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=back
=head1 Authors
=over
=item *
=back
Only the major authors are listed above. For contributors see the
Changes file.
=cut
1.1 modperl-docs/src/docs/2.0/user/handlers/protocols.pod
Index: protocols.pod
===================================================================
=head1 NAME
Protocol Handlers
=head1 Description
This chapter explains how to implement Protocol (Connection) Handlers
in mod_perl.
=head1 Connection Cycle Phases
As we saw earlier, each child server (be it a thread or a process) is
engaged in processing connections. Each connection may be served by
different connection protocols, e.g., HTTP, POP3, SMTP, etc. Each
connection may include more than one request, e.g., several HTTP
requests can be served over a single connection, when a response
includes several images.
The following diagram depicts the connection life cycle and highlights
which handlers are available to mod_perl 2.0:
=for html
<img src="connection_cycle.gif" width="598" height="498"
align="center" valign="middle" alt="connection cycle"><br><br>
When a connection is issued by a client, it's first run through
C<PerlPreConnectionHandler> and then passed to the
C<PerlProcessConnectionHandler>, which generates the response. When
C<PerlProcessConnectionHandler> is reading data from the client, it
can be filtered by connection input filters. The generated response
can also be filtered through connection output filters. Filters are
usually used for modifying the data flowing through them, but can be
used for other purposes as well (e.g., logging interesting
information).
Now let's discuss each of the C<PerlPreConnectionHandler> and
C<PerlProcessConnectionHandler> handlers in detail.
=head2 PerlPreConnectionHandler
The I<pre_connection> phase happens just after the server accepts the
connection, but before it is handed off to a protocol module to be
served. It gives modules an opportunity to modify the connection as
soon as possible and insert filters if needed. The core server uses
this phase to setup the connection record based on the type of
connection that is being used. mod_perl itself uses this phase to
register the connection input and output filters.
In mod_perl 1.0 during code development C<Apache::Reload> was used to
automatically reload Perl modules modified since the last request. It
was invoked during C<post_read_request>, the first HTTP request's
phase. In mod_perl 2.0 I<pre_connection> is the earliest phase, so if
we want to make sure that all modified Perl modules are reloaded for
any protocol and its phases, it's best to set the scope of the
Perl interpreter to the lifetime of the connection via:
PerlInterpScope connection
and invoke the C<Apache::Reload> handler during the I<pre_connection>
phase. However this development-time advantage can become a
disadvantage in production--for example if a connection, handled by
HTTP protocol, is configured as C<KeepAlive> and there are several
requests coming on the same connection and only one handled by
mod_perl and the others by the default images handler, the Perl
interpreter won't be available to other threads while the images are
being served.
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The handler's configuration scope is
C<L<SRV|docs::2.0::user::config::config/item_SRV>>, because it's not
known yet which resource the request will be mapped to.
XXX: As of this moment C<PerlPreConnectionHandler> is not being
executed by mod_perl. Stay tuned.
Example:
A I<pre_connection> handler accepts a connection record and a socket
object as its arguments:
sub handler {
my ($c, $socket) = @_;
# ...
return Apache::OK;
}
=head2 PerlProcessConnectionHandler
The I<process_connection> phase is used to process incoming
connections. Only protocol modules should assign handlers for this
phase, as it gives them an opportunity to replace the standard HTTP
processing with processing for some other protocols (e.g., POP3, FTP,
etc.).
This phase is of type
C<L<RUN_FIRST|docs::2.0::user::handlers::intro/item_RUN_FIRST>>.
The handler's configuration scope is
C<L<SRV|docs::2.0::user::config::config/item_SRV>>. Therefore the only
way to run protocol servers other than the core HTTP server is inside
dedicated virtual hosts.
A I<process_connection> handler accepts a connection record object as
its only argument. A socket object can be retrieved from the
connection record object:
sub handler {
my ($c) = @_;
my $socket = $c->client_socket;
# ...
return Apache::OK;
}
Now let's look at the following two examples of connection
handlers. The first uses the connection socket to read and write the
data, and the second uses bucket brigades to accomplish the same and
allow connection filters to do their work.
=head3 Socket-based Protocol Module
To demonstrate the workings of a protocol module, we'll take a look at
the C<MyApache::EchoSocket> module, which simply echoes the data read
back to the client. In this module we will use the implementation that
works directly with the connection socket and therefore bypasses
connection filters if any.
A protocol handler is configured using the
C<PerlProcessConnectionHandler> directive and we will use the
C<Listen> and C<E<lt>VirtualHostE<gt>> directives to bind to the
non-standard port B<8010>:
Listen 8010
<VirtualHost _default_:8010>
PerlModule MyApache::EchoSocket
PerlProcessConnectionHandler MyApache::EchoSocket
</VirtualHost>
C<MyApache::EchoSocket> is then enabled when starting Apache:
panic% httpd
And we give it a whirl:
panic% telnet localhost 8010
Trying 127.0.0.1...
Connected to localhost (127.0.0.1).
Escape character is '^]'.
Hello
Hello
fOo BaR
fOo BaR
Connection closed by foreign host.
Here is the code:
file:MyApache/EchoSocket.pm
---------------------------
package MyApache::EchoSocket;
use strict;
use warnings FATAL => 'all';
use Apache::Connection ();
use APR::Socket ();
use Apache::Const -compile => 'OK';
use constant BUFF_LEN => 1024;
sub handler {
my $c = shift;
my $socket = $c->client_socket;
my $buff;
while (1) {
my($rlen, $wlen);
$rlen = BUFF_LEN;
$socket->recv($buff, $rlen);
last if $rlen <= 0 or $buff =~ /^[\r\n]+$/;
$wlen = $rlen;
$socket->send($buff, $wlen);
last if $wlen != $rlen;
}
Apache::OK;
}
1;
The example handler starts with the standard I<package> declaration
and of course, C<use strict;>. As with all C<Perl*Handler>s, the
subroutine name defaults to I<handler>. However, in the case of a
protocol handler, the first argument is not a C<request_rec>, but a
C<conn_rec> blessed into the C<Apache::Connection> class. We have
direct access to the client socket via C<Apache::Connection>'s
I<client_socket> method. This returns an object blessed into the
C<APR::Socket> class.
Inside the read/send loop, the handler attempts to read C<BUFF_LEN>
bytes from the client socket into the C<$buff> buffer. The C<$rlen>
parameter will be set to the number of bytes actually read. The
C<APR::Socket::recv()> method returns an APR status value, but we need
only check the read length to break out of the loop if it is less than
or equal to C<0> bytes. The handler also breaks the loop after
processing an input consisting of nothing but newline characters, which
is how we abort the connection in the interactive mode.
If the handler receives some data, it sends it unmodified back to the
client with the C<APR::Socket::send()> method. When the loop is
finished the handler returns C<Apache::OK>, telling Apache to
terminate the connection. As mentioned earlier since this handler is
working directly with the connection socket, no filters can be
applied.
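The loop's termination rules can be exercised without a server. In this sketch, which is plain Perl and not the C<APR::Socket> API, a list of strings plays the role of successive reads from the client and the returned list is what would have been sent back; C<echo_until_blank()> is a hypothetical name.

```perl
use strict;
use warnings;

# Stand-in for the socket loop: @chunks plays the role of successive
# reads from the client, and the returned list is what gets echoed.
sub echo_until_blank {
    my (@chunks) = @_;
    my @sent;
    for my $buff (@chunks) {
        # stop on an empty read, or on input made of nothing but CR/LF
        last if !length($buff) or $buff =~ /^[\r\n]+$/;
        push @sent, $buff;      # echo the data back unmodified
    }
    return @sent;
}

my @sent = echo_until_blank("Hello\r\n", "fOo BaR\r\n", "\r\n", "never seen\r\n");
print @sent;
```

The third chunk contains only a line ending, so the loop ends there and the fourth chunk is never echoed, which corresponds to the telnet session shown earlier.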
=head3 Bucket Brigades-based Protocol Module
Now let's look at the same module, but this time implemented by
manipulating bucket brigades, and which runs its output through a
connection output filter that turns all uppercase characters into
their lowercase equivalents.
The following configuration defines a virtual host listening on port
8011, which enables the C<MyApache::EchoBB> connection handler and
runs its output through the C<MyApache::EchoBB::lowercase_filter> filter:
Listen 8011
<VirtualHost _default_:8011>
PerlModule MyApache::EchoBB
PerlProcessConnectionHandler MyApache::EchoBB
PerlOutputFilterHandler MyApache::EchoBB::lowercase_filter
</VirtualHost>
As before we start the httpd server:
panic% httpd
And try the new connection handler in action:
panic% telnet localhost 8011
Trying 127.0.0.1...
Connected to localhost (127.0.0.1).
Escape character is '^]'.
Hello
hello
fOo BaR
foo bar
Connection closed by foreign host.
As you can see, the response is now all in lower case,
because of the output filter. And here is the implementation of the
connection and the filter handlers:
file:MyApache/EchoBB.pm
-----------------------
package MyApache::EchoBB;
use strict;
use warnings FATAL => 'all';
use Apache::Connection ();
use APR::Bucket ();
use APR::Brigade ();
use APR::Util ();
use APR::Const -compile => qw(SUCCESS EOF);
use Apache::Const -compile => qw(OK MODE_GETLINE);
sub handler {
my $c = shift;
my $bb_in = APR::Brigade->new($c->pool, $c->bucket_alloc);
my $bb_out = APR::Brigade->new($c->pool, $c->bucket_alloc);
my $last = 0;
while (1) {
my $rv = $c->input_filters->get_brigade($bb_in,
Apache::MODE_GETLINE);
if ($rv != APR::SUCCESS or $bb_in->empty) {
my $error = APR::strerror($rv);
unless ($rv == APR::EOF) {
warn "get_brigade: $error\n";
}
$bb_in->destroy;
last;
}
while (!$bb_in->empty) {
my $bucket = $bb_in->first;
$bucket->remove;
if ($bucket->is_eos) {
$bb_out->insert_tail($bucket);
last;
}
my $data;
my $status = $bucket->read($data);
return $status unless $status == APR::SUCCESS;
if ($data) {
$last++ if $data =~ /^[\r\n]+$/;
# could do something with the data here
$bucket = APR::Bucket->new($data);
}
$bb_out->insert_tail($bucket);
}
my $b = APR::Bucket::flush_create($c->bucket_alloc);
$bb_out->insert_tail($b);
$c->output_filters->pass_brigade($bb_out);
last if $last;
}
Apache::OK;
}
use base qw(Apache::Filter);
use constant BUFF_LEN => 1024;
sub lowercase_filter : FilterConnectionHandler {
my $filter = shift;
while ($filter->read(my $buffer, BUFF_LEN)) {
$filter->print(lc $buffer);
}
return Apache::OK;
}
1;
For the purpose of explaining how this connection handler works, we
are going to simplify the handler. The whole handler can be
represented by the following pseudo-code:
while ($bb_in = get_brigade()) {
while ($bucket_in = $bb_in->get_bucket()) {
my $data = $bucket_in->read();
# do something with data
$bucket_out = new_bucket($data);
$bb_out->insert_tail($bucket_out);
}
$bb_out->insert_tail($flush_bucket);
pass_brigade($bb_out);
}
The handler receives the incoming data via bucket brigades, one at a
time in a loop. It then processes each brigade, by retrieving the
buckets contained in it, reading the data in, then creating new
buckets using the received data, and attaching them to the outgoing
brigade. When all the buckets from the incoming bucket brigade have been
transformed and attached to the outgoing bucket brigade, a flush
bucket is created and added as the last bucket, so when the outgoing
bucket brigade is passed out to the outgoing connection filters, it
won't be buffered but sent to the client right away.
If you look at the complete handler, the loop is terminated when one
of the following conditions occurs: an error happens, the end of
stream bucket has been seen (no more input at the connection) or when
the received data contains nothing but newline characters, which we
used to tell the server to terminate the connection.
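The pseudo-code can be turned into a runnable mock. The sketch below is plain Perl and deliberately avoids the real C<APR::Brigade> and C<APR::Bucket> API: a brigade is an array reference, a bucket is a C<[type, data]> pair, and C<echo_brigades()> with its callback argument is an invented interface.

```perl
use strict;
use warnings;

# Mock of the handler loop. A brigade is an array ref of buckets and a
# bucket is [ type, data ]; $pass plays the role of pass_brigade().
sub echo_brigades {
    my ($brigades_in, $pass) = @_;
    for my $bb_in (@$brigades_in) {
        my @bb_out;
        my $done = 0;
        for my $bucket (@$bb_in) {
            my ($type, $data) = @$bucket;
            if ($type eq 'eos') {            # no more input on the connection
                push @bb_out, $bucket;
                $done = 1;
                last;
            }
            $done = 1 if $data =~ /^[\r\n]+$/;   # a blank line ends the session
            # a transformation of $data could go here
            push @bb_out, [ data => $data ];
        }
        push @bb_out, ['flush'];             # ask for immediate delivery
        $pass->(\@bb_out);
        last if $done;
    }
    return;
}

my @sent;
echo_brigades(
    [ [ [ data => "Hello\n" ] ], [ [ data => "\n" ] ], [ [ data => "never read\n" ] ] ],
    sub { push @sent, join '+', map { $_->[0] } @{ $_[0] } },
);
print "$_\n" for @sent;
```

Each outgoing brigade ends with a flush marker, and the third incoming brigade is never processed because the second one carried only a newline.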
Notice that this handler could be much simpler, since we don't modify
the data. We could simply pass the whole brigade unmodified without
even looking at the buckets. But from this example you can see how to
write a connection handler where you actually want to read and/or
modify the data. To accomplish that modification simply add code
that transforms the data which has been read from the bucket before
it's inserted into the outgoing brigade.
We will skip the filter discussion here, since we are going to talk in
depth about filters in the sections dedicated to filters. All you
need to know at this stage is that the data sent from the connection
handler is filtered by the outgoing filter, which transforms it to
be all lowercase.
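The read-and-print pattern used by the filter can be imitated with an in-memory filehandle. This is a sketch in plain Perl and not the C<Apache::Filter> API: C<lowercase_stream()> is a hypothetical name, and the buffer size is made deliberately tiny so that several reads are needed.

```perl
use strict;
use warnings;
use constant BUFF_LEN => 4;     # deliberately tiny, to force several reads

# Read the input in BUFF_LEN-sized chunks and lowercase each chunk,
# mimicking the read/print loop of a stream-style filter.
sub lowercase_stream {
    my ($input) = @_;
    open(my $in, '<', \$input) or die "can't open in-memory handle: $!";
    my $output = '';
    while (read($in, my $buffer, BUFF_LEN)) {
        $output .= lc $buffer;
    }
    close $in;
    return $output;
}

print lowercase_stream("fOo BaR\n");
```

Because lowercasing works character by character, it does not matter where the chunk boundaries fall; a transformation that spans several characters would need to carry state between reads.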
=head1 Maintainers
Maintainer is the person(s) you should contact with updates,
corrections and patches.
=over
=item *
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=back
=head1 Authors
=over
=item *
=back
Only the major authors are listed above. For contributors see the
Changes file.
=cut
1.1 modperl-docs/src/docs/2.0/user/handlers/server.pod
Index: server.pod
===================================================================
=head1 NAME
Server Life Cycle Handlers
=head1 Description
This chapter discusses the server life cycle and the mod_perl handlers
participating in it.
=head1 Server Life Cycle
The following diagram depicts the Apache 2.0 server life cycle and
highlights which handlers are available to mod_perl 2.0:
=for html
<img src="server_life_cycle.gif" width="561" height="537"
align="center" valign="middle" alt="server life cycle"><br><br>
Apache 2.0 starts by parsing the configuration file. After the
configuration file is parsed, the C<PerlOpenLogsHandler> handlers are
executed if any. After that it is the turn of the C<PerlPostConfigHandler>
handlers to be run. When the I<post_config> phase is finished the
server immediately restarts, to make sure that it can survive graceful
restarts after starting to serve the clients.
When the restart is completed, Apache 2.0 spawns the workers that will
do the actual work. Depending on the MPM in use, these can be threads,
processes or a mixture of both. For example the I<worker> MPM spawns
a number of processes, each running a number of threads. When each
child process is started C<PerlChildInitHandler> handlers are
executed. Notice that they are run for each starting process, not for
each thread.
From that moment on each working thread processes connections until
it's killed by the server or the server is shut down.
=head2 Startup Phases Demonstration Module
Let's look at the following example that demonstrates all the startup
phases:
file:MyApache/StartupLog.pm
---------------------------
package MyApache::StartupLog;
use strict;
use warnings;
use Apache::Log ();
use File::Spec::Functions;
use Apache::Const -compile => 'OK';
my $log_file = catfile "logs", "startup_log";
my $log_fh;
sub open_logs {
my($conf_pool, $log_pool, $temp_pool, $s) = @_;
my $log_path = Apache::server_root_relative($conf_pool, $log_file);
$s->warn("opening the log file: $log_path");
open $log_fh, ">>$log_path" or die "can't open $log_path: $!";
my $oldfh = select($log_fh); $| = 1; select($oldfh);
say("process $$ is born to reproduce");
return Apache::OK;
}
sub post_config {
my($conf_pool, $log_pool, $temp_pool, $s) = @_;
say("configuration is completed");
return Apache::OK;
}
sub child_init {
my($child_pool, $s) = @_;
say("process $$ is born to serve");
return Apache::OK;
}
sub say {
my($caller) = (caller(1))[3] =~ /([^:]+)$/;
printf $log_fh "[%s] - %-11s: %s\n", scalar(localtime), $caller, $_[0];
}
END {
say("process $$ is shutdown\n");
}
1;
And the I<httpd.conf> configuration section:
PerlModule MyApache::StartupLog
PerlOpenLogsHandler MyApache::StartupLog::open_logs
PerlPostConfigHandler MyApache::StartupLog::post_config
PerlChildInitHandler MyApache::StartupLog::child_init
When we perform a server startup followed by a shutdown, the
I<logs/startup_log> is created if it didn't exist already (it shares
the same directory with I<error_log> and other standard log files),
and each stage appends to it its log information. So when we perform:
% bin/apachectl start && bin/apachectl stop
the following is logged to I<logs/startup_log>:
[Thu Aug 22 15:57:08 2002] - open_logs : process 21823 is born to reproduce
[Thu Aug 22 15:57:08 2002] - post_config: configuration is completed
[Thu Aug 22 15:57:09 2002] - END : process 21823 is shutdown
[Thu Aug 22 15:57:10 2002] - open_logs : process 21825 is born to reproduce
[Thu Aug 22 15:57:10 2002] - post_config: configuration is completed
[Thu Aug 22 15:57:11 2002] - child_init : process 21830 is born to serve
[Thu Aug 22 15:57:11 2002] - child_init : process 21831 is born to serve
[Thu Aug 22 15:57:11 2002] - child_init : process 21832 is born to serve
[Thu Aug 22 15:57:11 2002] - child_init : process 21833 is born to serve
[Thu Aug 22 15:57:12 2002] - END : process 21825 is shutdown
First of all, we can clearly see that Apache always restarts itself
after the first I<post_config> phase is over. The logs show that the
I<post_config> phase is preceded by the I<open_logs> phase. Only
after Apache has restarted itself and has completed the I<open_logs>
and I<post_config> phases again is the I<child_init> phase run for each
child process. In our example we have had the setting
C<StartServers 4>, therefore you can see that four child processes were
started.
Finally you can see that on server shutdown the END {} block has been
executed by the parent server only.
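The phase names in the second column of the log come from the C<caller()> lookup in the module's logging helper. The same trick can be tried in isolation; in this sketch C<calling_sub()> is a hypothetical helper and the two subs merely borrow the handler names.

```perl
use strict;
use warnings;

# Return the unqualified name of the subroutine that called this
# helper, e.g. "open_logs" for "main::open_logs".
sub calling_sub {
    my ($name) = (caller(1))[3] =~ /([^:]+)$/;
    return $name;
}

sub open_logs   { return calling_sub() }
sub post_config { return calling_sub() }

print open_logs(), "\n";
print post_config(), "\n";
```

C<caller(1)> describes the frame one level above the helper, and its fourth element is the fully qualified sub name, from which the regular expression keeps only the part after the last colon.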
Apache also specifies the I<pre_config> phase, which is executed
before the configuration files are parsed, but this is of no use to
mod_perl, because mod_perl is loaded only during the configuration
phase.
Now let's discuss each of the mentioned startup handlers and their
implementation in the C<MyApache::StartupLog> module in detail.
=head2 PerlOpenLogsHandler
The I<open_logs> phase happens just before the I<post_config> phase.
Handlers registered by C<PerlOpenLogsHandler> are usually used for
opening module-specific log files.
At this stage the C<STDERR> stream is not yet redirected to
I<error_log>, and therefore any messages to that stream will be
printed to the console the server is starting from (if such exists).
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The handler's configuration scope is
C<L<SRV|docs::2.0::user::config::config/item_SRV>>.
As we have seen in the C<MyApache::StartupLog::open_logs> handler, the
I<open_logs> phase handlers accept four arguments: the configuration
pool, the logging streams pool, the temporary pool and the server
object:
sub open_logs {
my($conf_pool, $log_pool, $temp_pool, $s) = @_;
my $log_path = Apache::server_root_relative($conf_pool, $log_file);
$s->warn("opening the log file: $log_path");
open $log_fh, ">>$log_path" or die "can't open $log_path: $!";
my $oldfh = select($log_fh); $| = 1; select($oldfh);
say("process $$ is born to reproduce");
return Apache::OK;
}
In our example the handler uses the function
C<Apache::server_root_relative()> to set the full path to the log
file, which is then opened for appending and set to unbuffered
mode. Finally it logs the fact that it's running in the parent
process.
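The append-and-unbuffer steps can be checked outside the server. The sketch below assumes a scratch directory from C<File::Temp> instead of the server's I<logs/> directory, and C<open_unbuffered()> is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Open a log file for appending and switch the handle to unbuffered
# (autoflush) mode using the select/$| idiom.
sub open_unbuffered {
    my ($path) = @_;
    open(my $fh, '>>', $path) or die "can't open $path: $!";
    my $oldfh = select($fh); $| = 1; select($oldfh);
    return $fh;
}

my $dir  = tempdir(CLEANUP => 1);       # scratch directory for the demo
my $path = "$dir/startup_log";
my $fh   = open_unbuffered($path);
print {$fh} "process $$ is born to reproduce\n";

# The line is already in the file although $fh has not been closed.
my $size = -s $path;
print "$size bytes written so far\n";
close $fh;
```

Without the C<$| = 1> step the line could sit in Perl's buffer until the handle is closed, which matters when several phases and processes append to the same file.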
As you've seen in the example this handler is configured by adding to
I<httpd.conf>:
PerlOpenLogsHandler MyApache::StartupLog::open_logs
=head2 PerlPostConfigHandler
The I<post_config> phase happens right after Apache has processed the
configuration files, before any child processes were spawned (which
happens at the I<child_init> phase).
This phase can be used for initializing things to be shared between
all child processes. You can do the same in the startup file, but in
the I<post_config> phase you have access to the complete
configuration tree.
META: once mod_perl will have the API for that.
This phase is of type
C<L<RUN_ALL|docs::2.0::user::handlers::intro/item_RUN_ALL>>.
The handler's configuration scope is
C<L<SRV|docs::2.0::user::config::config/item_SRV>>.
In our C<MyApache::StartupLog> example we used the I<post_config()>
handler:
sub post_config {
my($conf_pool, $log_pool, $temp_pool, $s) = @_;
say("configuration is completed");
return Apache::OK;
}
As you can see, its arguments are identical to those of the I<open_logs>
phase's handler. In this example handler we do little more than log
that the configuration was completed and return right away.
As you've seen in the example this handler is configured by adding to
I<httpd.conf>:
PerlPostConfigHandler MyApache::StartupLog::post_config
=head2 PerlChildInitHandler
The I<child_init> phase happens immediately after the child process is
spawned. Each child process (not a thread!) will run the hooks of this
phase only once in its lifetime.
In the prefork MPM this phase is useful for initializing any data
structures which should be private to each process. For example
C<Apache::DBI> pre-opens database connections during this phase and
C<Apache::Resource> sets the process's resource limits.
This phase is of type
C<L<VOID|docs::2.0::user::handlers::intro/item_VOID>>.
The handler's configuration scope is
C<L<SRV|docs::2.0::user::config::config/item_SRV>>.
In our C<MyApache::StartupLog> example we used the I<child_init()>
handler:
sub child_init {
my($child_pool, $s) = @_;
say("process $$ is born to serve");
return Apache::OK;
}
The I<child_init()> handler accepts two arguments: the child process
pool and the server object. The example handler logs the pid of the
child process it's run in and returns.
As you've seen in the example this handler is configured by adding to
I<httpd.conf>:
PerlChildInitHandler MyApache::StartupLog::child_init
=head2 PerlChildExitHandler
META: not implemented yet
=head1 Maintainers
Maintainer is the person(s) you should contact with updates,
corrections and patches.
=over
=item *
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=back
=head1 Authors
=over
=item *
=back
Only the major authors are listed above. For contributors see the
Changes file.
=cut
1.28 +1 -1 modperl-docs/src/docs/2.0/user/install/install.pod
Index: install.pod
===================================================================
RCS file: /home/cvs/modperl-docs/src/docs/2.0/user/install/install.pod,v
retrieving revision 1.27
retrieving revision 1.28
diff -u -r1.27 -r1.28
--- install.pod 25 Aug 2002 16:20:52 -0000 1.27
+++ install.pod 2 Sep 2002 06:34:51 -0000 1.28
@@ -4,7 +4,7 @@
=head1 Description
-This chapter provides an indepth mod_perl 2.0 installation coverage.
+This chapter provides an in-depth mod_perl 2.0 installation coverage.
=head1 Prerequisites