We use a TransHandler to (among other things) manage name-based virtual
hosts (simply put, given the incoming Host: header plus URI, map to a file).
We (of course) sanitize the incoming URI and Host. It works fine.
I "save" the sanitized hostname like so:
    $r->header_in('Host',$host);
    $r->subprocess_env('SERVER_NAME',$host);
    $r->parsed_uri->hostname($host);
I used to use just the first line, but I added the other two thinking
they might fix our problems...
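
For context, here is a minimal sketch of the kind of Host sanitizing meant above. It is not our actual code: sanitize_host is a hypothetical helper name, and the rules it applies (lower-casing, dropping a port and a trailing dot, allowing only letters, digits, hyphens and dots) are assumptions about what "sanitized" should mean.

```perl
use strict;

# Hypothetical helper: reduce a client-supplied Host header to a safe,
# lower-case hostname, or return undef if it does not look like one.
sub sanitize_host {
    my ($raw) = @_;
    return undef unless defined $raw;

    my $host = lc $raw;
    $host =~ s/:\d+\z//;    # drop an optional ":port" suffix
    $host =~ s/\.\z//;      # drop a single trailing dot

    # Accept only dot-separated labels made of letters, digits and
    # hyphens, where no label starts or ends with a hyphen.
    my $label = qr/[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/;
    return undef unless $host =~ /\A$label(?:\.$label)*\z/;

    return $host;
}
```

In the TransHandler this would be called as sanitize_host($r->header_in('Host')), and the result (when defined) is the $host that the three assignments shown earlier store back into the request.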
First problem (somewhat minor):
$ENV{'SERVER_NAME'} remains "unsanitized" (i.e., it's still exactly what
the client sent in the "Host:" header). This is not a big deal because the
sanitized host gets set properly in $ENV{'HTTP_HOST'}. Scripts can just
use that variable instead.
Second problem (bigger):
For logging, we use CLF with the virtual host name tacked on the front
of the line (using %V in the LogFormat). Yes, we have "UseCanonicalName Off",
and I've read http://apache.org/docs/mod/core.html#usecanonicalname, so I
know that %V and SERVER_NAME get set to whatever the client sends.
(And I can't turn it on, because then %V is always ServerName, and
suddenly no "virtual hosts".)
I experimented by moving the host sanitization into a PostReadRequestHandler.
Same results. I thought this phase was "...where you can examine HTTP
headers and change them before Apache gets a crack at them" (TPJ #9, p.6).
Here's the relevant bit of Apache code (1.3.9) in http_core.c in the
ap_get_server_name() function:
    if (d->use_canonical_name == USE_CANONICAL_NAME_OFF) {
        return r->hostname ? r->hostname : r->server->server_hostname;
    }
There's no $r->hostname method in mod_perl that I can find, and
unfortunately $r->server->server_hostname is read-only.
I can only think of a couple of options: hack http_core.c to do what I want,
or write a custom LogHandler that uses the sanitized host.
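
The LogHandler option might look roughly like the sketch below. It is untested against a live server and rests on several assumptions: the package name My::VHostLog is made up, the sanitized host is read back from the Host header (where the TransHandler stored it), output goes to STDERR unless $LOG_FH is pointed elsewhere, and OK is defined locally so the file also compiles outside Apache.

```perl
package My::VHostLog;    # hypothetical package name

use strict;
use vars qw($LOG_FH);
use POSIX qw(strftime);

use constant OK => 0;    # same value as Apache::Constants::OK

# Assumption: write to STDERR unless a real access-log handle is supplied.
$LOG_FH = \*STDERR;

# Build one Common Log Format line with the virtual host prepended.
sub format_line {
    my (%f) = @_;
    my $user  = (defined $f{user} && length $f{user}) ? $f{user} : '-';
    my $bytes = $f{bytes} ? $f{bytes} : '-';    # CLF logs "-" for zero bytes
    return sprintf '%s %s - %s [%s] "%s" %s %s',
        $f{vhost}, $f{remote_host}, $user, $f{time},
        $f{request}, $f{status}, $bytes;
}

# PerlLogHandler entry point; $r is the mod_perl 1.x request object.
sub handler {
    my $r = shift;

    # The TransHandler stored the sanitized name in the Host header;
    # fall back to the configured ServerName if it is missing.
    my $vhost = $r->header_in('Host') || $r->server->server_hostname;

    my $line = format_line(
        vhost       => $vhost,
        remote_host => $r->get_remote_host,
        user        => $r->connection->user,
        time        => strftime('%d/%b/%Y:%H:%M:%S %z',
                                localtime($r->request_time)),
        request     => $r->the_request,
        status      => $r->status,
        bytes       => $r->bytes_sent,
    );
    print $LOG_FH "$line\n";
    return OK;
}

1;
```

It would be enabled with a "PerlLogHandler My::VHostLog" line in httpd.conf. Keeping the formatting in format_line, separate from the handler, means the line layout can be checked from the command line without a running server.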
Is there any other way?
PS: I'd still like to hear from anyone who is running mod_perl on
Solaris 2.5.1 with Perl 5.005_03 -- I don't want to stick with 5.004 forever.