On Tue, 29 Nov 2005, Nicolas Lehuen wrote:
    def current_url(req):
        [snip]
        # host
        current_url.append(req.hostname)
        [snip]
This part isn't going to work reliably if you are not using virtual hosts
and just bind to an IP address. Deciphering the URL is an impossible task -
I used to have similar code in my applications, but lately I realized that
it does not work reliably, and it is much simpler to just treat the base URL
as a configuration item...
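A minimal sketch of the "configuration item" approach, under stated
assumptions: the option name site.base_url and the helper absolute_url are
hypothetical, and the options argument is any dict-like mapping. In
mod_python it could plausibly be the table returned by req.get_options(),
populated by a PythonOption directive.

```python
def absolute_url(options, uri, option_name="site.base_url"):
    """Join a configured base URL with a server-relative path.

    `options` is any dict-like mapping; `option_name` is a hypothetical
    configuration key chosen for this example.
    """
    base = options.get(option_name)
    if not base:
        # Fail loudly instead of guessing the host from the request.
        raise KeyError("missing configuration option: %s" % option_name)
    # Normalize so exactly one slash separates the base and the path.
    return base.rstrip("/") + "/" + uri.lstrip("/")
```

In a handler this might be called as absolute_url(req.get_options(), req.uri),
so the scheme, host and port come from configuration rather than from
whatever the request happens to carry.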
First question: is there a simpler way to do this? Ironically, when using
mod_rewrite, you get an environment variable named SCRIPT_URI which is
precisely what I need (SCRIPT_URL, also added by mod_rewrite, is equivalent
to req.uri... Don't ask me why). But relying on it isn't safe since
mod_rewrite isn't always used.
Well, here's how mod_rewrite does it:
    /*
     *  create the SCRIPT_URI variable for the env
     */

    /* add the canonical URI of this URL */
    thisserver = ap_get_server_name(r);
    port = ap_get_server_port(r);
    if (ap_is_default_port(port, r)) {
        thisport = "";
    }
    else {
        apr_snprintf(buf, sizeof(buf), ":%u", port);
        thisport = buf;
    }
    thisurl = apr_table_get(r->subprocess_env, ENVVAR_SCRIPT_URL);

    /* set the variable */
    var = apr_pstrcat(r->pool, ap_http_method(r), "://", thisserver, thisport,
                      thisurl, NULL);
    apr_table_setn(r->subprocess_env, ENVVAR_SCRIPT_URI, var);

    /* if filename was not initially set,
     * we start with the requested URI
     */
    if (r->filename == NULL) {
        r->filename = apr_pstrdup(r->pool, r->uri);
        rewritelog(r, 2, "init rewrite engine with requested uri %s",
                   r->filename);
    }
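For readers more at home in Python, the URL assembly in the C excerpt can be
sketched roughly as follows. This is an illustration only: the function
script_uri and the DEFAULT_PORTS table are hypothetical, and the scheme,
server name and port are passed in as plain arguments rather than obtained
from httpd.

```python
# Default ports per scheme; a stand-in for httpd's ap_is_default_port().
DEFAULT_PORTS = {"http": 80, "https": 443}


def script_uri(scheme, server_name, port, path):
    """Assemble scheme://host[:port]path, omitting a default port."""
    if port == DEFAULT_PORTS.get(scheme):
        port_part = ""
    else:
        port_part = ":%d" % port
    return "%s://%s%s%s" % (scheme, server_name, port_part, path)
```

The result is only as reliable as the server name fed into it, which is the
weak point when the server is bound to a bare IP address.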
Second question: if there isn't any simpler way to do this, should we add it
to mod_python? Either as a function like the one above in mod_python.util,
or as a member of the request object (named something like url to match the
other member named uri, but that's just teasing).
I don't know... Since the result is going to be half-baked, I think a
more interesting and mod_python-ish thing to do would be to expose all the
APIs used in the above code (e.g. ap_get_server_name, ap_is_default_port,
ap_http_method) FIRST, then think about this.
And third question (in pure Spanish Inquisition style): why is
req.parsed_uri returning a tuple full of Nones except for the uri and
path_info parts?
This is an httpd question most likely...
Ah, fourth question: why are we (mod_python, mod_rewrite and the CGI
environment variables) using the terms "URI" and "URL" to distinguish
between a full, absolute resource path and a path relative to the server,
whereas the definitions of URLs and URIs are very vague and nothing close to
this (http://www.w3.org/TR/uri-clarification/#contemporary)? Shouldn't we
save our souls and a lot of saliva by choosing better names?
No, we (mod_python) should just use the exact same names that httpd uses.
If we come up with better names, it's just going to make things even more
confusing.
OK, OK, fifth question: we made req.filename and other members writable.
But when those attributes are changed, as Graham noted a while ago, the
other dependent ones aren't, leading to inconsistencies (for example, if you
change req.filename, req.canonical_filename isn't changed). Should we try to
solve this
The solution is to make req.canonical_filename writable too, and to document
that if you change req.filename you should consider changing
req.canonical_filename as well, along with what happens if you do not.
and provide a clear definition of the various parts of a request
for mod_python 3.3?
Yes, that'd be good :)
Grisha