Daniel J. Popowich wrote:
Jorey Bump writes:
Gregory (Grisha) Trubetskoy wrote:
Perhaps we can add something to the docs that says "this attribute gets
its data from the argument to the HTTP GET method, which is usually just
the path in the URL and does not include the protocol, hostname and
port. It is only filled in completely when the server is used as a
proxy"..?
How about: "This attribute gets its data from the client-supplied
Request-URI."
I'd prefer something more explicit (because I'm dense and need 2x4s
about the head). I humbly offer the following to the editorial board:
unparsed_uri
String. The URI without any parsing performed. This is the
argument passed to, e.g., the HTTP GET method, and so is
completely dependent on the value submitted by the client; you
have been warned. Clients typically send a partial URI containing
only the path and query with no hostinfo, e.g.:
"GET /path/to/handler?query=value HTTP/1.1". (Read-Only)
parsed_uri
Tuple. The value of unparsed_uri broken down into pieces. (scheme,
hostinfo, user, password, hostname, port, path, query,
fragment). The apache module defines a set of URI_* constants that
should be used to access elements of this tuple. Example:
fname = req.parsed_uri[apache.URI_PATH]
Please note: as stated for unparsed_uri, the value is completely
dependent on the URI submitted by the client. Since it is typical
for clients to submit only the path and query components, the rest
of the elements in the tuple will often be None. This is not a
bug. (Read-Only)
args
String. Same as parsed_uri[apache.URI_QUERY] (and CGI
QUERY_STRING). (Read-Only)
uri
String. The path portion of the URI. Same as
parsed_uri[apache.URI_PATH]. (Read-Only)
hostname
String. Host, as set by a full URI from, e.g., the HTTP GET
method, or in the absence of a full URI, the value of the Host header.
In either case, the value is provided by the client; you have been
warned. Note: when set by the Host header (which is typical) this
value will differ from parsed_uri[apache.URI_HOSTNAME] (which will
be None). See unparsed_uri and parsed_uri. Also, in rare cases
(no full URI, no Host header) this value can be None. (Read-Only)
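To make the definitions above concrete, here is a rough sketch in plain Python. It does not use mod_python at all: describe_request is a hypothetical helper, and the standard library's urllib.parse.urlsplit stands in for Apache's URI parser, so details (such as case folding of the hostname) may differ from what the server really does.

```python
from urllib.parse import urlsplit


def describe_request(request_line, headers):
    """Approximate unparsed_uri, parsed_uri, args, uri and hostname.

    Hypothetical helper for illustration only; not part of mod_python.
    """
    # "GET /path?query HTTP/1.1" -> method, Request-URI, protocol
    method, unparsed_uri, protocol = request_line.split(" ", 2)

    parts = urlsplit(unparsed_uri)
    # urlsplit uses '' for missing pieces; map them to None, as the
    # parsed_uri tuple is described as doing.
    parsed_hostname = parts.hostname
    path = parts.path or None
    query = parts.query or None

    # hostname: taken from a full URI if the client sent one,
    # otherwise from the Host header, otherwise None.
    hostname = parsed_hostname
    if hostname is None:
        host_header = headers.get("Host")
        if host_header:
            # Parse as a network location so any ":port" is dropped.
            hostname = urlsplit("//" + host_header).hostname

    return {
        "unparsed_uri": unparsed_uri,
        "parsed_hostname": parsed_hostname,
        "uri": path,
        "args": query,
        "hostname": hostname,
    }


typical = describe_request(
    "GET /path/to/handler?query=value HTTP/1.1",
    {"Host": "www.example.com:8080"},
)
proxied = describe_request(
    "GET http://other.example.org/x HTTP/1.1",
    {"Host": "www.example.com"},
)
```

In the typical case the hostname element of the parsed URI is None while hostname comes from the Host header; in the proxy-style case both come from the client-supplied absolute URI, whatever the Host header says.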
+1 on your definitions, but I have another issue, related to this thread...
This discussion leads me to believe that req.hostname, in its current
implementation, is hopelessly ambiguous. It is already doing what we've
concluded in this thread to be a Bad Thing(TM) by automagically
interposing two completely unrelated values simply to avoid returning None.
Can anyone conceive of a use case where it would be alright to rely on
this value, even when it's been arbitrarily populated by a
client-supplied absoluteURI (via a proxy, for example)? What would a
developer expect to be contained in this value? For myself, I would
prefer it to be a high-level interface to req.headers_in['Host'], in
which case, None would be somewhat meaningful.
Even better, deprecate req.hostname in 3.2, where we can add req.host to
contain the value in req.headers_in['Host']. Then drop req.hostname in
3.3 completely. This will give developers some time to adapt.
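As a sketch of what that transition could look like from the developer's side (this is a plain-Python stand-in, not mod_python code; RequestSketch and its constructor arguments are made up for illustration, and req.host does not exist today):

```python
import warnings


class RequestSketch:
    """Hypothetical stand-in for the request object during the transition."""

    def __init__(self, headers_in, legacy_hostname=None):
        self.headers_in = headers_in
        self._legacy_hostname = legacy_hostname

    @property
    def host(self):
        # Proposed attribute: exactly the Host header, or None if absent.
        return self.headers_in.get("Host")

    @property
    def hostname(self):
        # Old attribute keeps working for one release, but complains.
        warnings.warn(
            "req.hostname is deprecated; use req.host",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._legacy_hostname


req = RequestSketch({"Host": "www.example.com"}, "www.example.com")
no_host = RequestSketch({})
```

Here req.host returns the header value and no_host.host returns None, which is the "somewhat meaningful" None mentioned above.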
Finally, I'm getting the impression that most developers are looking for
a portable way to get the ServerName, as defined in the Apache
configuration. This may currently be achieved in a variety of ways,
including:
servername = req.server.server_hostname
or:
req.add_common_vars()
servername = req.subprocess_env['SERVER_NAME']
So, getting back to Nicolas' original post, and reaffirming Grisha's
point that req.hostname isn't appropriate in his script, maybe
req.server.server_hostname will work, in that it allows one to construct
a URL that gets the user back to the site, even if it doesn't exactly
match the URL displayed in the browser during the original request.
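For instance, something along these lines. site_url is a hypothetical helper, and the scheme and port defaults are assumptions on my part; in a handler you would pass in req.server.server_hostname as the first argument.

```python
def site_url(server_hostname, path="/", port=80, scheme="http"):
    """Build an absolute URL from the configured server name.

    Hypothetical helper; nothing here comes from the client request.
    """
    default_ports = {"http": 80, "https": 443}
    netloc = server_hostname
    # Only spell out the port when it is not the default for the scheme.
    if port != default_ports.get(scheme):
        netloc = "%s:%d" % (server_hostname, port)
    return "%s://%s%s" % (scheme, netloc, path)


login_url = site_url("www.example.com", "/login")
alt_url = site_url("www.example.com", "/login", port=8080)
```

The result depends only on the server configuration, so a client cannot influence it through the Request-URI or the Host header.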
Does the fact that this is a difficult discovery warrant the addition of
another high-level attribute, req.servername?