Daniel J. Popowich wrote:
Jorey Bump writes:

Gregory (Grisha) Trubetskoy wrote:


Perhaps we can add something to the docs that says "this attribute gets its data from the argument to the HTTP GET method, which is usually just the path in the URL and does not include the protocol, hostname and port. It is only filled in completely when the server is used as a proxy"..?

How about: "This attribute gets its data from the client-supplied Request-URI."



I'd prefer something more explicit (because I'm dense and need 2x4s
about the head).  I humbly offer the following to the editorial board:

unparsed_uri
    String.  The URI without any parsing performed. This is the
    argument passed to, e.g., the HTTP GET method, and so is
    completely dependent on the value submitted by the client; you
    have been warned.  Clients typically send a partial URI containing
    only the path and query with no hostinfo, e.g.:
    "GET /path/to/handler?query=value HTTP/1.1".  (Read-Only)

parsed_uri
    Tuple. The value of unparsed_uri broken down into pieces. (scheme,
    hostinfo, user, password, hostname, port, path, query,
    fragment). The apache module defines a set of URI_* constants that
    should be used to access elements of this tuple. Example:

        fname = req.parsed_uri[apache.URI_PATH]

    Please note: as stated for unparsed_uri, the value is completely
    dependent on the URI submitted by the client.  Since it is typical
    for clients to submit only the path and query components, the rest
    of the elements in the tuple will often be None.  This is not a
    bug.  (Read-Only)

args
    String. Same as parsed_uri[apache.URI_QUERY] (and CGI
    QUERY_STRING). (Read-Only)

uri
    String.  The path portion of the URI. Same as
    parsed_uri[apache.URI_PATH].  (Read-Only)

hostname
    String. Host, as set by a full URI from, e.g., the HTTP GET
    method, or in the absence of a full URI, the value of the Host header.
    In either case, the value is provided by the client; you have been
    warned.  Note: when set by the Host header (which is typical) this
    value will differ from parsed_uri[apache.URI_HOSTNAME] (which will
    be None).  See unparsed_uri and parsed_uri.  Also, in rare cases
    (no full URI, no Host header) this value can be None.  (Read-Only)
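
To make the parsed_uri note concrete, here is a rough sketch in plain Python (standard library only, not mod_python itself) of why most elements come back as None for a typical request. The function name parse_request_uri and the PATH index are made up for illustration, and urlsplit only approximates what Apache does internally.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for apache.URI_PATH: the position of "path" in the
# nine-element tuple order given in the proposed text above.
PATH = 6

def parse_request_uri(unparsed_uri):
    """Split a client-supplied Request-URI into a parsed_uri-like tuple.

    Components the client did not send are reported as None.
    """
    parts = urlsplit(unparsed_uri)
    return (
        parts.scheme or None,    # scheme
        parts.netloc or None,    # hostinfo ([user[:password]@]host[:port])
        parts.username,          # user
        parts.password,          # password
        parts.hostname,          # hostname
        parts.port,              # port
        parts.path or None,      # path
        parts.query or None,     # query
        parts.fragment or None,  # fragment
    )

# Typical request: only path and query are present.
typical = parse_request_uri("/path/to/handler?query=value")

# Proxy-style request: the client sent a full URI.
full = parse_request_uri("http://www.example.com:8080/path/to/handler?query=value")
```

With the typical request, everything except the path and query elements is None; only the proxy-style request fills in scheme, hostinfo, hostname and port.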

+1 on your definitions, but I have another issue, related to this thread...

This discussion leads me to believe that req.hostname, in its current implementation, is hopelessly ambiguous. It is already doing what we've concluded in this thread to be a Bad Thing(TM) by automagically substituting one of two completely unrelated values for the other simply to avoid returning None.

Can anyone conceive of a use case where it would be alright to rely on this value, even when it's been arbitrarily populated by a client-supplied absoluteURI (via a proxy, for example)? What would a developer expect to be contained in this value? For myself, I would prefer it to be a high-level interface to req.headers_in['Host'], in which case, None would be somewhat meaningful.
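
To show the ambiguity, here is a minimal sketch in plain Python of the fallback as described in this thread, next to the simpler behaviour I'd prefer. It is an approximation, not Apache's or mod_python's actual code: the names current_hostname and proposed_host are hypothetical, headers is an ordinary dict, and stripping the port from the Host header is my assumption about the current behaviour.

```python
from urllib.parse import urlsplit

def current_hostname(request_uri, headers):
    """Approximate req.hostname: the host from a full Request-URI wins,
    otherwise fall back to the Host header, otherwise None."""
    host = urlsplit(request_uri).hostname
    if host is not None:
        return host
    header = headers.get("Host")
    if not header:
        return None
    # Assumption: any ":port" suffix is dropped from the header value.
    return urlsplit("//" + header).hostname

def proposed_host(headers):
    """Proposed req.host: the Host header as sent, or None if absent."""
    return headers.get("Host")

headers = {"Host": "www.example.com:8080"}

# Same headers, two different answers depending on the Request-URI form.
via_proxy = current_hostname("http://other.example.org/x", headers)
direct = current_hostname("/x", headers)
```

Here via_proxy and direct differ even though the Host header is identical, which is exactly the problem; proposed_host(headers) would return the same value in both cases.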

Even better, deprecate req.hostname in 3.2, where we can add req.host to contain the value in req.headers_in['Host']. Then drop req.hostname in 3.3 completely. This will give developers some time to adapt.

Finally, I'm getting the impression that most developers are looking for a portable way to get the ServerName, as defined in the Apache configuration. This may currently be achieved in a variety of ways, including:

    servername = req.server.server_hostname

or:

    req.add_common_vars()
    servername = req.subprocess_env['SERVER_NAME']

So, getting back to Nicolas' original post, and reaffirming Grisha's point that req.hostname isn't appropriate in his script, maybe req.server.server_hostname will work, in that it allows one to construct a URL that gets the user back to the site, even if it doesn't exactly match the URL displayed in the browser during the original request.
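
Something along these lines, as a sketch only. The helper site_url is hypothetical, it is written in plain Python so it can be tried outside Apache, and the scheme and port defaults are assumptions that would need to match the actual virtual host.

```python
def site_url(server_hostname, path="/", port=80, scheme="http"):
    """Build an absolute URL from the configured server name.

    The port is included only when it is not the default for the scheme.
    """
    default_ports = {"http": 80, "https": 443}
    netloc = server_hostname
    if port != default_ports.get(scheme):
        netloc = "%s:%d" % (server_hostname, port)
    return "%s://%s%s" % (scheme, netloc, path)

# In a handler this might be called as:
#     site_url(req.server.server_hostname, req.uri)
plain = site_url("www.example.com", "/path/to/handler")
alt_port = site_url("www.example.com", "/path/to/handler", port=8080)
```

Because the name comes from the server configuration rather than the client, the result is predictable no matter what the client put in the request line or the Host header.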

Does the fact that this is a difficult discovery warrant the addition of another high-level attribute, req.servername?


