I'll top post my "solution"; scare quoted because I'm still not sure
this is the smartest idea:

    environ['wsgiorg.path-segments'] = ['catalog', 'NEC', 'Computers', 'Laptop', 'LN500/9DW']

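A minimal sketch of how a server or dispatcher might fill that key,
assuming it still has the raw, undecoded request path at hand. The
helper name path_segments and the example path are made up for
illustration, 'wsgiorg.path-segments' is only the key proposed above
and not an established standard, and the code assumes Python 3's
urllib.parse.

```python
from urllib.parse import unquote


def path_segments(raw_path):
    """Split a still-%-encoded path on '/', then decode each segment."""
    # Hypothetical helper: split first, so an encoded slash (%2F) is
    # never mistaken for a path delimiter, and only then decode.
    # Leading and trailing slashes are dropped before splitting.
    return [unquote(segment) for segment in raw_path.strip('/').split('/')]


raw_path = '/catalog/NEC/Computers/Laptop/LN500%2F9DW'  # example input
environ = {}  # stand-in for a real WSGI environ dict
environ['wsgiorg.path-segments'] = path_segments(raw_path)
print(environ['wsgiorg.path-segments'])
# ['catalog', 'NEC', 'Computers', 'Laptop', 'LN500/9DW']
```
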
Robert Brewer wrote:
> All HTTP URI are /-delimited, and any '/' appearing in a single segment
> that is not intended to participate in the hierarchy semantics must be
> %-encoded before transmitting it over HTTP.

I wholeheartedly agree. And your explanation is clearer than mine.

>> IMHO [changing CP's wsgiserver to do decoding] is the wrong answer
> Why?
>

Because then I'm stuck monkey patching every WSGI server (and/or stuck
using my own URL dispatcher) so that I don't lose the information that
one of the forward slashes is NOT a path delimiter. You said that
%-encoding is meant for slashes not participating in hierarchy
semantics, if I read you correctly; so I think you'll agree with me on
this.

> You have to explain why you think the application should receive %XX encoded
> URI's instead of decoded ones. What's the benefit? I only see a con:
> every piece of middleware that cares has to repeat the decoding of
> PATH_INFO and SCRIPT_NAME, wasting CPU and memory.
>

I was aware of this trade-off, which is why I'm still not sure the
application should receive the %-encoded URIs. My app was forced to
split the URL on the '/' delimiters. If I can get the framework to do
that job while dispatching, so much the better. Hence the solution I
top posted.

My problem arises when I output a link created by suitably %-encoding
each of these path segments and joining them:

    '/'.join(['NEC', 'Computers', 'Laptop', 'LN500/9DW'])

And after the user clicks that link, the framework gives me (and Routes
has no way to avoid this when Paste is the one decoding the whole
path):

    ['NEC', 'Computers', 'Laptop', 'LN500', '9DW']

Think dispatching to a ``callable(*segments, **urlvariables)``. I think
we'll agree this is not what the app writer intended. And I'm out of
luck if the WSGI server/dispatcher is the one doing the urldecoding.

> According to [1], the right answer is "yes":
>

I'll see your CGI draft and raise you the URI spec[2].

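To make the failure concrete, here is a rough sketch of that round
trip, assuming Python 3's urllib.parse. The names build_link,
decode_then_split and split_then_decode are invented here and are not
Paste or Routes APIs; they only mimic the two possible orders of
operations.

```python
from urllib.parse import quote, unquote


def build_link(segments):
    # safe='' makes quote() encode '/' as %2F inside a segment.
    return '/' + '/'.join(quote(s, safe='') for s in segments)


def decode_then_split(path):
    # Mimics a server that decodes the whole path before anyone splits it.
    return unquote(path).strip('/').split('/')


def split_then_decode(path):
    # Mimics splitting on the real delimiters first, then decoding each part.
    return [unquote(s) for s in path.strip('/').split('/')]


segments = ['NEC', 'Computers', 'Laptop', 'LN500/9DW']
link = build_link(segments)
print(link)                     # /NEC/Computers/Laptop/LN500%2F9DW
print(decode_then_split(link))  # ['NEC', 'Computers', 'Laptop', 'LN500', '9DW']
print(split_then_decode(link))  # ['NEC', 'Computers', 'Laptop', 'LN500/9DW']
```

Only the second order gives back the four segments the link was built
from.
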
When you've read the last sentence, you'll see how unoriginal the top
posted solution was:

> 2.4.2. When to Escape and Unescape
>
> A URI is always in an "escaped" form, since escaping or unescaping a
> completed URI might change its semantics. Normally, the only time
> escape encodings can safely be made is when the URI is being created
> from its component parts; each component may have its own set of
> characters that are reserved, so only the mechanism responsible for
> generating or interpreting that component can determine whether or
> not escaping a character will change its semantics. Likewise, a URI
> must be separated into its components before the escaped characters
> within those components can be safely decoded.

[1] http://cgi-spec.golux.com/draft-coar-cgi-v11-03-clean.html#6.1.6
[2] http://www.ietf.org/rfc/rfc2396.txt

There is a CGI Informational RFC somewhere, which I've read diagonally
coming here to grumble.

-- 
Luís Bruno