Dear all,
It looks as if Edge is pickier about the encoding of URLs in the
"Location:" header field (see e.g. the recent entry in the OpenACS issue
tracker [1]). RFC 7231 states [2] that
Location = URI-reference
but it also notes:
Note: Some recipients attempt to recover from Location fields that
are not valid URI references. This specification does not mandate
or define such processing, but does allow it for the sake of robustness.
The ABNF in [3] makes clear that the URL has to be encoded (see the
snippet below for path segments):
   URI-reference = URI / relative-ref
   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
   relative-part = "//" authority path-abempty
                 / path-absolute
                 / path-noscheme
                 / path-empty
   path-abempty  = *( "/" segment )
   path-absolute = "/" [ segment-nz *( "/" segment ) ]
   path-noscheme = segment-nz-nc *( "/" segment )
   path-rootless = segment-nz *( "/" segment )
   path-empty    = 0<pchar>
   segment       = *pchar
   segment-nz    = 1*pchar
   segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                 ; non-zero-length segment without any colon ":"
   pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
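To make the grammar above concrete, here is a minimal sketch in Python (not NaviServer code) that tests a single path segment against the "segment = *pchar" rule. The helper name is_valid_segment is hypothetical; the character sets for unreserved and sub-delims are taken from RFC 3986.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"  (RFC 3986, Appendix A)
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT_ENCODED})"

# segment = *pchar
SEGMENT_RE = re.compile(rf"{PCHAR}*")


def is_valid_segment(segment: str) -> bool:
    """Return True if the whole string matches 'segment = *pchar'."""
    return SEGMENT_RE.fullmatch(segment) is not None
```

Under this rule a space, a literal "/" inside a segment, a non-ASCII character, or a "%" that is not followed by two hex digits all make the segment invalid.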
NaviServer passes the URL as-is, e.g. from ns_returnredirect, to the
"Location:" field.
So the question is: should NaviServer
a) take care of this encoding,
b) take care of this encoding via an optional flag,
c) do nothing and leave the responsibility to the application
   programmer (current situation), or
d) provide a warning when an "obviously" unencoded URL is passed to
   ns_returnredirect?
I think (a) is not useful, since NaviServer can't decide from the string
alone whether a "/" in the path is e.g. a delimiter or part of a segment.
Furthermore, it would break existing programs that already encode their
URLs correctly. (b) might be useful in simple cases.
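Both objections to (a) can be illustrated with a small sketch. It uses Python's urllib.parse.quote purely as a stand-in for "some automatic encoder"; the example paths are made up and nothing here reflects NaviServer's own behavior.

```python
from urllib.parse import quote

# An application that already encoded its URL correctly ...
already_encoded = "/files/my%20report.txt"
# ... gets it encoded a second time: "%" itself becomes "%25".
double_encoded = quote(already_encoded)

# The same "/" can be a path delimiter or data inside one segment;
# the encoder has to be told which, it cannot infer it from the string.
as_delimiter = quote("a/b")        # "/" is kept as a delimiter
as_data = quote("a/b", safe="")    # "/" is treated as data and encoded

print(double_encoded)
print(as_delimiter, as_data)
```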
I am inclined towards (d), although an exact check for every character
that should have been escaped might be too costly for some characters
(e.g. checking whether a "%" was used only as an escape indicator).
Still, (d) would give an application developer hints about where the
URL encoding is probably missing.
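A cheap single-pass check in the spirit of (d) could look like the following sketch. It is written in Python for illustration only; the function name unencoded_hints and the choice of what counts as "obviously" unencoded are assumptions, not an existing NaviServer API. It flags characters that can never appear literally in a URI reference and any "%" that is not followed by two hex digits. It cannot detect a literal "%" that happens to be followed by two hex digits, which is exactly the ambiguity mentioned above.

```python
# Characters allowed literally in a URI reference per RFC 3986:
# unreserved, gen-delims and sub-delims ("%" is handled separately).
_ALLOWED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    "-._~"          # rest of unreserved
    ":/?#[]@"       # gen-delims
    "!$&'()*+,;="   # sub-delims
)
_HEX = frozenset("0123456789ABCDEFabcdef")


def unencoded_hints(url):
    """Return (index, char) pairs that suggest missing URL encoding."""
    hints = []
    i, n = 0, len(url)
    while i < n:
        ch = url[i]
        if ch == "%":
            # A well-formed escape is "%" followed by two hex digits.
            if i + 2 < n and url[i + 1] in _HEX and url[i + 2] in _HEX:
                i += 3
                continue
            hints.append((i, ch))
        elif ch not in _ALLOWED:
            hints.append((i, ch))
        i += 1
    return hints
```

A caller would log a warning whenever the returned list is non-empty and still pass the URL through unchanged, so existing applications keep working.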
While looking at nsd/urlencode.c, I saw that the encoding is more
conservative than its comment says (.... "All ASCII control characters
(00-1f and 7f) and the URI 'delim' and 'unwise' characters are encoded"
...): it also encodes the characters from 0x80 to 0xff. Am I right to
interpret this as a consequence of the differences/confusions between
RFC 1738 (1994) and RFC 1808 (1995) vs. RFC 2396 (1998) [4], see [5]?
The code says that it conforms to RFC 1738, so an update to at least
RFC 2396 seems appropriate.
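For discussion, the behavior described above can be modeled roughly as follows. This is a Python sketch, not a transcription of nsd/urlencode.c: the 'delim' and 'unwise' sets are taken from RFC 2396 section 2.4.3, the space character is included on the assumption that it is encoded too (RFC 2396 excludes it alongside those sets), and the upper-case hex output is arbitrary.

```python
# RFC 2396, section 2.4.3 "Excluded US-ASCII Characters"
DELIMS = b'<>#%"'
UNWISE = b"{}|\\^[]`"


def must_encode(byte: int) -> bool:
    """True for bytes that the modeled encoder escapes."""
    return (
        byte <= 0x20 or byte == 0x7F          # controls (00-1f, 7f) and space
        or byte >= 0x80                       # the extra range: 0x80-0xff
        or byte in DELIMS or byte in UNWISE
    )


def encode(data: bytes) -> str:
    """Percent-encode byte-wise; all other bytes pass through unchanged."""
    return "".join(f"%{b:02X}" if must_encode(b) else chr(b) for b in data)
```

With this model, the UTF-8 bytes of a non-ASCII character come out as a sequence of %XX escapes, while reserved characters such as "/" and "?" are left alone.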
Comments?
-g
[1] http://openacs.org/bugtracker/openacs/bug?bug_number=3312
[2] https://tools.ietf.org/html/rfc7231#page-68
[3] https://tools.ietf.org/html/rfc3986#appendix-A
[4] https://tools.ietf.org/html/rfc2396
[5] https://tools.ietf.org/html/rfc2396#appendix-G.2
_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel