Dear all,

It looks as if Edge is pickier about the encoding of URLs in the "Location:" header field (see, e.g., the recent entry in the OpenACS issue tracker [1]). RFC 7231 states [2] that

     Location = URI-reference

but it also notes:

      Note: Some recipients attempt to recover from Location fields that
      are not valid URI references.  This specification does not mandate
      or define such processing, but does allow it for the sake of robustness.

The ABNF in [3] makes clear that the URL has to be encoded (see the snippet for path segments):

      URI-reference = URI / relative-ref
      relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
      relative-part = "//" authority path-abempty
                    / path-absolute
                    / path-noscheme
                    / path-empty

      path-abempty  = *( "/" segment )
      path-absolute = "/" [ segment-nz *( "/" segment ) ]
      path-noscheme = segment-nz-nc *( "/" segment )
      path-rootless = segment-nz *( "/" segment )
      path-empty    = 0<pchar>


      segment       = *pchar
      segment-nz    = 1*pchar
      segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                    ; non-zero-length segment without any colon ":"
      pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
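
To make the pchar rule concrete, here is a minimal sketch in Python (not NaviServer code; the names SEGMENT_RE and is_valid_segment are made up for illustration) that checks a single path segment against "segment = *pchar":

```python
import re

# Character classes from RFC 3986, Appendix A.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# segment = *pchar
# pchar   = unreserved / pct-encoded / sub-delims / ":" / "@"
SEGMENT_RE = re.compile(rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT_ENCODED})*")


def is_valid_segment(segment: str) -> bool:
    """Return True if the whole string matches the 'segment' rule."""
    return SEGMENT_RE.fullmatch(segment) is not None
```

With this, "foo%20bar" is a valid segment, while "foo bar" (raw space), "100%" (stray percent sign) and "a/b" (contains a delimiter) are not.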


NaviServer passes the URL unchanged from, e.g., ns_returnredirect to the "Location:" header field.

So the question is, should NaviServer

a) take care of this encoding,
b) take care of this encoding via an optional flag,
c) do nothing and leave the responsibility to the application programmer (current situation), or
d) provide a warning when an "obviously" unencoded URL is passed to ns_returnredirect?

I think (a) is not useful, since NaviServer cannot decide from the string alone whether a "/" in the path is, e.g., a delimiter or part of a segment. Furthermore, it would break existing programs that already encode their URLs correctly. (b) might be useful in simple cases.
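
Both problems with (a) can be shown with a small sketch. It uses Python's urllib.parse.quote purely as a stand-in for an unconditional encoder; it is not what NaviServer does, and the example paths are made up:

```python
from urllib.parse import quote

# An application that already encoded its URL correctly ...
already_encoded = "/files/my%20report.txt"
# ... gets its "%" encoded a second time by an unconditional encoder.
double_encoded = quote(already_encoded)
print(double_encoded)  # /files/my%2520report.txt

# The encoder also has to guess what a "/" means:
slash_as_delimiter = quote("a/b")        # "/" is kept as a path delimiter
slash_as_data = quote("a/b", safe="")    # "/" is treated as segment data
print(slash_as_delimiter)  # a/b
print(slash_as_data)       # a%2Fb
```

Only the caller knows which of the two readings of "/" was intended, which is why the decision cannot be made from the string alone.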

I am inclined towards (d), although an exact check of every character that should have been escaped might be too costly for some characters (e.g. checking whether a "%" is used only as an escape indicator). Still, via (d) an application developer would get hints about where the URL encoding is probably missing.
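
A cheap check along the lines of (d) could look roughly like the following sketch. It is Python rather than C, the name looks_unencoded is hypothetical, and it is only a heuristic: it flags characters that can never appear in a URI reference per RFC 3986, plus a "%" that is not followed by two hex digits.

```python
import re

# Anything outside unreserved / reserved / "%" is suspicious, and so is a
# "%" that does not start a pct-encoded triplet.
_SUSPICIOUS = re.compile(
    r"[^A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]|%(?![0-9A-Fa-f]{2})"
)


def looks_unencoded(url: str) -> bool:
    """Heuristic: True if the URL 'obviously' lacks percent-encoding."""
    return _SUSPICIOUS.search(url) is not None
```

For example, "/path/with space" and "/a?x=100%" would trigger the warning, while "/path/with%20space" would not. A literal "%20" that was meant as data, or a "/" that was meant as part of a segment, cannot be detected this way, so the check can only give hints.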

While looking at nsd/urlencode.c, I saw that the encoding is more conservative than its comment says (... "All ASCII control characters (00-1f and 7f) and the URI 'delim' and 'unwise' characters are encoded" ...): it also encodes the characters from 0x80 to 0xff. Do I interpret this correctly as stemming from the differences/confusions between RFC1738 (1994) and RFC1808 (1995) vs. RFC2396 (1998) (see [5])? The code says it conforms to RFC1738, so an update to at least RFC2396 [4] seems appropriate.
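
To spell out the behaviour described above, here is a sketch of the byte classes involved. It is Python, not the actual table from nsd/urlencode.c; the 'delims' and 'unwise' sets are taken from RFC 2396, section 2.4.3, the names must_encode and encode are hypothetical, and the uppercase hex output is my choice. The handling of the space character and other details are deliberately left out.

```python
# 'delims' and 'unwise' as defined in RFC 2396, section 2.4.3.
RFC2396_DELIMS = '<>#%"'
RFC2396_UNWISE = "{}|\\^[]`"


def must_encode(byte: int) -> bool:
    """True for the byte classes described above."""
    if byte <= 0x1F or byte == 0x7F:   # ASCII control characters
        return True
    if byte >= 0x80:                   # observed in addition to the comment
        return True
    return chr(byte) in RFC2396_DELIMS + RFC2396_UNWISE


def encode(data: bytes) -> str:
    """Percent-encode every byte for which must_encode() is true."""
    return "".join(f"%{b:02X}" if must_encode(b) else chr(b) for b in data)
```

With this sketch, the UTF-8 bytes of "ä" come out as "%C3%A4", and "a<b" becomes "a%3Cb".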

Comments?

-g

[1] http://openacs.org/bugtracker/openacs/bug?bug_number=3312
[2] https://tools.ietf.org/html/rfc7231#page-68
[3] https://tools.ietf.org/html/rfc3986#appendix-A
[4] https://tools.ietf.org/html/rfc2396
[5] https://tools.ietf.org/html/rfc2396#appendix-G.2
