David Maus <dm...@ictsoc.de> writes:

> The more I think about it the more I grow certain that it is NOT about
> URI encoding but protecting a string.

This is what I mean.

> `[' and `]' are not forbidden per se, they belong to the set of
> reserved characters (see RFC 3986, 2.2.).
>
> "characters in the reserved set are protected from normalization and
> are therefore safe to be used by scheme-specific and producer-specific
> algorithms for delimiting data subcomponents within a URI."
> (RFC 3986, p. 12)
>
> Moreover they are explicitly required in the host part to denote an
> IPv6 address literal (RFC 3986, 3.2.2).
>
> If I am not mistaken then this is a valid http-URI with an XPointer
> fragment pointing to the third `p' element in a locally hosted file:
>
> http://[::1]/foo.xml#xpointer(//p[3])

Thanks for the info. I didn't read RFC 3986 thoroughly.
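
As a quick check, here is a small Python sketch (my own illustration, not
anything taken from our code) of how a lenient parser splits the example
above. Note that `urllib.parse.urlsplit` does not validate its input against
the RFC 3986 grammar, so this only shows where the brackets end up, not
that the URI is strictly valid.

```python
from urllib.parse import urlsplit

# The example URI from the quoted message.
uri = "http://[::1]/foo.xml#xpointer(//p[3])"

parts = urlsplit(uri)

# The brackets in the authority delimit the IPv6 literal and are
# stripped from the parsed host name.
host = parts.hostname

# The brackets in the fragment are kept verbatim as data.
path = parts.path
fragment = parts.fragment

print(host, path, fragment)
```

So the same two characters act as delimiters in one component and as plain
data in another.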

> If we escape but don't unescape there are *other* problems: Depending
> on the protocol an escaped square bracket and an unescaped square
> bracket can have different meanings. The assumption I mentioned refers
> to unescaped characters. A consuming application knows the protocol
> and can infer the characters that need to be escaped.

We cannot unescape if we use %-encoding, as stated before.
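
To make the problem concrete, here is a minimal Python sketch of it as I
understand it. The function names `percent_protect` and `percent_unprotect`
are hypothetical and only stand in for whatever escaping we would do; the
URLs are made up.

```python
def percent_protect(s: str) -> str:
    # Percent-encode only the square brackets, nothing else.
    return s.replace("[", "%5B").replace("]", "%5D")


def percent_unprotect(s: str) -> str:
    # Naive inverse: turn every %5B / %5D back into a bracket.
    return s.replace("%5B", "[").replace("%5D", "]")


literal = "http://example.com/a[1]"        # brackets written literally
encoded = "http://example.com/a%5B1%5D"    # brackets already percent-encoded

# Both inputs collapse to the same protected string ...
same_output = percent_protect(literal) == percent_protect(encoded)

# ... so decoding cannot tell which one was meant.
roundtrip_literal = percent_unprotect(percent_protect(literal))
roundtrip_encoded = percent_unprotect(percent_protect(encoded))
```

Two different inputs map to one protected string, so the second one does
not survive the round trip.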

> ACK. It's not about creating URIs but protecting strings, thus the
> rules for percent escaping don't have to be applied.

Indeed. Ideally, we need to encode "[" and "]" with strings that cannot
ever be found in a URI. Then, it will be possible to decode them safely.
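
A rough Python sketch of that idea, under loud assumptions: I pick "{" and
"}" as placeholders purely for illustration, because they are neither
reserved nor unreserved characters in RFC 3986, and I assume the input is
a well-formed URI. The names `placeholder_protect` and
`placeholder_unprotect` are hypothetical. Since real links may not be
well-formed, the sketch refuses input that already contains a placeholder
instead of silently corrupting it.

```python
# Hypothetical placeholders, chosen because RFC 3986 does not allow
# "{" or "}" anywhere in a URI.
OPEN, CLOSE = "{", "}"


def placeholder_protect(s: str) -> str:
    # Refuse input that would make decoding ambiguous.
    if OPEN in s or CLOSE in s:
        raise ValueError("input already contains a placeholder character")
    return s.replace("[", OPEN).replace("]", CLOSE)


def placeholder_unprotect(s: str) -> str:
    # Exact inverse of placeholder_protect on accepted input.
    return s.replace(OPEN, "[").replace(CLOSE, "]")


literal = "http://example.com/a[1]"        # brackets written literally
encoded = "http://example.com/a%5B1%5D"    # brackets already percent-encoded

for uri in (literal, encoded):
    protected = placeholder_protect(uri)
    assert "[" not in protected and "]" not in protected
    assert placeholder_unprotect(protected) == uri
```

Unlike %-encoding, the two inputs stay distinct after protection, so both
round-trip unchanged.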


Regards,

-- 
Nicolas Goaziou
