David Maus <dm...@ictsoc.de> writes:

> The more I think about it the more I grow certain that it is NOT about
> URI encoding but protecting a string.

This is what I mean.

> `[' and `]' are not forbidden per se, they belong to the set of
> reserved characters (see RFC 3986, 2.2.).
>
> "characters in the reserved set are protected from normalization and
> are therefore safe to be used by scheme-specific and producer-specific
> algorithms for delimiting data subcomponents within a URI."
> (RFC 3986, p. 12)
>
> Moreover they are explicitly required in the host part to denote an
> IPv6 address literal (RFC 3986, 3.2.2).
>
> If I am not mistaken then this is a valid http-URI with an XPointer
> fragment pointing to the third `p' element in a locally hosted file:
>
> http://[::1]/foo.xml#xpointer(//p[3])

Thanks for the info. I didn't read RFC 3986 thoroughly.

> If we escape but don't unescape there are *other* problems: Depending
> on the protocol an escaped square bracket and an unescaped square
> bracket can have different meanings. The assumption I mentioned refers
> to unescaped characters. A consuming application knows the protocol
> and can infer the characters that need to be escaped.

We cannot unescape if we use %-encoding, as stated before.

> ACK. It's not about creating URIs but protecting strings, thus the
> rules for percent escaping don't have to be applied.

Indeed. Ideally, we need to encode "[" and "]" with strings that cannot
ever be found in a URI. Then, it will be possible to decode them safely.

Regards,

-- 
Nicolas Goaziou
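
To make the decoding problem concrete, here is a minimal sketch in Python.
It is only an illustration, not the scheme anyone in this thread has
settled on. All function names are hypothetical, and the backslash is an
assumed escape marker: RFC 3986 does not list it among the characters
allowed in a URI, and the sketch escapes it as well, so the round trip
does not depend on the input being a valid URI.

```python
# Percent-based protection: loses information when the input already
# contains a literal "%5B" or "%5D".
def percent_protect(s):
    return s.replace("[", "%5B").replace("]", "%5D")


def percent_unprotect(s):
    return s.replace("%5B", "[").replace("%5D", "]")


# Assumed escape marker (hypothetical choice for this sketch).
ESC = "\\"


def protect(s):
    """Prefix every bracket, and the marker itself, with the marker."""
    out = []
    for ch in s:
        if ch in (ESC, "[", "]"):
            out.append(ESC)
        out.append(ch)
    return "".join(out)


def unprotect(s):
    """Drop each marker and keep the character that follows it."""
    out = []
    it = iter(s)
    for ch in it:
        if ch == ESC:
            # A trailing lone marker is kept as-is.
            ch = next(it, ESC)
        out.append(ch)
    return "".join(out)


link = "http://[::1]/foo.xml#xpointer(//p[3])"
mixed = "http://example.com/a%5Bb[c]"

# The marker-based scheme restores both strings exactly.
assert unprotect(protect(link)) == link
assert unprotect(protect(mixed)) == mixed

# The percent-based scheme cannot tell the original "%5B" from one it
# produced itself, so the decoded string differs from the input.
assert percent_unprotect(percent_protect(mixed)) != mixed
```

The second string shows why %-encoding cannot be undone here: once the
brackets are encoded, the "%5B" that was already in the string looks the
same as the ones the protection step added.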