https://issues.apache.org/bugzilla/show_bug.cgi?id=57215

--- Comment #3 from Konstantin Kolinko <knst.koli...@gmail.com> ---
(In reply to Mark Thomas from comment #1)
> For example, what gets returned for a request to
> "%2Ffoo%2F%2E%2E%2Fpath"?

RFC7230 2.7.3. "http and https URI Normalization and Comparison" says about
http and https URIs:

   ...
   such URIs are normalized and compared according to the
   algorithm defined in Section 6 of [RFC3986]


http://tools.ietf.org/html/rfc3986
RFC3986 2.3. Unreserved Characters [1]

   Characters that are allowed in a URI but do not have a reserved
   purpose are called unreserved.  These include uppercase and lowercase
   letters, decimal digits, hyphen, period, underscore, and tilde.

      unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

   URIs that differ in the replacement of an unreserved character with
   its corresponding percent-encoded US-ASCII octet are equivalent: they
   identify the same resource.  However, URI comparison implementations
   do not always perform normalization prior to comparison (see Section
   6).  For consistency, percent-encoded octets in the ranges of ALPHA
   (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),
   underscore (%5F), or tilde (%7E) should not be created by URI
   producers and, when found in a URI, should be decoded to their
   corresponding unreserved characters by URI normalizers.

RFC3986 6.2.2.2. Percent-Encoding Normalization

   The percent-encoding mechanism (Section 2.1) is a frequent source of
   variance among otherwise identical URIs.  In addition to the case
   normalization issue noted above, some URI producers percent-encode
   octets that do not require percent-encoding, resulting in URIs that
   are equivalent to their non-encoded counterparts.  These URIs should
   be normalized by decoding any percent-encoded octet that corresponds
   to an unreserved character, as described in Section 2.3.


So it looks that RFC3986 says to url-decode the above listed "unreserved"
characters before performing normalization, but only them.

"%2Ffoo%2F%2E%2E%2Fpath" becomes "%2Ffoo%2F..%2Fpath" but nothing more as %2F
is not decoded.

In regards to r1640083 the "canonicalContextPath.equals(candidate)" comparison
looks fragile.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to