Re: [org.apache.sling.xss] namespace mangling

2019-11-25 Thread Julian Sedding
IIRC there was at least one other reason for namespace mangling: to
support a filesystem based caching proxy where URLs are mapped to FS
paths. AFAIK windows doesn't allow the colon character in file or
folder names.

Whether that's an architecturally sound implementation choice is of
course another story. I would argue that it would ideally be the
caching proxy that should handle mapping of URLs to valid FS paths.

FWIW, I agree with removing this functionality from XSS, because that
certainly is not the right place for this logic.

Regards
Julian

On Wed, Nov 20, 2019 at 2:27 PM Daniel Klco  wrote:
>
> Makes sense to me! I definitely agree this is unexpected behavior and given
> current browser support the risk is low.
>
> On Tue, Nov 19, 2019 at 12:03 PM Radu Cotescu  wrote:
>
> > Hi Dan,
> >
> > > On 19 Nov 2019, at 16:18, Daniel Klco  wrote:
> > >
> > > I've seen issues with this in the wild. A client was attempting to link
> > to
> > > external URLs containing colons (bad practice I know, but you get health
> > > care web services to get out of the 1990's) in a HTL script which was
> > > getting mangled even though the URL was not a JCR path.
> >
> > Cough… I might have seen this problem last week… Cough...
> >
> > >
> > > My concern is that if this gets removed could the converse become a
> > > problem?
> >
> > Anything can happen, but it doesn’t mean that the API should work like one
> > of its implementations and the API definitely doesn’t mention anything
> > about JCR namespaces.
> >
> > > How would implementers know when to mangle the paths correctly
> >
> > If you refer to developers trying to expose Resource paths as URLs, then
> > they should ALWAYS use ResourceResolver#map, since this takes care of not
> > only namespace mangling, but also of the mapping configurations. I
> > obviously cannot be the one throwing the stone, since I could also be
> > guilty of not using the correct mechanism to expose a URL for a Resource,
> > but the API has been there for ages (the very beginning of Sling, so 10
> > years?!).
> >
> >
> > > and
> > > does this place the burden on the end developer to support this and thus
> > > potentially lead to errors in the implementation?
> > >
> >
> > I might be too young to have the context why this was needed, but I
> > suspect it’s something related to some user agents not being able to cope
> > with “:” in the path segment of the URI, although the latest RFC [3]
> > definitely allows this and the browsers we use nowadays also have no issues
> > any more. At this point I’d strongly argue that it’s not needed any more to
> > do any kind of mangling.
> >
> > Regards,
> > Radu
> >
> > [3] - https://tools.ietf.org/html/rfc3986 <
> > https://tools.ietf.org/html/rfc3986>


Re: [org.apache.sling.xss] namespace mangling

2019-11-20 Thread Daniel Klco
Makes sense to me! I definitely agree this is unexpected behavior and given
current browser support the risk is low.

On Tue, Nov 19, 2019 at 12:03 PM Radu Cotescu  wrote:

> Hi Dan,
>
> > On 19 Nov 2019, at 16:18, Daniel Klco  wrote:
> >
> > I've seen issues with this in the wild. A client was attempting to link
> to
> > external URLs containing colons (bad practice I know, but you get health
> > care web services to get out of the 1990's) in a HTL script which was
> > getting mangled even though the URL was not a JCR path.
>
> Cough… I might have seen this problem last week… Cough...
>
> >
> > My concern is that if this gets removed could the converse become a
> > problem?
>
> Anything can happen, but it doesn’t mean that the API should work like one
> of its implementations and the API definitely doesn’t mention anything
> about JCR namespaces.
>
> > How would implementers know when to mangle the paths correctly
>
> If you refer to developers trying to expose Resource paths as URLs, then
> they should ALWAYS use ResourceResolver#map, since this takes care of not
> only namespace mangling, but also of the mapping configurations. I
> obviously cannot be the one throwing the stone, since I could also be
> guilty of not using the correct mechanism to expose a URL for a Resource,
> but the API has been there for ages (the very beginning of Sling, so 10
> years?!).
>
>
> > and
> > does this place the burden on the end developer to support this and thus
> > potentially lead to errors in the implementation?
> >
>
> I might be too young to have the context why this was needed, but I
> suspect it’s something related to some user agents not being able to cope
> with “:” in the path segment of the URI, although the latest RFC [3]
> definitely allows this and the browsers we use nowadays also have no issues
> any more. At this point I’d strongly argue that it’s not needed any more to
> do any kind of mangling.
>
> Regards,
> Radu
>
> [3] - https://tools.ietf.org/html/rfc3986 <
> https://tools.ietf.org/html/rfc3986>


Re: [org.apache.sling.xss] namespace mangling

2019-11-19 Thread Radu Cotescu
Hi Dan,

> On 19 Nov 2019, at 16:18, Daniel Klco  wrote:
> 
> I've seen issues with this in the wild. A client was attempting to link to
> external URLs containing colons (bad practice I know, but you get health
> care web services to get out of the 1990's) in a HTL script which was
> getting mangled even though the URL was not a JCR path.

Cough… I might have seen this problem last week… Cough...

> 
> My concern is that if this gets removed could the converse become a
> problem?

Anything can happen, but it doesn’t mean that the API should work like one of 
its implementations and the API definitely doesn’t mention anything about JCR 
namespaces.

> How would implementers know when to mangle the paths correctly

If you refer to developers trying to expose Resource paths as URLs, then they 
should ALWAYS use ResourceResolver#map, since this takes care of not only 
namespace mangling, but also of the mapping configurations. I obviously cannot 
be the one throwing the stone, since I could also be guilty of not using the 
correct mechanism to expose a URL for a Resource, but the API has been there 
for ages (the very beginning of Sling, so 10 years?!).


> and
> does this place the burden on the end developer to support this and thus
> potentially lead to errors in the implementation?
> 

I might be too young to have the context why this was needed, but I suspect 
it’s something related to some user agents not being able to cope with “:” in 
the path segment of the URI, although the latest RFC [3] definitely allows this 
and the browsers we use nowadays also have no issues any more. At this point 
I’d strongly argue that it’s not needed any more to do any kind of mangling.

Regards,
Radu

[3] - https://tools.ietf.org/html/rfc3986 

RE: [org.apache.sling.xss] namespace mangling

2019-11-19 Thread Stefan Seifert
in my understanding the namespace mangling was only introduced in the olden 
days of sling to work around problems in some old browsers that did not support 
URLs with colons in it. i think those old browsers are no longer in use for 
many, many years. so i assume it is no problem to not mangle the URLs nowadays, 
and +1 to remove the mangling from the XSS handling.

stefan

>-Original Message-
>From: Radu Cotescu [mailto:r...@apache.org]
>Sent: Tuesday, November 19, 2019 4:02 PM
>To: Sling Dev
>Subject: [org.apache.sling.xss] namespace mangling
>
>Hi,
>
>From the very beginning the org.apache.sling.xss code was donated to Sling
>it provided an implementation of the XSSAPI.getValidHref that mangles JCR
>namespaces from the passed URLs (let’s not comment on the naming). However,
>the code that does this has no information about the registered namespaces
>that one can see when accessing the "/system/console/status-
>JCR%20Namespaces” console and, instead, works with patterns. Brittle, I
>know.
>
>Now, if we check the ResourceResolver API, specifically the
>org.apache.sling.api.resource.ResourceResolver#map(java.lang.String) method
>[0], we see that namespace mangling should be performed here [1].
>
>In my opinion we should completely remove the mangling functionality from
>the XSS implementation, since it’s the caller’s responsibility to provide a
>correct request path. We cannot assume all URLs passed to the
>XSSAPI.getValidHref are JCR paths and I wouldn’t like to add more context
>in the implementation.
>
>Are there different opinions? I’d like to consult the dev list before
>opening an issue and removing the code in question [2].
>
>Thanks,
>Radu
>
>
>[0] - https://github.com/apache/sling-org-apache-sling-
>api/blob/11bf3603155af21201b0fced2c6968d2223254b9/src/main/java/org/apache/
>sling/api/resource/ResourceResolver.java#L294
>api/blob/11bf3603155af21201b0fced2c6968d2223254b9/src/main/java/org/apache/
>sling/api/resource/ResourceResolver.java#L294>
>[1] - https://sling.apache.org/documentation/the-sling-engine/mappings-for-
>resource-resolution.html#namespace-mangling
>resource-resolution.html#namespace-mangling>
>[2] - https://github.com/apache/sling-org-apache-sling-
>xss/blob/8ec9cf33080fbbb70dc6a51dea92533946295db8/src/main/java/org/apache/
>sling/xss/impl/XSSAPIImpl.java#L194 apache-sling-
>xss/blob/master/src/main/java/org/apache/sling/xss/impl/XSSAPIImpl.java#L19
>4>


Re: [org.apache.sling.xss] namespace mangling

2019-11-19 Thread Daniel Klco
I've seen issues with this in the wild. A client was attempting to link to
external URLs containing colons (bad practice I know, but you get health
care web services to get out of the 1990's) in a HTL script which was
getting mangled even though the URL was not a JCR path.

My concern is that if this gets removed could the converse become a
problem? How would implementers know when to mangle the paths correctly and
does this place the burden on the end developer to support this and thus
potentially lead to errors in the implementation?

Regards,
Dan

On Tue, Nov 19, 2019 at 10:01 AM Radu Cotescu  wrote:

> Hi,
>
> From the very beginning the org.apache.sling.xss code was donated to Sling
> it provided an implementation of the XSSAPI.getValidHref that mangles JCR
> namespaces from the passed URLs (let’s not comment on the naming). However,
> the code that does this has no information about the registered namespaces
> that one can see when accessing the
> "/system/console/status-JCR%20Namespaces” console and, instead, works with
> patterns. Brittle, I know.
>
> Now, if we check the ResourceResolver API, specifically the
> org.apache.sling.api.resource.ResourceResolver#map(java.lang.String) method
> [0], we see that namespace mangling should be performed here [1].
>
> In my opinion we should completely remove the mangling functionality from
> the XSS implementation, since it’s the caller’s responsibility to provide a
> correct request path. We cannot assume all URLs passed to the
> XSSAPI.getValidHref are JCR paths and I wouldn’t like to add more context
> in the implementation.
>
> Are there different opinions? I’d like to consult the dev list before
> opening an issue and removing the code in question [2].
>
> Thanks,
> Radu
>
>
> [0] -
> https://github.com/apache/sling-org-apache-sling-api/blob/11bf3603155af21201b0fced2c6968d2223254b9/src/main/java/org/apache/sling/api/resource/ResourceResolver.java#L294
> <
> https://github.com/apache/sling-org-apache-sling-api/blob/11bf3603155af21201b0fced2c6968d2223254b9/src/main/java/org/apache/sling/api/resource/ResourceResolver.java#L294
> >
> [1] -
> https://sling.apache.org/documentation/the-sling-engine/mappings-for-resource-resolution.html#namespace-mangling
> <
> https://sling.apache.org/documentation/the-sling-engine/mappings-for-resource-resolution.html#namespace-mangling
> >
> [2] -
> https://github.com/apache/sling-org-apache-sling-xss/blob/8ec9cf33080fbbb70dc6a51dea92533946295db8/src/main/java/org/apache/sling/xss/impl/XSSAPIImpl.java#L194
> <
> https://github.com/apache/sling-org-apache-sling-xss/blob/master/src/main/java/org/apache/sling/xss/impl/XSSAPIImpl.java#L194
> >